[jdom-interest] memory leak in 1.1.8 version of org.jdom.input.SAXHandler.flushCharacters at line 724

Tom Oke tomo at elluminate.com
Mon Apr 8 14:43:50 PDT 2002


In the b8 build of SAXHandler for 1.1.8, the routine flushCharacters creates
the String data from textBuffer.toString(), whereas in the 1.2+ versions the
code simply assigns a substring of textBuffer to data.

        String data = textBuffer.toString();
        textBuffer.setLength(0);

The unfortunate side effect of the 1.1.8 version of the code is that the
4096 characters allocated to the StringBuffer in its declaration are tied up
and left as a memory leak whenever data is kept, because of the write
performed by textBuffer.setLength(0).

This happens because calling toString on textBuffer sets the StringBuffer's
shared flag (a boolean internal to java.lang), which marks the buffer as
copy-on-write.

The setLength(0) call then duplicates the total storage allocated to
textBuffer, while the String returned by toString keeps hold of the original
array. Each kept String therefore bleeds 4096 characters (or more, if
textBuffer has eaten anything large enough to grow it).
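
To make the mechanism above concrete, here is a minimal sketch of a
copy-on-write buffer. It is a toy model, not the java.lang source: the classes
Buf and Str and the method retainedAfterFlush are hypothetical names invented
for illustration, and the model only assumes the behaviour described in this
post (toString shares the array and sets a flag; the next write copies the
whole array).

```java
// Toy model only: Buf and Str are hypothetical stand-ins, not java.lang classes.
class SharedBufferSketch {

    /** Stand-in for a String that shares its backing array with a buffer. */
    static final class Str {
        final char[] value;
        final int count;
        Str(char[] value, int count) { this.value = value; this.count = count; }
    }

    /** Stand-in for a copy-on-write StringBuffer. */
    static final class Buf {
        char[] value;
        int count;
        boolean shared;

        Buf(int capacity) { value = new char[capacity]; }

        void append(String s) {
            if (shared) copy();
            if (count + s.length() > value.length) {
                char[] bigger = new char[Math.max(value.length * 2, count + s.length())];
                System.arraycopy(value, 0, bigger, 0, count);
                value = bigger;
            }
            s.getChars(0, s.length(), value, count);
            count += s.length();
        }

        /** Hands the backing array to the string and marks the buffer shared. */
        Str toStr() {
            shared = true;
            return new Str(value, count);
        }

        /** A write to a shared buffer copies the whole array first (only shrinking is modelled). */
        void setLength(int newLength) {
            if (shared) copy();
            count = newLength;
        }

        private void copy() {
            char[] fresh = new char[value.length]; // full capacity, not just the used part
            System.arraycopy(value, 0, fresh, 0, count);
            value = fresh;
            shared = false;
        }
    }

    /** Returns how many chars the kept string pins after the buffer is reset. */
    static int retainedAfterFlush(String text) {
        Buf textBuffer = new Buf(4096);
        textBuffer.append(text);
        Str data = textBuffer.toStr();   // data now shares the 4096-char array
        textBuffer.setLength(0);         // buffer copies itself; data keeps the old array
        return data.value.length;
    }

    public static void main(String[] args) {
        System.out.println(retainedAfterFlush("hello")); // 4096
    }
}
```

In this model a five-character string ends up pinning the full 4096-character
array, which is the per-node cost described above.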

The following correction doubles the copying, but locks down only the
used length of textBuffer: the contents are copied into a temporary
StringBuffer, and String data is then taken from that.

        StringBuffer dummy = new StringBuffer(textBuffer.toString());
        String data = dummy.toString();
        textBuffer.setLength(0);
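
As a rough illustration of why the extra copy helps, the sketch below compares
the capacity of the large buffer with the capacity of the temporary one. The
class and method names are hypothetical. It relies only on the documented
behaviour that new StringBuffer(String) starts with a capacity of the string
length plus 16; whether the resulting String shares that array depends on the
JDK in use, as described in this post.

```java
// Hypothetical helper names; only StringBuffer itself is a real API here.
class CapacitySketch {

    /** Capacity of the long-lived buffer after buffering some text. */
    static int bigCapacity(String text) {
        StringBuffer textBuffer = new StringBuffer(4096);
        textBuffer.append(text);
        return textBuffer.capacity();
    }

    /** Capacity of the temporary buffer used by the correction. */
    static int dummyCapacity(String text) {
        StringBuffer textBuffer = new StringBuffer(4096);
        textBuffer.append(text);
        StringBuffer dummy = new StringBuffer(textBuffer.toString());
        return dummy.capacity(); // text.length() + 16
    }

    public static void main(String[] args) {
        System.out.println(bigCapacity("hello"));   // 4096
        System.out.println(dummyCapacity("hello")); // 21
    }
}
```

If data shares storage with dummy rather than with textBuffer, a short text
node holds on to a few dozen characters instead of 4096.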

The end effect is that on a large XML document, with which the application
hit a heap high-water mark of 184M, the corrected code kept the heap
allocation down to 15M.


The full routine now looks like:


    /**
     * <p>
     * This will flush any characters from SAX character calls we've
     * been buffering.
     * </p>
     *
     * @throws SAXException when things go wrong
     */
    protected void flushCharacters() throws SAXException {

        if (textBuffer.length() == 0) {
            previousCDATA = inCDATA;
            return;
        }

        /*
         * Note: When we stop supporting JDK1.1, use substring instead
        String data = textBuffer.substring(0);
         */
//        String data = textBuffer.toString();
        StringBuffer dummy = new StringBuffer(textBuffer.toString());
        String data = dummy.toString();
        textBuffer.setLength(0);

/**
 * This is commented out because of some problems with
 * the inline DTDs that Xerces seems to have.
if (!inDTD) {
  if (inEntity) {
    getCurrentElement().setContent(factory.text(data));
  } else {
    getCurrentElement().addContent(factory.text(data));
}
*/

        if (previousCDATA) {
            getCurrentElement().addContent(factory.cdata(data));
        }
        else {
            getCurrentElement().addContent(factory.text(data));
        }

        previousCDATA = inCDATA;
    }

Tom Oke


