[jdom-interest] Fwd: I found a bug in JDOM-b8: "org.jdom.input.SAXHandler"

Brett McLaughlin brett at newInstance.com
Sat Apr 19 05:54:10 PDT 2003


Begin forwarded message:

> From: "Glas, Florian (TUM-VT)" <florian.glas at vt.bv.tum.de>
> Date: Fri Apr 11, 2003  5:26:42 PM US/Central
> To: <brett at jdom.org>
> Subject: I found a bug in JDOM-b8: "org.jdom.input.SAXHandler"
>
> Dear Mr. McLaughlin,
>
> My name is Florian Glas. I work at the University of Technology in
> Munich, at the Chair of Traffic Technology. We are developing software
> for traffic control on the road network of the Munich area. We decided
> to exchange data in an XML format, and because our software is written
> entirely in Java, I use JDOM to parse these data files. I am using the
> new JDK 1.4.
>
> Recently I had to parse XML files of 2 MB and more, containing up to
> 120000 XML tags. In this case RAM usage rose to as much as 450 MB!
> I then inspected the source code of the JDOM library and traced the
> memory consumption to the flushCharacters() method of the
> org.jdom.input.SAXHandler class.
>
> Rather than setting the length of the existing textBuffer object to 0,
> the method now creates a new StringBuffer object.
>
> This simple replacement reduced the memory overhead from about 8 kB
> per tag to a few bytes. I don't quite understand this behavior, but I
> am very happy that the application now needs only 50 MB of RAM instead
> of 450 MB, without any loss of performance.
>
> Here is the flushCharacters() method I am using now:
>
>
>     protected void flushCharacters() throws SAXException {
>
>         if (textBuffer.length() == 0) {
>             previousCDATA = inCDATA;
>             return;
>         }
>
>         /*
>          * Note: When we stop supporting JDK1.1, use substring instead
>         String data = textBuffer.substring(0);
>          */
>         String data = textBuffer.toString();
>         /* This was changed by Florian Glas on 10 April 2003. */
>         textBuffer = new StringBuffer();
>         // textBuffer.setLength(0);
>
>         /*
>          * This is commented out because of some problems with
>          * the inline DTDs that Xerces seems to have.
>         if (!inDTD) {
>             if (inEntity) {
>                 getCurrentElement().setContent(factory.text(data));
>             } else {
>                 getCurrentElement().addContent(factory.text(data));
>             }
>         }
>          */
>
>         if (previousCDATA) {
>             getCurrentElement().addContent(factory.cdata(data));
>         }
>         else {
>             getCurrentElement().addContent(factory.text(data));
>         }
>
>         previousCDATA = inCDATA;
>     }
>
>
> I only replaced one line of the source code:
>
>         textBuffer = new StringBuffer();
>
> instead of
>
> //        textBuffer.setLength(0);
>
>
>
> I don't know whether this may cause negative effects elsewhere in
> JDOM, but so far I have not noticed any trouble with the changed
> method.
>
> Maybe I have found a small bug in the source code that has a large
> effect, maybe not. I think it is up to you (or someone else) to check
> whether there really is potential to improve the performance of JDOM.
> Thank you for reading my note.
>
> Best regards,
> Florian Glas
>
> ================================
>
> Florian Glas
> Technische Universität München
> Lehrstuhl für Verkehrstechnik
>
> mailto:florian.glas at vt.bv.tum.de
> Tel.:  +49 (89) 289-23837
>
>
>
>
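
A minimal sketch of the two reset strategies compared in the message above,
written against plain java.lang.StringBuffer rather than JDOM. The class and
method names (BufferResetSketch, capacityAfterSetLength, capacityAfterReplace)
are hypothetical. It only shows that setLength(0) keeps the buffer's grown
backing array, while a new StringBuffer starts at the default capacity.
Whether that retained capacity explains the memory use reported above on
JDK 1.4 is an assumption, not something the message confirms.

```java
// Sketch only: class and method names are hypothetical, not JDOM code.
class BufferResetSketch {

    // Fills a buffer with the given number of characters.
    static StringBuffer filled(int chars) {
        StringBuffer buf = new StringBuffer();
        for (int i = 0; i < chars; i++) {
            buf.append('x');
        }
        return buf;
    }

    // Strategy from the original method: keep the buffer, reset its length.
    // The length drops to 0, but the capacity grown during appends remains.
    static int capacityAfterSetLength(int chars) {
        StringBuffer buf = filled(chars);
        String data = buf.toString();   // the text that would be handed on
        buf.setLength(0);
        return buf.capacity();
    }

    // Strategy from the patch: drop the old buffer and start with a new one.
    // A StringBuffer created with no arguments has the default capacity.
    static int capacityAfterReplace(int chars) {
        StringBuffer buf = filled(chars);
        String data = buf.toString();   // the text that would be handed on
        buf = new StringBuffer();
        return buf.capacity();
    }

    public static void main(String[] args) {
        System.out.println("setLength(0): capacity "
                + capacityAfterSetLength(8000));
        System.out.println("new buffer:   capacity "
                + capacityAfterReplace(8000));
    }
}
```

Running main prints the capacity left behind by each strategy for an
8000-character buffer. The first value stays at least as large as the text
that was appended; the second falls back to the small default.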


Thanks
---
Brett McLaughlin		
O'Reilly and Associates		http://www.oreilly.com
Author and Editor			http://www.newInstance.com



