[jdom-interest] JDOM2 and Performance.

Rolf jdom at tuis.net
Tue Oct 18 20:11:24 PDT 2011


Hi Again.

I have just committed a new snapshot of JDOM2, together with the Javadocs, 
JUnit and coverage reports, and a performance update, to:
http://hunterhacker.github.com/jdom/jdom2/

Performance has been mostly restored, and there are big improvements 
in the XPath processing, even though I changed nothing in that area 
;-). It all comes from more efficient Iterator implementations in 
the ContentList.

See http://hunterhacker.github.com/jdom/jdom2/performance.html

I have done the first major refactor of the JDOM2 code, essentially 
rewriting the XMLOutputter. It is much neater and more consistent, and, 
should you need it, it is now completely 'extensible'.

By changing the way the code is structured, the XMLOutputter is now 
reentrant, and yet still just as fast, if not faster for some things.
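
To illustrate what 'reentrant' means here, below is a minimal sketch. It 
is not the actual XMLOutputter code, and every name in it 
(ReentrantPrinterSketch, Node, State) is made up for the example. One 
common way to get reentrancy is to keep only immutable configuration in 
fields and to carry all per-call state in an object that is created 
inside each call:

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a reentrant printer; not JDOM2's XMLOutputter. */
final class ReentrantPrinterSketch {

    /** Hypothetical minimal tree node standing in for an element. */
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Per-call state: created inside print(), never stored in a field. */
    private static final class State {
        final StringBuilder out = new StringBuilder();
        int depth;
    }

    // Immutable configuration is the only thing the instance holds.
    private final String indent;

    ReentrantPrinterSketch(String indent) {
        this.indent = indent;
    }

    /** Each call gets its own State, so overlapping calls cannot interfere. */
    String print(Node root) {
        State state = new State();
        write(root, state);
        return state.out.toString();
    }

    private void write(Node node, State state) {
        appendIndent(state);
        if (node.children.isEmpty()) {
            state.out.append('<').append(node.name).append("/>\n");
            return;
        }
        state.out.append('<').append(node.name).append(">\n");
        state.depth++;
        for (Node child : node.children) {
            write(child, state);
        }
        state.depth--;
        appendIndent(state);
        state.out.append("</").append(node.name).append(">\n");
    }

    private void appendIndent(State state) {
        for (int i = 0; i < state.depth; i++) {
            state.out.append(indent);
        }
    }
}
```

Because nothing mutable lives on the printer itself, the same instance 
can be used from nested or concurrent calls without one call corrupting 
the indentation of another.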

I have found and fixed a lot of obscure bugs that may have been plaguing 
people even if they did not know it. For example, if you had 
xml:space="preserve" embedded in your XML document, JDOM would 
happily insist on outputting whatever content was inside it in the 
UTF-8 encoding, even if you had requested some other encoding.

This particular refactor has taken a lot of time, so I have to back off 
a little and catch up on some other things in life... back to just 'JDOM 
on the train' for a bit.

Rolf

On 15/10/2011 6:17 PM, Rolf wrote:
> Hi all.
>
> I've come close to restoring the JDOM 1.1.2 levels of performance.
>
> When 'fixing' code in JDOM2 I came across a number of different places
> where namespace processing is performed (calculating the 'in scope' and
> the 'added' namespaces for an Element). This code was scattered in
> various places, inconsistent, and in some places buggy. I ended up
> stripping out all of these places and replacing them with the
> Content.getNamespacesInScope() concepts.
>
> While convenient, the Content.getNamespacesInScope() methods were (much
> too) slow because they dynamically calculate the Namespaces each time
> they are called (which is fine for unstructured requirements where the
> document structure could change from one moment to the next).
>
> I have thus implemented a new 'Namespace Stack' which is much faster
> than a completely dynamic calculation, and it is able to replace the
> various other 'stacks' that were removed before.
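
As a rough illustration of the idea only (this is not the JDOM2 code, and 
the names NamespaceStackSketch, push, pop and inScope are hypothetical), 
a namespace stack can keep one complete prefix-to-URI map per open 
element. The in-scope set is then a read of the top frame rather than a 
walk up the ancestor chain on every call:

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a namespace stack; not JDOM2's implementation. */
final class NamespaceStackSketch {

    // Each frame holds the complete prefix -> URI scope for one open element.
    private final Deque<Map<String, String>> frames = new ArrayDeque<>();

    NamespaceStackSketch() {
        Map<String, String> root = new LinkedHashMap<>();
        root.put("", "");  // default prefix bound to "no namespace"
        root.put("xml", "http://www.w3.org/XML/1998/namespace");
        frames.push(root);
    }

    /** Enter an element: copy the parent scope once, then apply its declarations. */
    void push(Map<String, String> declared) {
        Map<String, String> scope = new LinkedHashMap<>(frames.peek());
        scope.putAll(declared);  // inner declarations shadow outer ones
        frames.push(scope);
    }

    /** Leave the current element and return to the parent's scope. */
    void pop() {
        if (frames.size() == 1) {
            throw new IllegalStateException("no open element to pop");
        }
        frames.pop();
    }

    /** Read-only view of the namespaces in scope for the current element. */
    Map<String, String> inScope() {
        return Collections.unmodifiableMap(frames.peek());
    }
}
```

The trade-off in this sketch is one map copy per element entered, in 
exchange for cheap lookups while that element is being processed. It 
assumes the document is not modified during the traversal.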
>
> This has (mostly) 'restored' the performance of JDOM2's 'guts'. I seem
> to be about 1-2% slower than JDOM 1.1.2 at the moment.
>
> If you look at the numbers you will see that the 'Dump' code is still
> slow though. The Dump code dumps the document in the three main formats:
> Pretty, Raw, and Compact. This is running slow, and is probably related
> to the changes made for Issue #31.
>
> I'm going to fix up that performance in XMLOutputter, and hopefully that
> will pull back the performance numbers in the other areas (the 1-2%),
> because each of those processes uses the XMLOutputter in some way.
> (The Dump is particularly slow because it uses the more complicated
> Pretty and Compact mechanisms.)
>
> The 'performance' page below has been updated...
>
> Rolf
>
> On 13/10/2011 8:27 PM, Rolf wrote:
>> Hi all.
>>
>> I have put together a 'simple' system for measuring the relative
>> performance of JDOM2. The idea is that I need to know whether I am
>> improving or breaking JDOM performance as the code evolves.
>>
>> Currently the metric code is only useful if you compare apples to
>> apples, which in this case means processing a single (medium-size)
>> XML document on my laptop, yada-yada-yada. But it should be useful as a
>> tool to get a feel for what a code change does.
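
For a feel of what such a relative measurement can look like, here is a 
minimal sketch. It is not the metric code from the commit linked below, 
and the names RelativeTimerSketch, medianNanos and ratio are 
hypothetical. It warms the task up, times a number of runs, and reports 
the median so that a single slow run does not dominate:

```java
import java.util.Arrays;

/** Hypothetical sketch of a relative timing helper; not the JDOM2 metric code. */
final class RelativeTimerSketch {

    /**
     * Runs the task 'warmup' times untimed, then 'runs' times timed, and
     * returns the middle sample in nanoseconds (the upper median when
     * 'runs' is even).
     */
    static long medianNanos(Runnable task, int warmup, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        for (int i = 0; i < warmup; i++) {
            task.run();  // give the JIT a chance to compile the hot paths
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** Candidate time divided by baseline time; above 1.0 means the candidate is slower. */
    static double ratio(long baselineNanos, long candidateNanos) {
        return (double) candidateNanos / (double) baselineNanos;
    }
}
```

The numbers only mean something relative to each other, on the same 
machine and with the same input document. A ratio of 5.0 would 
correspond to the candidate taking five times as long as the baseline.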
>>
>> Already I can see that I probably have an issue in the SAXHandler
>> (possibly an issue in JDOM 1.1.2, actually) because 1.1.2 is 5 times
>> faster in that area than JDOM2.
>>
>> I have put together a results page here:
>>
>> http://hunterhacker.github.com/jdom/jdom2/performance.html
>>
>> It also describes what each test does. If you are interested in seeing
>> the code and what it does, have a look here (it is not well documented
>> and it is perhaps still evolving):
>>
>> https://github.com/hunterhacker/jdom/commit/8b719c86913398ace8e197b6de145b33d9d300bb
>>
>>
>>
>>
>> Rolf
>


