[jdom-interest] Presences of Namespace dramatically slows SAXBuilder

Jason Hunter jhunter at acm.org
Tue Apr 3 12:52:08 PDT 2001


Ken Klose wrote:
> 
> I apologize if this has already been discussed.  I have been using JDOM to
> read and manipulate rather large documents (via SAXBuilder).  The documents
> are changing - moving the elements into a namespace - and performance has
> become horrible. I've whipped together an example that constructs a document
> similar to the one in question (both with namespaces and without) and reads
> it in using SAXBuilder and DOMBuilder.  A summary of the results:
> 
> No namespace DOMBuilder read: 551ms
> Namespace DOMBuilder read: 2383ms
> No namespace SAXBuilder read: 371ms
> Namespace SAXBuilder read: 24,075ms
> 
> Questions:
> 1. Do namespaces really add so much complexity that they alone should
> account for a four-fold increase in the parse time of DOMBuilder?
> 2. Is the problem resulting in the 64-fold increase in SAXBuilder parse time
> in SAX or JDOM?
> 
> Again I apologize if this is old news.
> 
> I've attached the Java source file I used to produce the above results.

Short summary:  The 4x DOM slowness is primarily due to the test not
measuring precisely.  The 64x SAX slowness is due to something that can
be fixed in SAXHandler.

On the DOM side: Your tests aren't "priming the pump" before doing the
performance timing, so the "read" times listed include the time to load
the JAR and classes for the parser.  Your test does the "Namespace" read
first, and since that test has to do the "priming" that's why it's so
much slower.  With a modified test that "pre-primes" the parser I see
Xerces is 1.8x slower with namespace handling than without and Crimson
is just 1.4x slower.  That makes sense for a basically empty data file
like you're testing, because without other data the namespace
calculation will consume a fair chunk of the time.  Also, if you're
using beta6 there have been some significant performance enhancements in
Namespace.
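
A minimal sketch of what "priming the pump" can look like in a timing test.
This is not the modified test referred to above: the class name, the
element count, the namespace URI, and the number of warm-up passes are all
made up for illustration, and it uses the JDK's bundled SAX parser directly
rather than JDOM so that it runs without extra JARs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class PrimedTiming {
    // Builds a flat, nearly empty document, with or without a default namespace.
    // The namespace URI is a placeholder.
    static byte[] buildDoc(int elements, boolean useNamespace) {
        StringBuilder sb = new StringBuilder();
        sb.append(useNamespace ? "<root xmlns=\"http://example.com/ns\">" : "<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<item/>");
        }
        sb.append("</root>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // One full parse with a do-nothing handler.
    static void parse(byte[] doc) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();
        parser.parse(new ByteArrayInputStream(doc), new DefaultHandler());
    }

    // Runs untimed warm-up parses first, so that class loading and parser
    // start-up are not charged to the single timed parse that follows.
    static long timeParseMillis(byte[] doc, int warmups) throws Exception {
        for (int i = 0; i < warmups; i++) {
            parse(doc);
        }
        long start = System.nanoTime();
        parse(doc);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        byte[] plain = buildDoc(20000, false);
        byte[] namespaced = buildDoc(20000, true);
        // Both variants get the same warm-up, so test order no longer matters.
        System.out.println("No namespace: " + timeParseMillis(plain, 3) + "ms");
        System.out.println("Namespace:    " + timeParseMillis(namespaced, 3) + "ms");
    }
}
```

With the warm-up in place, whichever variant runs first no longer pays the
one-time loading cost, so the two numbers can be compared directly.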

On the SAX side: Priming alone didn't bring the SAX build in line, so I
did a little OptimizeIt testing and found that one line was taking the
vast bulk of time:

  availableNamespaces.remove(element.getAdditionalNamespaces());

I changed it to:

  List addl = element.getAdditionalNamespaces();
  if (addl.size() > 0) {
      availableNamespaces.remove(addl);
  }

What used to take 13219ms now takes 580ms (on Xerces).  Quite a
change, and right in line with the no-NS build that takes 470ms.  I'll
be checking in the enhancement after this.
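
A rough sketch of why a guard like this can matter. It is not the
SAXHandler code: it assumes the available-namespace collection behaves
like a java.util.LinkedList, whose remove(Object) walks the entries
looking for a match, and it uses a hypothetical CountingList in place of
the list returned by getAdditionalNamespaces() so the comparisons can be
counted. Whether this is exactly what the profiler saw is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class GuardedRemove {
    // Hypothetical stand-in for the "additional namespaces" list. It counts
    // how many times the collection compares it against one of its entries.
    static class CountingList extends ArrayList<String> {
        int comparisons = 0;

        @Override
        public boolean equals(Object other) {
            comparisons++;
            return super.equals(other);
        }

        @Override
        public int hashCode() {
            return super.hashCode();
        }
    }

    // Unguarded form: the call is made even when addl is empty.
    static void removeUnguarded(LinkedList<Object> available, List<String> addl) {
        available.remove(addl);
    }

    // Guarded form, as in the change above: skip the call when addl is empty.
    static void removeGuarded(LinkedList<Object> available, List<String> addl) {
        if (addl.size() > 0) {
            available.remove(addl);
        }
    }

    public static void main(String[] args) {
        LinkedList<Object> available = new LinkedList<>();
        for (int i = 0; i < 1000; i++) {
            available.add("ns" + i);  // placeholder entries
        }

        CountingList empty = new CountingList();
        removeUnguarded(available, empty);
        System.out.println("unguarded comparisons: " + empty.comparisons);

        empty.comparisons = 0;
        removeGuarded(available, empty);
        System.out.println("guarded comparisons:   " + empty.comparisons);
    }
}
```

The unguarded call still scans the whole list when there is nothing to
remove, and that scan is repeated once per element; the size check avoids
it for the common case of an element with no additional namespaces.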

Thanks, Ken, for bringing this to my attention.

-jh-


