[jdom-interest] Internal DTD subset verification

Philip Nelson panmanphil at yahoo.com
Wed May 8 20:15:58 PDT 2002


--- Dennis Sosnoski <dms at sosnoski.com> wrote:

> Elliotte Rusty Harold wrote:
> 
> > ...Keep in mind that in many scenarios I/O concerns are likely to 
> > swamp any issues with verification, and when they don't the speed of 
> > the underlying SAX parser is probably the second biggest factor. 
> 
> Not even close. The build time for a document model is much larger than 
> the parsing time for fast parsers. 

Yup, most of my XML files come from a filesystem (very fast) or a database
(quite fast).  I/O turns out to be pretty insignificant.  And I/O doesn't eat
CPU cycles, which are much more valuable than elapsed time.
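
For anyone who wants to check the parse-versus-build comparison on their own
data, here is a minimal sketch. It assumes the JAXP parser and DOM bundled with
the JDK as stand-ins (it measures neither JDOM nor Xerces specifically, and it
is not Dennis's benchmark), and the class and method names are made up for
illustration. The document is held in memory so that no I/O is timed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical harness: compares a SAX-only pass with a full DOM build.
public class ParseVsBuild {

    // Synthetic document held in memory, so no file or database I/O is timed.
    public static byte[] sampleXml(int items) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">value ")
              .append(i).append("</item>");
        }
        return sb.append("</root>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parse only: count start tags from SAX events, building no tree.
    public static int saxElementCount(byte[] xml) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String local,
                                             String qName, Attributes atts) {
                        count[0]++;
                    }
                });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return count[0];
    }

    // Parse plus document-model construction: build a DOM tree, then count
    // its elements ("*" matches every element, including the root).
    public static int domElementCount(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
            return doc.getElementsByTagName("*").getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = sampleXml(50_000);
        // Early rounds mostly warm up the JIT; look at the later ones.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            saxElementCount(xml);
            long t1 = System.nanoTime();
            domElementCount(xml);
            long t2 = System.nanoTime();
            System.out.printf("round %d: SAX parse %d ms, DOM build %d ms%n",
                round, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        }
    }
}
```

The numbers will vary with the JVM, the parser and the shape of the document,
so treat the output as a rough indication rather than a benchmark result.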

> > Have you checked out DOM lately? Several implementations have gotten a 
> > lot better in the last couple of years. 
> 
> That's definitely true. Xerces is faster than JDOM pretty much across 
> the board - they've obviously put a lot of work into optimizations.

That would seem to indicate we should be moving towards a faster API, wouldn't
it?  I still believe, based on no solid evidence, that JDOM should be able to
be faster than Xerces DOM.  I'll admit to getting nervous now...

> 
> > ...I've never seen anybody pick JDOM for performance or memory 
> > reasons. For one thing, it's not at all clear that JDOM is faster or 
> > uses less memory than modern DOMs like Xerces-2. The benchmarks in 
> > this area range from abominable to non-existent, and are typically 
> > written to prove that the author's pet API is better than the 
> > alternatives. 
> 
> Gee, thanks! :-) I actually started on my set of tests because I got 
> tired of seeing unsubstantiated PR claims about JDOM performance 
> compared to other models.

Yes, I think Dennis' benchmarks have been a wakeup call.  We deliberately took
the approach of making it work first and tuning it later.  As it turns out,
tuning involves some compromises that are difficult to make.  I have to agree
that the performance hit from verification won't be enough to close the gap.  I
can't agree that the performance doesn't matter enough to make a significant
number of people choose JDOM over DOM, in spite of the short-term productivity
gains.  The "unsubstantiated PR claims", however, were based on untested ideas
rather than any sort of deception, I think.

> Aside from this whole dispute over verification and performance, it's 
> worth noting that most applications where people are currently using 
> document models (DOM, JDOM, dom4j, etc.) are much better suited for data 
> binding. <snip>. As data 
> binding becomes more prevalent I think usage of document models will 
> fade away except in applications that really need to work with the XML 
> document structure (generic document handlers such as editors or 
> transformation applications). I think this may be a point in favor of 
> Elliotte's view of verification, though I'd personally prefer to see 
> verification as an optional feature.

I think you are probably right.  I hadn't thought about it in these terms, but
I have been using a data binding API for the last six months and doubt I would
ever go back in a data-oriented application.  Between strongly typed accessors,
typed collections, schema validation and more, the programming convenience is
hard to part with.  I made a comment to Jason similar to yours some months ago
after getting exposed to Microsoft's ADO.NET.  These APIs make XML much less
important to the developer.  Your application uses XML but it is not an XML
application.  As you point out, the developer expects that all of the XML
details are handled for them, tipping the argument toward Elliotte's point of
view.  More use of data binding tools leaves JDOM with a target audience of
higher-end tool developers.  What would that group value most from JDOM?


