[jdom-interest] About JDOM performance

Ken Rune Helland kenh at csc.no
Tue May 22 03:04:49 PDT 2001


At 11:19 AM 5/22/2001 +0200, phil at triloggroup.com wrote:
>Hi,
>
>As I'm using JDOM, I did many performance checks with the Beta6/7 release.
>
>First, it seems that two methods create an unwanted PartialList:
>Element.getMixedContent() and Element.getAttributes(). Since
>the partial lists in these cases are exact copies of the contained lists,
>it would be better to return those lists directly instead of
>creating a temporary partial list. As we walk a tree by calling
>these methods many times, JProbe pointed out a
>gain of more than 50% from using the modified methods!! I think this is an
>error, since Document.getMixedContent() is
>correctly implemented.

This is being reimplemented using some kind of filter list; I don't know
the details.



>Second, getChildren()/getAttributes() are also a performance bottleneck.
>I think that methods like getChildrenIterator()/getAttributesIterator(),
>using a filtering iterator, would avoid the
>partial list creation and greatly improve
>performance. I have seen that we often write code like Iterator
>it=element.getChildren().iterator(). If you need such a
>filtering iterator implementation, I can send you a simple one.

Same reply as above.
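
For readers who want to picture what such a filtering iterator might look
like, here is a minimal sketch in current Java. It is not JDOM code and not
the planned filter list: the class name FilterIterator and the use of
java.util.function.Predicate are assumptions made purely for illustration.
It walks the underlying content lazily and hands out only the matching
items, so no temporary list is created.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Hypothetical helper (not part of JDOM): lazily yields only the items accepted by a filter. */
class FilterIterator<T> implements Iterator<T> {
    private final Iterator<? extends T> source;
    private final Predicate<? super T> filter;
    private T next;          // next matching item, valid only when hasNext is true
    private boolean hasNext; // whether a matching item has been found and not yet returned

    FilterIterator(Iterator<? extends T> source, Predicate<? super T> filter) {
        this.source = source;
        this.filter = filter;
        advance();
    }

    /** Moves forward in the source until a matching item is found or the source is exhausted. */
    private void advance() {
        hasNext = false;
        next = null;
        while (source.hasNext()) {
            T candidate = source.next();
            if (filter.test(candidate)) {
                next = candidate;
                hasNext = true;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    @Override
    public T next() {
        if (!hasNext) {
            throw new NoSuchElementException();
        }
        T result = next;
        advance();
        return result;
    }
}

class FilterIteratorDemo {
    public static void main(String[] args) {
        // Stand-in for mixed content: strings play the role of text, integers of child elements.
        List<Object> mixed = Arrays.asList("text", 1, "more text", 2);
        Iterator<Object> children =
                new FilterIterator<Object>(mixed.iterator(), o -> o instanceof Integer);
        while (children.hasNext()) {
            System.out.println(children.next());
        }
    }
}
```

The trade-off is that an iterator gives no size() or indexed access, which
a list view would still have to provide.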


>Third, the use of LinkedList is really memory consuming. For example, I'm
>parsing HTML files and JProbe reported that more
>than 11,000 LinkedList$Entry objects were created! Moreover, debugging
>(with JBuilder) the content of these lists is really
>cumbersome since you cannot directly view an item by index.
>According to many books on Java performance, linked lists are definitely
>slower than an array-based collection, except
>for some kinds of queue, which is not
>how JDOM uses the collections. Please, let us use the List implementation
>we want...

If you can make an educated guess about the number of items in a list and
you are not
inserting items at the front or in the middle of the list, an array list is
definitely
the most efficient.

But the most time-critical case (IMHO) is building a document from an
external source.
When parsing a document you are inserting an unknown number of items into
the list,
and you have no way of knowing beforehand how many items will be inserted.

Also, inserting something at the front or in the middle of a large ArrayList
is inefficient.

So if a single list type is to be used, a linked list must be the choice.

On the other hand, it would be very nice if programmers who know
something
about the document they are building could decide what kind of list to use,
but this
is not trivial.

The elements would have to have their list assigned to them in the constructor,
or have a useList(List list) method. And the filter list implementation
would have to work on top of another List instead of being the List itself,
which is probably
somewhat less efficient and more complex to handle.

But since we have already introduced factories for builders, it would be easy
to get the builders
to use the modified Elements.
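
As a rough sketch of that idea in current Java (none of these names exist
in JDOM; SketchElement and SketchElementFactory are hypothetical stand-ins),
the element receives its content list through the constructor, and a
factory used by the builder decides which List implementation to hand out:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for an element; only the content list handling is shown. */
class SketchElement {
    private final String name;
    private final List<Object> content;

    /** The caller chooses the list implementation that backs this element's content. */
    SketchElement(String name, List<Object> content) {
        this.name = name;
        this.content = content;
    }

    /** Default: a linked list, matching the single-list-type choice argued for above. */
    SketchElement(String name) {
        this(name, new LinkedList<>());
    }

    SketchElement addContent(Object child) {
        content.add(child);
        return this;
    }

    String getName() {
        return name;
    }

    List<Object> getContent() {
        return content;
    }
}

/** Hypothetical factory that a builder could call instead of constructing elements itself. */
class SketchElementFactory {
    private final Supplier<List<Object>> listSupplier;

    SketchElementFactory(Supplier<List<Object>> listSupplier) {
        this.listSupplier = listSupplier;
    }

    /** Each element gets its own fresh list from the supplier. */
    SketchElement element(String name) {
        return new SketchElement(name, listSupplier.get());
    }
}

class SketchElementFactoryDemo {
    public static void main(String[] args) {
        // A programmer who expects append-only content could ask for array-backed lists.
        SketchElementFactory factory = new SketchElementFactory(ArrayList::new);
        SketchElement root = factory.element("root").addContent("some text");
        System.out.println(root.getName() + " uses " + root.getContent().getClass().getSimpleName());
    }
}
```

This leaves out the hard part mentioned above: a filter list would still
have to be layered on top of whatever list the factory supplies.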

>Finally, the systematic checking of parameters is really annoying: to
>avoid it, I created my own classes derived
>from the original ones (Element...) and this built a tree more than 2 times
>faster. I think that such checking must be
>optional. It is a really good feature, but it should be possible to
>disable it: when I read my
>DOM from a database, I am sure that it is already
>valid...
>But the question is how to know whether it has to be checked or not. A static
>option is not thread safe, and a parameter in the
>underlying document is not always available (for example, an Element
>constructor does not know about the Document).
>Well, after thinking about this, I found that the most reliable solution
>is to add a 'boolean validate' to each ctor
>and method that does validation (addAttribute()...). We would then have
>something like:
>Element( String name, boolean validate) {
>     if( validate ) {...} // Do validation
>}
>Element( String name) {
>     this(name,true); // Validation is done by default.
>}

I agree on this one. The builders especially need to be able to bypass
these checks,
as all elements are newly created and can't belong to another document or
already be
in the new document.
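
To make the builder case concrete, here is a minimal sketch in current Java
of an add method with an optional parent check. The class
ParentCheckedElement and its methods are hypothetical and much simpler than
the real Element; the sketch only shows the shape of a 'boolean validate'
overload where checking stays on by default.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified element: shows an optional parent check, nothing else. */
class ParentCheckedElement {
    private final String name;
    private ParentCheckedElement parent;
    private final List<ParentCheckedElement> children = new ArrayList<>();

    ParentCheckedElement(String name) {
        this.name = name;
    }

    /** Checked variant: validation is done by default. */
    ParentCheckedElement addContent(ParentCheckedElement child) {
        return addContent(child, true);
    }

    /** A builder that has just created 'child' itself could pass validate = false. */
    ParentCheckedElement addContent(ParentCheckedElement child, boolean validate) {
        if (validate) {
            if (child == this) {
                throw new IllegalArgumentException("an element cannot contain itself");
            }
            if (child.parent != null) {
                throw new IllegalStateException(child.name + " already has a parent");
            }
        }
        child.parent = this;
        children.add(child);
        return this;
    }

    int getChildCount() {
        return children.size();
    }
}

class ParentCheckDemo {
    public static void main(String[] args) {
        ParentCheckedElement root = new ParentCheckedElement("root");
        // Freshly created child, as in a builder: the parent check can be skipped.
        root.addContent(new ParentCheckedElement("child"), false);
        System.out.println(root.getChildCount());
    }
}
```

With validation switched off the caller takes over responsibility for the
tree staying consistent, which is why the checked overload remains the
default.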



>Please give me your feedback about that. Today, JDOM is a great framework
>but it really needs some optimizations to be used
>in a production context.
>
>Phil.

Greetings
KenR



