[jdom-interest] List's in JDOM - a small essay
Jools
jools at jools.org
Thu Mar 8 13:36:03 PST 2001
Hi
Well seeing as we've had a fair bit of traffic on the subject of
lists in JDOM, I have decided to write this email, with which I hope
to shed a little light on the way we are going with lists in JDOM.
Thanks to Philip Nelson for giving me a prod, I'm a bit behind on my
email right now (only 900 to go !)
I'm currently rewriting specific portions of JDOM to use a FilterList
which internally uses a singly linked list and allows filtering of the
items contained therein.
Iterating
~~~~~~~~~
Why use a singly linked list ? Because most operations on a JDOM object
are via an iterator, and this is normally in one direction. Thus
the FilterList is optimized for this operation, in JDOM terms it means
that the following is very efficient;
Iterator iter = element.getChildren("foo").iterator();
while (iter.hasNext()) {
iter.next();
}
Because we examine each item to see if it matches the criteria specified
in the iterator, and we only move over the list once any one single
iteration is vey efficient.
With PartialListwe would have have to iterate over every item first in
order to copy the required items into the PartialList, then we iterator
over it again using the iterator. So we save one iteration. So far so
good!
But there's no such thing as a free lunch! Here's the down side, random
access causes us an interesting problem.
Take the original code and do it the hard way (and backwards!);
List foos = element.getChildren("foo");
for (int i=foos.size(); i > 0; i--) {
foos.get(i-1);
}
This is our worst case ! for the size() call we have to iterate over the
entire list in order to see how many foos we have. Then for each item
we have to iterate through the entire list until we get to the correct
index checking each one along the way.
I can here you say "What about a sparse index?" or "Why not just keep a
reference to the last element?". Because this list needs to be totally
lightweight. Just go through your code add see how many times you call
getChildren().iterator() then you'll see what I mean.
Plus, if we get hammered on this performance side of things, then we
could
always optimize after we get it all working (Knuth).
Liveness
~~~~~~~~
FilterList always uses a 'live' reference to the data in child lists.
no copying is required in order to create a 'view' of the data in a
list.
This means that any modifications made to a child list will
automatically
be seen in the parent list.
Typically, when using iterators if a change is made to the parent list
during iteration, you might find that a ConcurrentModificationException
get thrown because the parent has been modified whilst you are
iterating.
This was one area of PartialList which bothered me (amongst others) as
we never used the backing list to get our iterators, it was from the
PartialList. So if a modification was made to the parent list, and
we had an iterator on the PartialList ConcurrentModificationException
would not be thrown and we would possibly risk our document getting
out of order.
Conclusion
~~~~~~~~~~
The preservation of order is very important to us and so is getting an
API which is not too heavy, or too CPU bound.
But all if this is in vain if we loose the correct ordering of the
document.
--Jools
More information about the jdom-interest
mailing list