[jdom-interest] JDOM BOF at SD West: problem resolving
Jason Hunter
jhunter at collab.net
Thu Apr 12 21:37:19 PDT 2001
> > * Do we keep original list data in case of illegal add?
> >
> > YES, absolutely.
> >
>
> That's how I sided initially, but the fact that IllegalAddException is
> unchecked, plus concerns about interaction with subclass implementions
> caused me to change my mind. Please see:
>
> http://lists.denveronline.net/lists/jdom-interest/2001-April/005620.html
>
> In the case of setMixedContent, I suggest we view it as a composite of more
> primitive operations. We should strive to make the primitive 'add'
> operations failure atomic, but, on failure, we should simply abort
> setMixedContent at the point of failure.
Hmm...
So the steps would be:
* Remove all content from the original tree
* Thereby removing all parentage
* Add content from the new list
* Adding parentage as you go
* Let exceptions bubble up
* Leave the old content removed and the new content partially added
* But as a special case in the finally block of doc.setMixedContent()
make sure that a Document is never left without a root element, even if
that root is new Element("bogus")
I could live with that. In fact, that was my first position on the
issue. But I was won over by the argument that, all things being equal,
it's better to be atomic. Also in favor of atomic is the logic that to
a user a set() method would seem to just do this.content = content so
they don't see it as a series of adds.
???
> By the way, tweaking JDOM for subclassing is a TODO topic in itself. There
> are some specific things we can do, such as never calling overridable
> methods from the constructors or the "pseudo-constructors" (clone and
> readObject). There are also some design/documentation issues concerning
> what subclasses are allowed to do. I'll send this TODO item along with some
> others in a separate message.
Yes, this is going to be very interesting.
> > * Singly or doubly linked list?
> >
> > Singly was unanimously agreed on, at least for now.
> > In the rare circumstances where you need to iterate
> > backward or do indexed access, you can always get
> > the List, copy it to your own List, and iterate backward.
> > Makes the performance O(n).
> >
>
> When does SinglyLinkedList beat ArrayList? I'd like to see the
> memory/performance benchmarks that are driving the SinglyLinkedList design.
Jools has his partially done FilterList in the jdom-wip module. You can
check it out and compare for yourself.
> I know it's rude to beatup on something that isn't even implemented yet. I
> support the iterative approach of implementing and testing, and so on.
> Still... Based on the comparison below, I don't see what SinglyLinkedList
> has to offer.
>
> Space: ArrayList is almost always more space efficient than *any* linked
> list. One exception: remove most of the elements from the ArrayList without
> calling trimToCapacity.
An ArrayList *can* be more space efficient, but only at the cost of
CPU. In real life you never call trimToSize() because it takes time to
allocate new memory for the trimmed version and takes time later to
realloc space if you're going to add any items. So what you generally
have is a list of 10 slots of which generally only 3 are filled, wasting
the rest of the space.
My empirical testing between LinkedList and ArrayList without doing a
trim showed them roughly equal on the documents I threw at them. A
SinglyLinkedList will be far more space efficient than LinkedList so
we'll probably do better on memory.
> Time: ArrayList outperforms SinglyLinked list on *all* operations except
> add(beginning), remove(beginning), and clear() -- which aren't the most
> common ones.
add(end) can be fast too because a SinglyLinked can store it's tail
pointer. Happily add(end) is what's done in a tight loop during the
build.
My empirical testing between LL and AL showed similar results too with
LL being slightly faster. Jools said SLL was much faster than LL, so
that looks good too.
> Processing in reverse order is the real killer with SinglyLinkedList. (And
> who would suspect, unless they happened to read the warning in the JDOM
> documentation?) I think processing in reverse order is a common enough
> situation that it makes SinglyLinkedList a bad choice. Time-ordered
> documents are one example where it's fairly common to start with the
> messages at the end (the most recent) and work backwards.
Yes, that's a good use case. The easy workaround for someone with such
a long list that this slowness is noticeable is to get the List, copy it
into a LinkedList, and iterate backward. Not great, but not bad either.
Depending on the memory and CPU savings of SLL, we'll have to decide if
it's worth it. I'd be happy if you want to do some investigation.
> > * Keep getSerializedForm() methods?
> >
> > No, deprecate and remove them.
> >
>
> OK. So I guess it doesn't matter that Attribute.getSerializedForm is broken
> with respect to the special characters [& < > "] and the characters greater
> than '\u007f' ...
It does matter. That logic will have to move into XMLOutputter. Did
you post about that problem before?
-jh-
More information about the jdom-interest
mailing list