[jdom-interest] Code submission: JDOM2 the dual tree implementation...

James Strachan james at metastuff.com
Tue Nov 28 02:14:07 PST 2000


----- Original Message -----
From: "Jason Hunter" <jhunter at acm.org>
> (Thinking out loud about the 2-level JDOM.)
>
> As I'm thinking about the 2-level JDOM I'm wondering if it can solve
> other pressing problems we have.  For example, one other performance
> issue with JDOM is that some want JDOM to validate all names and content
> on construction to ensure only well-formed documents are created.  This
> causes a runtime performance penalty that others would like to avoid.
>
> One approach to solving this (following the model of the 2-level split)
> is to have a base class set that doesn't do the checking and a subclass
> set that does (or vice versa).  The same arguments employed right now to
> split JDOM classes for parentage would seem to apply to well-formedness
> checking.
>
> But the 2-level solution doesn't scale well to 3-levels.  I don't want a
> "simple" baseclass and an "everything" subclass.  There will be people
> who want parentage but not well-formedness, and there will be people who
> want well-formedness but not parentage, and there will be people who
> want both.  After debate we could end up with a simple baseclass, a
> formedness subclass, a parentage subclass, and a formedness+parentage
> subclass.

I'd say that validation is a complex area. Some may want to validate as they
build, never allowing an incorrect data structure to be built. Others want
to parse a whole document tree and then validate it to find out what's wrong
(e.g. user validation in more UI-oriented areas).
The base class approach doesn't seem a good fit to me for tackling
validation issues. I'd propose the following:

Have a Validator interface or abstract base class, which is either a
property of Document (for use in doubly linked trees) or used by a custom
builder (e.g. SAXBuilder). Anyone writing their own custom builder logic
integrates with the Validator if they wish to support validation.
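
A rough sketch of what such a pluggable Validator might look like when used
from a builder. Every name here (Validator, SimpleNameValidator,
SimpleBuilder, SimpleElement) is hypothetical and not existing JDOM API, and
the name check is a deliberately simplified stand-in for the real XML name
rules:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; not part of JDOM.
interface Validator {
    // Throws IllegalArgumentException if the element name is rejected.
    void checkElementName(String name);
}

// Simplified check: a letter or '_' first, then letters, digits, '-', '_' or '.'.
// This is an assumption for illustration, not the full XML Name production.
class SimpleNameValidator implements Validator {
    @Override
    public void checkElementName(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("element name is empty");
        }
        char first = name.charAt(0);
        if (!Character.isLetter(first) && first != '_') {
            throw new IllegalArgumentException("bad first character in: " + name);
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!Character.isLetterOrDigit(c) && "-_.".indexOf(c) < 0) {
                throw new IllegalArgumentException("bad character in: " + name);
            }
        }
    }
}

// Minimal element with no parent reference.
class SimpleElement {
    final String name;
    final List<SimpleElement> children = new ArrayList<>();

    SimpleElement(String name) {
        this.name = name;
    }
}

// A custom builder that integrates with a Validator only if one is supplied.
class SimpleBuilder {
    private final Validator validator; // null means "build without checking"

    SimpleBuilder(Validator validator) {
        this.validator = validator;
    }

    SimpleElement element(String name) {
        if (validator != null) {
            validator.checkElementName(name);
        }
        return new SimpleElement(name);
    }
}
```

With this shape, new SimpleBuilder(null) pays no checking cost at all, while
new SimpleBuilder(new SimpleNameValidator()) refuses to create an element
with a bad name.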

Then we have 2 options.

1) singly linked trees - have no reference to the owning document so the
builder code / object must do the validation. Or the validation is done
explicitly after the tree is built. This is fine as singly linked trees are
usually used for 'simple' XML data processing anyway - speed and performance
are paramount so post-build validation or builder-level validation is OK.

2) doubly linked trees - the document can (if desired) contain a validator
so it can throw exceptions rather than build an invalid tree. The builder
can do the same too if need be. Again, it's all configurable to keep
everyone happy.

JDOM would use 2) by default, and custom plugin validators can be written
(XML Schema, DTD and other custom validators).
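
To make the two options concrete, here is a minimal sketch under stated
assumptions. All class names (Validator, LightElement, PostBuildValidation,
LinkedElement, LinkedDocument) are hypothetical and not existing JDOM API.
Option 1 validates explicitly after the tree is built; option 2 lets the
document's validator throw before an invalid child is attached:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; not part of JDOM.
interface Validator {
    // Throws IllegalArgumentException if the element name is rejected.
    void checkElementName(String name);
}

// Option 1: singly linked node with no parent or document reference.
class LightElement {
    final String name;
    final List<LightElement> children = new ArrayList<>();

    LightElement(String name) {
        this.name = name;
    }

    LightElement add(LightElement child) {
        children.add(child); // no checking here; nothing to check against
        return this;
    }
}

class PostBuildValidation {
    // Walks the finished tree and collects every problem found.
    static List<String> validate(LightElement root, Validator v) {
        List<String> problems = new ArrayList<>();
        collect(root, v, problems);
        return problems;
    }

    private static void collect(LightElement e, Validator v, List<String> problems) {
        try {
            v.checkElementName(e.name);
        } catch (IllegalArgumentException ex) {
            problems.add(ex.getMessage());
        }
        for (LightElement child : e.children) {
            collect(child, v, problems);
        }
    }
}

// Option 2: doubly linked node that can reach its document via the parent chain.
class LinkedElement {
    final String name;
    LinkedElement parent;
    LinkedDocument document; // set only on the root element
    final List<LinkedElement> children = new ArrayList<>();

    LinkedElement(String name) {
        this.name = name;
    }

    LinkedDocument owner() {
        LinkedElement e = this;
        while (e.parent != null) {
            e = e.parent;
        }
        return e.document; // null if this subtree is not attached to a document
    }

    // Sketch only: checks the child itself, not a subtree already built under it.
    LinkedElement add(LinkedElement child) {
        LinkedDocument doc = owner();
        if (doc != null && doc.validator != null) {
            doc.validator.checkElementName(child.name); // throws before attaching
        }
        child.parent = this;
        children.add(child);
        return this;
    }
}

class LinkedDocument {
    final Validator validator; // null means no checking
    LinkedElement root;

    LinkedDocument(Validator validator) {
        this.validator = validator;
    }

    void setRoot(LinkedElement e) {
        if (validator != null) {
            validator.checkElementName(e.name);
        }
        e.document = this;
        root = e;
    }
}
```

The singly linked version reports all problems at once, which suits the
"parse first, then find out what's wrong" style; the doubly linked version
never lets a rejected child into an attached tree.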

> Let's look then at user data.  Some want it, some don't.  Some who need
> getParent() -- because they want to modify attribs in place for example
> -- don't want user data, and perhaps vice versa.  Do we want to see a
> base JDOM, a parentage JDOM subclass set, and a user data JDOM subclass
> set?  What about people who want user data but not parentage?

People who want to add user data need to decide what instance data they wish
to store in their Element, Attribute, ..., instances. They can derive from a
lightweight, singly linked tree or a doubly linked tree. I don't think that's
a big problem; it's just a choice. If they are very concerned about RAM and
performance, they'll go singly linked. If they want power & functionality
(XPath et al) then they'll go doubly linked. It's a fair trade-off and
developers need to make choices sometimes ;-)
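
As a small illustration of deriving to add user data, assuming a hypothetical
lightweight base class (BaseElement and UserDataElement are made-up names,
not existing JDOM classes):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a lightweight, singly linked element class.
class BaseElement {
    private final String name;
    private final List<BaseElement> children = new ArrayList<>();

    BaseElement(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void addChild(BaseElement child) {
        children.add(child);
    }

    List<BaseElement> getChildren() {
        return children;
    }
}

// The extra field lives in the subclass, so plain BaseElement instances
// carry no user data slot at all.
class UserDataElement extends BaseElement {
    private Object userData;

    UserDataElement(String name) {
        super(name);
    }

    Object getUserData() {
        return userData;
    }

    void setUserData(Object userData) {
        this.userData = userData;
    }
}
```

The same subclass could just as well extend a doubly linked base class; the
choice of base decides the memory and feature trade-off.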

> What about well-formedness?

Well-formedness has nothing to do with a singly linked or doubly linked tree
implementation, does it?

> I'm starting to think that for each contentious issue people who want
> the less popular model are going to see a baseclass/subclass split as
> desirable, but unfortunately it's a model that doesn't scale well.

I disagree. I don't think there is a scaling problem.

> (Again, just thinking out loud.)

Cool - me too, fun isn't it ;-)


<James/>


James Strachan
=============
email: james at metastuff.com
web: http://www.metastuff.com






