[jdom-interest] JDOM JSR

Amy Lewis amyzing at talsever.com
Fri May 18 05:03:34 PDT 2001


On Thu, May 17, 2001 at 10:41:53PM -0500, Brett McLaughlin wrote:
>  Could you sum up your main concerns about scalability? How do factories
>relate to that? Would the subclassing and specifying an implementation to
>use to a builder suffice? I'd like to hear what you see as the problems you
>have run into.

All right.  Quickly this morning; if I'm unclear, ask, and I'll try to
clarify this evening.

JDOM, in my experience, has three major goals: ease of use, lightweight
default implementation, and well-formedness verification.  Of these,
lightweight and easy tend to reinforce each other; well-formedness
provides a tension in the direction of greater complexity, but tends to
reinforce the ease-of-use issue as well.

The main difficulty that I see with the current implementation is the
lack of a generic (oh, no, she's going to *say* it!  Break out the
asbestos frillies!), unifying, 'node' interface for all classes that
participate in the tree.  This is, in fact, largely bearable, except in
one case: String as node.

For all other classes, a custom implementation can do the work of
defining the extensibility mechanism (that is, of defining the shared
interface).  XML is a little odd: it is generally the case that if the
particular node you're handed isn't a 'branch', then its parent is--all
nodes are one step away from a crossroad.

The particular implementations that drive this need have typically
needed to modify both structure and content of the document being
processed, in multiple ways.  The mechanism is often methods with a
relatively simple signature (using DOM): Document doSomething(Document,
Node).  (It can be further simplified to void doSomething(Document,
Node), but that's kinda poor style; sometimes the return value is
neither void nor a Document, and the Document parameter may be changed.)
Sometimes the signature is just doSomething(Node), if the model isn't
a pipeline but hub-and-spoke: the hub determines the transformations
and their order, and the spokes are what would be filters in the
pipeline model.
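
A minimal sketch of the two signature shapes described above, written
against the standard org.w3c.dom interfaces.  The names DocumentFilter,
NodeTransform, FilterSketch, WRAP and UPPERCASE are hypothetical and
exist only for illustration; they come from no library.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical pipeline stage: receives the document plus the node of
// interest and returns the (possibly modified) document for the next stage.
interface DocumentFilter {
    Document apply(Document doc, Node target);
}

// Hypothetical spoke: the hub chooses which transforms run and in what
// order, so the spoke only needs the node itself.
interface NodeTransform {
    void apply(Node target);
}

class FilterSketch {
    // Structural change: wrap the target node in a new <wrapped> element.
    static final DocumentFilter WRAP = (doc, target) -> {
        Node parent = target.getParentNode();
        if (parent == null) {
            return doc; // detached node: nothing to wrap in place
        }
        Element wrapper = doc.createElement("wrapped");
        parent.replaceChild(wrapper, target); // wrapper takes target's slot
        wrapper.appendChild(target);          // target becomes its child
        return doc;
    };

    // Content change: upper-case a text node, leave other nodes alone.
    static final NodeTransform UPPERCASE = target -> {
        if (target.getNodeType() == Node.TEXT_NODE) {
            target.setNodeValue(target.getNodeValue().toUpperCase());
        }
    };

    // Helper that creates an empty DOM document for experiments.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both stages accept a text node exactly as they would accept an element,
because in DOM a text node is itself a Node.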

Using a Builder, I can decorate implementation classes (subclasses)
without too much trouble.  Except for String.  Text nodes end up
special-cased; the developers have to be warned to treat them
completely differently (pass the parent, not the node that you care
about, and maybe do a search to find the part that you care about, if
there are multiple children).  Note that this doesn't require, but does
encourage, the subclasser to create that unifying node interface, even
if it only contains "getParent()" (and a test of some sort, perhaps
instanceof, to see whether there are other available axes--there always
will be for the parent of the given node, if one exists).
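
As a rough sketch of the unifying interface mentioned above: the types
TreeNode, Branch, TextLeaf and NodeSketch below are hypothetical and are
not part of JDOM.  The only assumption is that text is wrapped in a node
class instead of being a bare String, so that it can answer
getParent().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical unifying interface: the one thing every node can answer.
interface TreeNode {
    Branch getParent();
}

// Hypothetical "branch": a node that also has a child axis.
class Branch implements TreeNode {
    private final String name;
    private Branch parent;
    private final List<TreeNode> children = new ArrayList<>();

    Branch(String name) { this.name = name; }

    String getName() { return name; }

    public Branch getParent() { return parent; }

    List<TreeNode> getChildren() {
        return Collections.unmodifiableList(children);
    }

    Branch add(Branch child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    Branch add(TextLeaf text) {
        text.parent = this;
        children.add(text);
        return this;
    }
}

// Hypothetical text node: wraps the characters so they can have a parent.
class TextLeaf implements TreeNode {
    Branch parent; // set by Branch.add
    private final String value;

    TextLeaf(String value) { this.value = value; }

    String getValue() { return value; }

    public Branch getParent() { return parent; }
}

class NodeSketch {
    // Accepts any node, text included; no "pass the parent instead" rule.
    static String locate(TreeNode node) {
        String kind = (node instanceof Branch) ? "branch" : "leaf";
        Branch parent = node.getParent();
        return kind + " under "
                + (parent == null ? "(no parent)" : parent.getName());
    }
}
```

With this shape, a method can be handed the text node itself and still
reach the nearest branch through getParent().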

A part of the concern is driven by the cost of xpathery--using internal
APIs, simple XPaths (developers can be restricted to a subset of
"cheap" XPath expressions) are *fast*.  Instantiating Xalan *isn't*;
even descendant-or-self::node()/*[1] munches tens to hundreds of
milliseconds.
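
To illustrate what a "cheap" path looks like when written directly
against a tree API, here is a sketch of a hand-rolled equivalent of the
relative expression *[1].  It uses the standard org.w3c.dom interfaces
as a stand-in tree model; CheapPaths and its methods are hypothetical
names, and the sketch makes no timing claims of its own.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class CheapPaths {
    // Hand-written equivalent of the relative XPath "*[1]": the first
    // child of the context node that is an element, or null if none.
    static Element firstElementChild(Node context) {
        for (Node n = context.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    // Helper that creates an empty DOM document for experiments.
    static Document emptyDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A walk like this touches only the children of one node and needs no
expression parser or engine set-up.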

Summary: I'm not calling for the creation of a heavyweight API; one
already exists.  I want a lightweight API *that can be extended*. 
Perhaps the extension will make it very heavy (lopsided?  :-) in one
direction (memory, speed, complexity of the decorators); JDOM should
not *prevent* that in the name of any of its goals, if it's possible
not to.

Right now, the chief impediments are the lack of a unifying interface
(meaning that the implementor probably has to define something, which
means the implementor has to understand the API fully), and the
impossibility of unifying String into a Node interface.  I understand
the arguments against defining the interface ... but reject them; I
have no problems with even marker interfaces that contain no methods
and really only provide an instanceof test.  But everything breaks on
the rock of String.

I realize that the choice of String is intended to make things faster
(note that this is not always true; when there's a lot of content
mangling going on in the document, each change creates at least one
additional String object, and management of the problem rapidly becomes
one of the major profiling issues) and lighter, but again, I don't
accept the argument.  Equally good effect could probably be achieved
(for instance) by storing char[], with String getValue() and
StringBuffer getValueBuffer() and corresponding mutators ... on a
"Text" or "Chars" node, not on Element.
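
A minimal sketch of the kind of node suggested above, assuming a
hypothetical class named CharsNode (not JDOM's API).  The accessor names
getValue() and getValueBuffer() follow the paragraph; the two setValue
overloads are one guess at the "corresponding mutators".

```java
// Hypothetical text node backed by a char[] instead of a String.
class CharsNode {
    private char[] chars;

    CharsNode(String value) {
        this.chars = value.toCharArray();
    }

    // Read-only view of the content as a String.
    String getValue() {
        return new String(chars);
    }

    // Mutable copy of the content; edits reach the node only when the
    // buffer is handed back through setValue(StringBuffer).
    StringBuffer getValueBuffer() {
        return new StringBuffer().append(chars);
    }

    void setValue(String value) {
        this.chars = value.toCharArray();
    }

    void setValue(StringBuffer buffer) {
        char[] copy = new char[buffer.length()];
        buffer.getChars(0, buffer.length(), copy, 0);
        this.chars = copy;
    }
}
```

Heavy editing can then happen in one buffer and be written back once,
rather than producing a new String for every intermediate change.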

As a final note: about nine months ago, after I made a nuisance of
myself, management sent one of the more-senior architects to look at
JDOM (I was getting really sick of DOM, and even sicker of some of the
Java==Perl string manipulation tricks that some others were doing to
try to reduce DOMishness).  The critique was a one-liner: "It's fine
for reading configuration files."  Actually, there was more, but that
was the main substance; JDOM hasn't been something that can be
customized, because it's specifically optimized, in several ways, for
reading (or for static construction: build it once, don't change it
afterwards).

Hope that helps,

Amy!
-- 
Amelia A. Lewis         alicorn at mindspring.com           amyzing at talsever.com
I don't know that I ever wanted greatness, on its own.  It seems rather like
wanting to be an engineer, rather than wanting to design something--or
wanting to be a writer, rather than wanting to write.  It should be a
by-product, not a thing in itself.  Otherwise, it's just an ego trip.
                -- Merlin, son of Corwin, Prince of Chaos (Roger Zelazny)


