[jdom-interest] setText() to replace children?

Patrick Dowler patrick.dowler at nrc.ca
Tue Jul 10 11:15:02 PDT 2001


I've stayed out of this one so far, but I have to support Alex here.

On 10 July 2001 10:12, Alex Rosen wrote:
> Argh! Once again, the conflict between data-oriented and document-oriented
> XML rears its head.
>
> I propose one thing: if your XML uses (or can use) mixed content, you
> should *NOT* be calling getText() or setText() or whatever we call them.
> Using mixed content is complex, and I think that trying to make these
> methods work better for mixed content will only end up making them worse.
> For the non-mixed content case, the get/setText() and get/setChildren()
> methods are really convenient, but in the mixed content case they just
> sweep too much important information under the rug.

If you look at the Collections API (List in particular), there is no facility
to "set" the content of the list. You can "set" an individual element, but for more
list-oriented calls you get to use addAll, retainAll, and removeAll. 

For mixed content, we are talking about lists of things and one doesn't
really "set" a whole list. In List, you would have to explicitly do a clear
(or a removeAll) followed by an addAll.
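
To make the List comparison concrete, here is a minimal sketch using only
java.util. The class name ReplaceListContents and the helper replaceContents
are made up for illustration; they are not part of JDOM or of the Collections
API. It just shows that replacing one element is a single set() call, while
replacing the whole content takes two explicit steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
class ReplaceListContents {

    // List has no bulk "set", so replacing everything is clear() + addAll().
    // Assumes target and replacement are different list objects.
    static <T> void replaceContents(List<T> target, List<? extends T> replacement) {
        target.clear();
        target.addAll(replacement);
    }

    public static void main(String[] args) {
        List<String> content =
                new ArrayList<>(Arrays.asList("Do ", "NOT", " eat the yellow snow!"));

        // set(int, E) replaces a single element in place.
        content.set(1, "not");

        // Replacing the whole list is two explicit operations.
        replaceContents(content, Arrays.asList("Eat only white snow."));

        System.out.println(content); // prints [Eat only white snow.]
    }
}
```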

> > For example, consider this XHTML element:
> >
> > <p>
> >   Do <strong>NOT</strong> eat the yellow snow!
> > </p>
> >
> > Currently invoking getText() on the p element would produce
> > the string "Do eat  the yellow snow!" This is very
> > unexpected. Worse yet, it might not be noticed at first
> > glance. It is a hidden bug that could produce potentially
> > catastrophic results.
>
> I think that your proposed change has the same problems. It works well for
> your example, but it will lull users into using getText() for mixed
> content, when I believe they should not. If the XHTML gets more complex,
> then your proposed solution falls apart. For example, in the XHTML
> "<p>ONE</p><p>TWO</p>", the whitespace value of getText() would probably
> not be what you're looking for. What if the XHTML contains a <ul> element or
> a <table>? Are we leading people down the wrong path if we make it work for
> the simple cases but not for the more complicated ones? I think we should
> keep things as-is.

Exactly. The only plausible way I can see to have the convenience methods 
for simple cases would be to make them fail-fast in the non-simple case. 
That would probably mean a RuntimeException, so as not to make calling code
ugly (it is a bug in logic, after all), and I doubt people want to deal with
that... It also means
extra overhead for those using the simple access safely, say in a data-oriented
application... 
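
For what it's worth, a fail-fast accessor along these lines might look like
the sketch below. It is only an illustration under stated assumptions: it
uses the JDK's W3C DOM rather than JDOM so that it runs without extra jars,
and the class StrictText and the method getTextStrict are hypothetical names,
not anything in the JDOM API. The extra overhead mentioned above is the scan
over the children on every call.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JDOM; built on the JDK's W3C DOM.
class StrictText {

    // Returns the element's text, or throws an unchecked exception
    // as soon as a child element is found (the "non-simple" case).
    static String getTextStrict(Element element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.ELEMENT_NODE) {
                // Fail fast: callers are not forced to catch this.
                throw new IllegalStateException(
                        "<" + element.getTagName() + "> has child elements");
            }
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    // Parses a string and returns its root element; parser errors are
    // rethrown unchecked to keep the demo short.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Simple, data-oriented content works as before.
        System.out.println(getTextStrict(parse("<name>Patrick</name>")));

        // Mixed content is rejected instead of silently losing "NOT".
        try {
            getTextStrict(parse("<p>Do <strong>NOT</strong> eat the yellow snow!</p>"));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```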

-- 
Patrick Dowler
Canadian Astronomy Data Centre
National Research Council
Victoria, BC


