[jdom-interest] Merging text nodes

Bradley S. Huffman hip at a.cs.okstate.edu
Sat Feb 16 20:45:26 PST 2002


Elliotte Rusty Harold writes:

> At 7:43 PM -0600 2/16/02, Bradley S. Huffman wrote:
> 
> >Just looked at ContentList and it should be fairly easy to do, come
> >to think of it the merge should probably be taken out of Element and
> >moved to ContentList so Text, CDATA adds are consistent.
> >
> 
> OK, but if the code's going to go in ContentList then somebody else 
> is going to need to write it because I don't have a clue what's going 
> on in there and would feel very uncomfortable messing with its 
> internals.

No problem, I know that code by heart :) The kicker is add(int,Object).
For example:

    element.addContent( new Element("e1"));          // index = 0
    element.addContent( new Text("This is a test")); // index = 1
    element.addContent( new Element("e2"));          // index = 2
    List list = element.getContent();
    list.add(2, new Text(", only a test"));          // requested index = 2

With Text nodes merged, list.get(2) then yields e2 instead of a Text node,
because the new text is folded into the existing Text node at index 1.
But I can live with that if others can.
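
To make the index shift concrete, here is a minimal sketch of a merging
add(int,Object). Elem, Txt, and the add helper are hypothetical stand-ins
written for this example; they are not the JDOM classes or the actual
ContentList code, and the merge rule (fold into the left neighbour first,
then the right) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class MergeSketch {
    // Hypothetical stand-ins for Element and Text; not the JDOM classes.
    static final class Elem {
        final String name;
        Elem(String name) { this.name = name; }
    }

    static final class Txt {
        String value;
        Txt(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    // Insert node at index, folding a Txt into an adjacent Txt when there is one.
    static void add(List<Object> content, int index, Object node) {
        if (node instanceof Txt) {
            String s = ((Txt) node).value;
            if (index > 0 && content.get(index - 1) instanceof Txt) {
                ((Txt) content.get(index - 1)).value += s;  // append to left neighbour
                return;
            }
            if (index < content.size() && content.get(index) instanceof Txt) {
                Txt right = (Txt) content.get(index);
                right.value = s + right.value;              // prepend to right neighbour
                return;
            }
        }
        content.add(index, node);                           // no Txt neighbour: plain insert
    }

    // Replays the sequence from the example above.
    static List<Object> demo() {
        List<Object> content = new ArrayList<Object>();
        add(content, 0, new Elem("e1"));
        add(content, 1, new Txt("This is a test"));
        add(content, 2, new Elem("e2"));
        add(content, 2, new Txt(", only a test"));
        return content;
    }

    public static void main(String[] args) {
        List<Object> content = demo();
        System.out.println(content.size());                  // 3, not 4
        System.out.println(content.get(1));                  // This is a test, only a test
        System.out.println(content.get(2) instanceof Elem);  // true
    }
}
```

The caller asked for index 2 but the text ended up at index 1, so the usual
List contract (get(i) returns what add(i, x) just stored) no longer holds.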

> Wherever the change is made, the tricky bit is going to be deciding 
> whether to merge adjacent Text and CDATA nodes. If we do merge them, 
> do we merge them into a Text node or a CDATA node? Probably the 
> latter. If someone's deliberately added a CDATA node, then they're 
> probably more concerned about having it preserved than the text node.

Right now adjacent Text and CDATA nodes are not merged because XMLOutputter
escapes <, &, etc. in Text and not in CDATA (but there are other ways to
handle that problem).
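
A small sketch of why the two node types cannot simply share one buffer.
The helper names are made up for illustration, and the exact set of
characters escaped here (&, <, >) is an assumption rather than a copy of
what XMLOutputter does.

```java
final class EscapeSketch {
    // Text output: markup characters are replaced by entity references.
    static String escapeText(String s) {
        // '&' must be handled first so the other replacements are not re-escaped.
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // CDATA output: content is written verbatim inside the section markers.
    // A value containing "]]>" would need special handling, which is omitted here.
    static String cdata(String s) {
        return "<![CDATA[" + s + "]]>";
    }

    public static void main(String[] args) {
        String value = "a < b & c";
        System.out.println(escapeText(value)); // a &lt; b &amp; c
        System.out.println(cdata(value));      // <![CDATA[a < b & c]]>
    }
}
```

Once Text and CDATA are merged into one node, the outputter can no longer
tell which part of the string should be escaped and which should not.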

In Element.getText() Text and CDATA are concatenated into a single String.
What about EntityRef?  For example with

    <title>Cats &amp; Dogs</title>

element.getText() yields "Cats  Dogs", not "Cats &amp; Dogs", which I
would find more useful.

Would a method like Element.getText(Map) be useful for concatenating Text,
CDATA, and EntityRef into a String?
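
Roughly what I have in mind, as a sketch only. The node classes below are
hypothetical stand-ins, not the JDOM ones, and the meaning of the Map
(entity name to replacement string, with a fallback to the literal
reference when the name is missing) is my assumption about how such a
method could behave.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GetTextSketch {
    // Hypothetical stand-ins for Text, CDATA and EntityRef.
    static final class Txt {
        final String value;
        Txt(String value) { this.value = value; }
    }
    static final class CData {
        final String value;
        CData(String value) { this.value = value; }
    }
    static final class EntityRef {
        final String name;
        EntityRef(String name) { this.name = name; }
    }

    // Concatenate Txt, CData and EntityRef children into one String.
    // An entity found in the map is replaced by its mapped value; otherwise
    // the reference is kept in its literal "&name;" form. Other nodes are skipped.
    static String getText(List<Object> content, Map<String, String> entities) {
        StringBuilder sb = new StringBuilder();
        for (Object node : content) {
            if (node instanceof Txt) {
                sb.append(((Txt) node).value);
            } else if (node instanceof CData) {
                sb.append(((CData) node).value);
            } else if (node instanceof EntityRef) {
                String name = ((EntityRef) node).name;
                String replacement = entities.get(name);
                sb.append(replacement != null ? replacement : "&" + name + ";");
            }
        }
        return sb.toString();
    }

    // Content of the <title> example, modelled as Txt, EntityRef, Txt.
    static List<Object> sample() {
        List<Object> content = new ArrayList<Object>();
        content.add(new Txt("Cats "));
        content.add(new EntityRef("amp"));
        content.add(new Txt(" Dogs"));
        return content;
    }

    public static void main(String[] args) {
        Map<String, String> none = new HashMap<String, String>();
        System.out.println(getText(sample(), none));     // Cats &amp; Dogs

        Map<String, String> expand = new HashMap<String, String>();
        expand.put("amp", "&");
        System.out.println(getText(sample(), expand));   // Cats & Dogs
    }
}
```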

Brad


