[jdom-interest] Merging text nodes
Bradley S. Huffman
hip at a.cs.okstate.edu
Sat Feb 16 20:45:26 PST 2002
Elliotte Rusty Harold writes:
> At 7:43 PM -0600 2/16/02, Bradley S. Huffman wrote:
>
> >Just looked at ContentList and it should be fairly easy to do, come
> >to think of it the merge should probably be taken out of Element and
> >moved to ContentList so Text, CDATA adds are consistent.
> >
>
> OK, but if the code's going to go in ContentList then somebody else
> is going to need to write it because I don't have a clue what's going
> on in there and would feel very uncomfortable messing with its
> internals.
No problem, I know that code by heart :) The kicker is add(int,Object),
for example

    element.addContent(new Element("e1"));            // index = 0
    element.addContent(new Text("This is a test"));   // index = 1
    element.addContent(new Element("e2"));            // index = 2
    List list = element.getContent();
    list.add(2, new Text(", only a test"));           // merged into index 1

With Text nodes merged, list.get(2) then yields e2 instead of a Text node.
But I can live with that if others can.
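
To make the index shift in the add(int,Object) case concrete, here is a
minimal sketch of a list that merges adjacent text on insert. It is only a
toy model: MergingContentSketch and TextNode are hypothetical names, plain
Strings stand in for Element, and none of this is the actual ContentList
code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a merging content list; NOT the real JDOM ContentList. */
class MergingContentSketch {

    /** Hypothetical text node with a mutable value. */
    static final class TextNode {
        final StringBuilder value;
        TextNode(String s) { value = new StringBuilder(s); }
        @Override public String toString() { return "Text(" + value + ")"; }
    }

    private final List<Object> items = new ArrayList<>();

    /** Appends at the end; text is merged into a trailing text node. */
    void add(Object o) { add(items.size(), o); }

    /** Inserts at index, unless the new text can merge into a neighbor. */
    void add(int index, Object o) {
        if (index < 0 || index > items.size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (o instanceof TextNode) {
            TextNode t = (TextNode) o;
            // Text node just before the insert point: append to it.
            if (index > 0 && items.get(index - 1) instanceof TextNode) {
                ((TextNode) items.get(index - 1)).value.append(t.value);
                return;
            }
            // Text node at the insert point: prepend to it.
            if (index < items.size() && items.get(index) instanceof TextNode) {
                ((TextNode) items.get(index)).value.insert(0, t.value);
                return;
            }
        }
        items.add(index, o);
    }

    Object get(int index) { return items.get(index); }

    int size() { return items.size(); }

    /** Replays the example from the message with Strings as elements. */
    static MergingContentSketch demo() {
        MergingContentSketch list = new MergingContentSketch();
        list.add("e1");                              // index 0
        list.add(new TextNode("This is a test"));    // index 1
        list.add("e2");                              // index 2
        list.add(2, new TextNode(", only a test"));  // merges into index 1
        return list;
    }

    public static void main(String[] args) {
        MergingContentSketch list = demo();
        System.out.println(list.size() + " items; get(2) = " + list.get(2));
    }
}
```

In this model the list still holds three items after the last add, so
get(2) returns the e2 stand-in and the inserted text ends up inside the
node at index 1.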
> Wherever the change is made, the tricky bit is going to be deciding
> whether to merge adjacent Text and CDATA nodes. If we do merge them,
> do we merge them into a Text node or a CDATA node? Probably the
> latter. If someone's deliberately added a CDATA node, then they're
> probably more concerned about having it preserved than the text node.
Right now adjacent Text and CDATA nodes are not merged because XMLOutputter
escapes <, &, etc. in Text and not in CDATA (but there are other ways to
handle that problem).
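
The output difference behind that restriction can be sketched roughly as
follows. This is an illustration only, not XMLOutputter's code: EscapeSketch,
escapeText, and wrapCData are hypothetical names, and the exact set of
characters XMLOutputter escapes is not taken from its source.

```java
/** Illustrates escaped text output versus a CDATA section; not XMLOutputter. */
class EscapeSketch {

    /** Escapes the characters that cannot appear literally in text content. */
    static String escapeText(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Wraps the value in a CDATA section without escaping it.
     * A value containing "]]>" would need extra handling, omitted here.
     */
    static String wrapCData(String s) {
        return "<![CDATA[" + s + "]]>";
    }

    public static void main(String[] args) {
        String value = "a < b & c";
        System.out.println(escapeText(value));
        System.out.println(wrapCData(value));
    }
}
```

A merged node would have to remember which parts came from CDATA, or pick
one of the two output forms for the whole string.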
In Element.getText() Text and CDATA are concatenated into a single String.
What about EntityRef? For example with
<title>Cats &amp; Dogs</title>
element.getText() yields "Cats Dogs", not "Cats & Dogs", which I
would find more useful.
Would a method like Element.getText(Map) be useful for concatenating Text,
CDATA, and EntityRef into a String?
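
A minimal sketch of what such a getText(Map) might do, assuming the Map goes
from entity name to replacement string (the message does not say what the
Map holds). The Text, CData, and EntityRef classes below are simplified
stand-ins, not the JDOM classes, and unknown entities are left as a visible
reference purely as one possible choice.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical getText(Map) over stand-in node types; not JDOM code. */
class GetTextSketch {

    static final class Text {
        final String value;
        Text(String value) { this.value = value; }
    }

    static final class CData {
        final String value;
        CData(String value) { this.value = value; }
    }

    static final class EntityRef {
        final String name;
        EntityRef(String name) { this.name = name; }
    }

    /** Concatenates Text, CData, and EntityRef nodes; other nodes are skipped. */
    static String getText(List<?> content, Map<String, String> entities) {
        StringBuilder sb = new StringBuilder();
        for (Object node : content) {
            if (node instanceof Text) {
                sb.append(((Text) node).value);
            } else if (node instanceof CData) {
                sb.append(((CData) node).value);
            } else if (node instanceof EntityRef) {
                String name = ((EntityRef) node).name;
                String replacement = entities.get(name);
                // Assumption: an unmapped entity stays as "&name;" in the result.
                sb.append(replacement != null ? replacement : "&" + name + ";");
            }
        }
        return sb.toString();
    }

    /** Content of the title example: Text, EntityRef, Text. */
    static List<Object> titleContent() {
        return Arrays.<Object>asList(
            new Text("Cats "), new EntityRef("amp"), new Text(" Dogs"));
    }

    public static void main(String[] args) {
        Map<String, String> entities = new HashMap<>();
        entities.put("amp", "&");
        System.out.println(getText(titleContent(), entities));
    }
}
```

With "amp" mapped to "&", the title example comes out as "Cats & Dogs";
with an empty Map the reference is kept in the string.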
Brad