[jdom-interest] a CDATA implementation

Schaffer, Dan Dan.Schaffer at time0.com
Thu Aug 24 05:38:11 PDT 2000


I wrote a CDATA implementation for jdom.  I'd like to get some feedback on
the implementation and usage.  I have a working version that I've been using
on my own, and I have described and shown the code to Jason and Brett.

I'm not really a big fan of CDATA sections, but here is some justification.
I've been using jdom on a project for a month or so.  The project takes
user-maintained xml files as input, and some of the elements contain series
of html tags and servlet urls (full of <, >, &).  Users often use CDATA
sections to keep the text readable, and they were not too happy when I
changed their pretty "<![CDATA[<html><head>]]>" patterns into
"&lt;html&gt;&lt;head&gt;".

Whether or not anyone likes using CDATA sections, it is nice to have
the option to use them (and if the org.w3c.dom api can do it, so should jdom).
It's also nice to be able to preserve CDATA sections and not lose
readability after jdom writes the xml.

There were two main goals for the cdata support.  First, during input of the
xml, maintain the cdata sections so that output restores the CDATA (as in the
example above).  The api should return the same text value (via Element's
getText) whether the text was written with entities or in cdata sections.
Second, allow the addition of new CDATA text (via Element's addContent
method) so that the text will be surrounded by <![CDATA[ and ]]> in the
output.

I added a class called org.jdom.CDATA (it is the Comment class modified a
little).  CDATA has a constructor CDATA(String) and the methods getText,
setText, toString, getSerializedForm, equals, hashCode, and clone.
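
To give a rough idea of the shape of such a class, here is a simplified
sketch.  It is not the actual org.jdom.CDATA source: the name CDATASketch,
the field, the toString format, and the method bodies are all assumptions,
and equals and hashCode are simply inherited from Object here.

```java
/**
 * Simplified stand-in for the CDATA class described above.
 * Illustrative only: names and method bodies are assumptions,
 * not the real org.jdom.CDATA code.
 */
final class CDATASketch implements Cloneable {
    private String text;

    CDATASketch(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }

    void setText(String text) {
        this.text = text;
    }

    /** The form written to the output: the raw text inside CDATA markers. */
    String getSerializedForm() {
        return "<![CDATA[" + text + "]]>";
    }

    /** Debug representation; the exact format is an assumption. */
    @Override
    public String toString() {
        return "[CDATA: " + getSerializedForm() + "]";
    }

    /** Field-by-field copy via Object.clone (String is immutable, so this is enough). */
    @Override
    public CDATASketch clone() {
        try {
            return (CDATASketch) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}
```

The point of keeping the raw text in the object and only adding the markers
in getSerializedForm is that getText never has to strip anything.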

Basic usage is like:

Element element = new Element("example");
element.addContent(new CDATA("a <cdata> section"));

element.getText() or getTextTrim()
returns "a <cdata> section"

if you output the document using XMLOutputter the element will
contain:
<![CDATA[a <cdata> section]]>

if you call:
element.getMixedContent()
it returns a List containing 1 object of type CDATA.
In general the list may contain CDATA, Entities, and Strings
(CDATA sections used to be converted to Strings; now they are CDATA objects).
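
The relationship between the mixed content list and getText can be pictured
as below.  This is a self-contained sketch, not jdom code: MixedContentSketch,
CdataNode, and textOf are hypothetical names, and concatenating every String
and CDATA item in order is an assumption about how the text value is built.

```java
import java.util.ArrayList;
import java.util.List;

final class MixedContentSketch {
    /** Hypothetical stand-in for the CDATA class. */
    static final class CdataNode {
        final String text;

        CdataNode(String text) {
            this.text = text;
        }
    }

    /**
     * Builds the text value from a mixed content list.
     * Strings and CDATA nodes contribute their text; any other
     * item type is skipped in this sketch.
     */
    static String textOf(List<Object> content) {
        StringBuilder sb = new StringBuilder();
        for (Object item : content) {
            if (item instanceof String) {
                sb.append((String) item);
            } else if (item instanceof CdataNode) {
                sb.append(((CdataNode) item).text);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Object> content = new ArrayList<>();
        content.add("see ");
        content.add(new CdataNode("a <cdata> section"));
        System.out.println(textOf(content)); // prints: see a <cdata> section
    }
}
```

A caller that only wants the text never needs to know which pieces came from
CDATA sections; only code that walks the list sees the difference.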

Modifications to existing code involved Element.java, the
input classes, and the output classes.  An addContent(CDATA) method was
added to Element.  Some other methods were modified to handle CDATA
objects.  The input classes create CDATA objects when encountering a
CDATA section instead of converting them to Strings.  The output classes
maintain the CDATA sections (surrounding the text with <![CDATA[ and ]]>)
when CDATA objects are encountered.
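
On the output side, the difference comes down to something like the following
sketch.  It is not the XMLOutputter code: OutputSketch, CdataNode, escape, and
write are hypothetical names, and the set of escaped characters is an
assumption.

```java
final class OutputSketch {
    /** Hypothetical stand-in for the CDATA class. */
    static final class CdataNode {
        final String text;

        CdataNode(String text) {
            this.text = text;
        }
    }

    /** Escapes the characters that are special in element text. */
    static String escape(String s) {
        // '&' must be replaced first so the other entities are not re-escaped.
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    /**
     * Serializes one content item: CDATA nodes are written verbatim
     * inside CDATA markers, everything else is entity-escaped.
     * Not handled here: CDATA text that itself contains "]]>".
     */
    static String write(Object item) {
        if (item instanceof CdataNode) {
            return "<![CDATA[" + ((CdataNode) item).text + "]]>";
        }
        return escape(String.valueOf(item));
    }

    public static void main(String[] args) {
        System.out.println(write("<html><head>"));
        // prints: &lt;html&gt;&lt;head&gt;
        System.out.println(write(new CdataNode("<html><head>")));
        // prints: <![CDATA[<html><head>]]>
    }
}
```

The same text comes out either way as far as a parser is concerned; the CDATA
branch just keeps it readable for whoever maintains the file by hand.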

Let me know if you have any questions or ideas.

-Dan



