[jdom-interest] Bug in CDATA reading/writing?

Mark Roder roder at is.com
Thu Mar 15 09:44:25 PST 2001


This is not just a pretty printing issue, it is a data integrity issue.

I think both input and output should handle the following cases:

<root>
<Some>This &amp; That</Some>
</root>

<root>
<Some>This <![CDATA[&]]> That</Some>
</root>

If I ask for whitespace to be added, it should write out:
<root>
   <Some>This &amp; That</Some>
</root>

<root>
   <Some>This <![CDATA[&]]> That</Some>
</root>
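
To illustrate why this is about data and not appearance, here is a minimal sketch using the JDK's own DOM parser rather than JDOM. The class name CdataEquivalence and the helper someText are made up for this example. It reads the text of <Some> from the escaped form, the CDATA form, and a form where each piece of content has been pushed onto its own indented line.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataEquivalence {
    // Hypothetical helper: parse the XML and return the character data of the first <Some>.
    static String someText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // getTextContent() concatenates text and CDATA children.
            return doc.getElementsByTagName("Some").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        String escaped = "<root>\n   <Some>This &amp; That</Some>\n</root>";
        String cdata = "<root>\n   <Some>This <![CDATA[&]]> That</Some>\n</root>";
        // Same content, but with every piece moved to its own indented line.
        String split = "<root>\n   <Some>\n       This \n       <![CDATA[&]]>\n"
                + "        That\n   </Some>\n</root>";

        System.out.println("[" + someText(escaped) + "]"); // [This & That]
        System.out.println("[" + someText(cdata) + "]");   // [This & That]
        // The third form carries extra newlines and spaces inside the element's data.
        System.out.println(someText(split).equals(someText(cdata))); // false
    }
}
```

The first two forms hold the same character data, so whitespace added around <Some> is harmless. Whitespace added inside <Some> becomes part of what a reader gets back.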

>
>Perhaps CDATA is a special case where you wouldn't want it ever to appear
>on a separate line if it's surrounded by strings.


Yes - I think CDATA should NOT appear on a line by itself.  A string is
never left hanging on its own line, and a CDATA section should not be
either.
This is what happens to CDATA when it is written out:
<root>
   <Some>
       This &amp; That
   </Some>
</root>


>
>Not really.  If you create an XMLOutputter with indent and new lines,
>then you've expressly said that preserving whitespace isn't important.
>

No, I said I want insignificant whitespace added, not data changed because
whitespace was added.


I think this is a bug.  I will work on a fix.
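
One possible shape for such a fix, as a rough sketch only. It is written in plain Java with made-up types (Node, Text, CData, Element) and a made-up class InlineCdataPrinter; none of this is the actual XMLOutputter code. The rule it assumes is that indentation and newlines are added only between elements, while any element that holds text or CDATA has its content written inline, untouched.

```java
import java.util.Arrays;
import java.util.List;

public class InlineCdataPrinter {
    // Hypothetical stand-ins for content nodes; these are not JDOM classes.
    interface Node {}
    static final class Text implements Node {
        final String value;
        Text(String value) { this.value = value; }
    }
    static final class CData implements Node {
        final String value;
        CData(String value) { this.value = value; }
    }
    static final class Element implements Node {
        final String name;
        final List<Node> children;
        Element(String name, List<Node> children) { this.name = name; this.children = children; }
    }

    static String print(Element root, String indent) {
        StringBuilder out = new StringBuilder();
        write(root, 0, indent, out);
        return out.toString();
    }

    // Pretty-prints an element; whitespace is added only where no character data is present.
    private static void write(Element e, int level, String indent, StringBuilder out) {
        pad(out, level, indent);
        out.append('<').append(e.name).append('>');
        if (hasCharacterData(e) || e.children.isEmpty()) {
            for (Node n : e.children) writeInline(n, out);
        } else {
            out.append('\n');
            // Safe cast: without Text or CData children, every child is an Element.
            for (Node n : e.children) write((Element) n, level + 1, indent, out);
            pad(out, level, indent);
        }
        out.append("</").append(e.name).append(">\n");
    }

    // Writes a node exactly as stored, with no indent before it and no newline after it.
    private static void writeInline(Node n, StringBuilder out) {
        if (n instanceof Text) {
            out.append(escape(((Text) n).value));
        } else if (n instanceof CData) {
            // Sketch only: does not handle a "]]>" sequence inside the CDATA value.
            out.append("<![CDATA[").append(((CData) n).value).append("]]>");
        } else {
            Element e = (Element) n;
            out.append('<').append(e.name).append('>');
            for (Node c : e.children) writeInline(c, out);
            out.append("</").append(e.name).append('>');
        }
    }

    private static boolean hasCharacterData(Element e) {
        for (Node n : e.children) {
            if (n instanceof Text || n instanceof CData) return true;
        }
        return false;
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    private static void pad(StringBuilder out, int level, String indent) {
        for (int i = 0; i < level; i++) out.append(indent);
    }

    public static void main(String[] args) {
        Element some = new Element("Some", Arrays.<Node>asList(
                new Text("This "), new CData("&"), new Text(" That")));
        Element root = new Element("root", Arrays.<Node>asList(some));
        System.out.print(print(root, "   "));
    }
}
```

With a three-space indent this writes <Some> indented under <root>, with the text and the CDATA section kept together on one line. The trade-off is that mixed content is never reflowed, even when the caller asked for new lines.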

Mark


-----Original Message-----
From: Jason Hunter
To: Mark Roder
Cc: 'jdom-interest at jdom.org'
Sent: 3/15/01 10:05 AM
Subject: Re: [jdom-interest] Bug in CDATA reading/writing?

> To me having:
>     protected void printCDATASection(...)
>     {
>         indent(out, indentLevel);
>         out.write(cdata.getSerializedForm());
>         maybePrintln(out);
>     }
> is just like having the code read like this - and this makes no sense
> to me.
>    protected void printString(...)
>    {
>         indent(out, indentLevel);
>         out.write(string);
>         maybePrintln(out);
>    }
> 
> Wouldn't the following changes "fix" this?
>     protected void printCDATASection(...)
>     {
>         out.write(cdata.getSerializedForm());
>     }

The existing code allows for pretty printing if you request it:

<root>
  Here is some text
  <![CDATA[Here is some CDATA]]>
  And here is more text
</root>

If you create "new XMLOutputter()" there'll be no whitespace added (no
indent or new lines).  You have to request having it added.  Perhaps
CDATA is a special case where you wouldn't want it ever to appear on a
separate line if it's surrounded by strings.

> Am I way off base here?  To me it looks like XMLOutputter is tainting
> data.

Not really.  If you create an XMLOutputter with indent and new lines,
then you've expressly said that preserving whitespace isn't important.

Does this make sense?  Let me know if you think you've found a bug.  There
were reports of bugs involving CDATA whitespace, but I couldn't reproduce
them.

-jh-


