[jdom-interest] raw bytes turned into string and inserted into xml
Elliotte Rusty Harold
elharo at metalab.unc.edu
Mon Mar 18 12:20:57 PST 2002
At 4:47 PM +0200 3/18/02, Jeff Singer wrote:
>Hi all,
>
>I have a situation in which my application's input is a raw stream of
>bytes, these bytes are actually ascii strings which occasionally will
>contain characters which are illegal to xml like 0x4, 0x0, 0x1 - quite a
>few in this range. I use the string constructor which takes an encoding
>to form them into java.lang.String objects. This works fine, I then
>insert them as content onto a JDOM Element object and eventually after
Aha! You've just demonstrated one case in which JDOM does need to
verify character data content. From the moment you inserted the first
0x4, 0x0, etc., your document was no longer well-formed XML, and JDOM
should have immediately thrown an exception to let you know this. I
apologize that it didn't.
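Until JDOM performs that check itself, a check along these lines could be run on each string before it is set as element content. This is only a minimal sketch using the standard library; the class and method names (XmlCharCheck, isXmlChar, firstIllegalIndex) are made up for illustration and are not part of JDOM. The ranges are those of the Char production in the XML 1.0 specification.

```java
final class XmlCharCheck {

    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first illegal character, or -1 if there is none. */
    static int firstIllegalIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalIndex("plain text"));    // prints -1
        System.out.println(firstIllegalIndex("bad\u0004data")); // prints 3
    }
}
```

A caller could throw an exception, or switch to one of the workarounds discussed in this message, whenever firstIllegalIndex returns something other than -1.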
As to how to fix your code, that depends on what the strings are for
and what you're ultimately doing with them. Some have suggested that
you Base64-encode your data. This is one possibility. Another is
that you replace each illegal character with an element like this:
<control value="4"/>
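One way the replacement could be done is sketched below. It builds a serialized XML fragment as a string, escaping ordinary markup characters and substituting a control element for each illegal character. The class name ControlElementEscaper, the method toXmlFragment, and the choice to write the value as a decimal number are all assumptions for illustration; with JDOM you would more likely add Text and Element children to the parent instead of building a string.

```java
final class ControlElementEscaper {

    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Escapes raw text, replacing each illegal character with a control element. */
    static String toXmlFragment(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); ) {
            int cp = raw.codePointAt(i);
            i += Character.charCount(cp);
            if (!isXmlChar(cp)) {
                // Assumption: the value attribute holds the decimal code point.
                out.append("<control value=\"").append(cp).append("\"/>");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXmlFragment("ab\u0004c<d"));
    }
}
```

The reader of the document then has to know that control elements stand for single characters and reverse the mapping when it needs the original bytes back.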
There are other solutions. However, the simple fact is you can't
include these characters directly in an XML 1.0 document.
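For the Base64 route, here is a minimal sketch that assumes a modern JDK with java.util.Base64 (added in Java 8, so not available when this message was written; at the time a third-party encoder would have been needed). The class name Base64Payload is hypothetical. The encoded text contains only letters, digits, '+', '/' and '=', all of which are legal XML characters, so it can be used as element content directly.

```java
import java.util.Base64;

final class Base64Payload {

    /** Encodes arbitrary bytes as text that is safe to use as XML character data. */
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Recovers the original bytes from the encoded element content. */
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] raw = { 'H', 'i', 0x04, 0x00 };   // includes two illegal characters
        String encoded = encode(raw);
        System.out.println(encoded);             // prints SGkEAA==
        System.out.println(decode(encoded).length); // prints 4
    }
}
```

The trade-off is that the content is no longer human-readable in the document, whereas the control-element approach keeps the legal characters visible.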
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible, 2nd Edition (Hungry Minds, 2001) |
| http://www.cafeconleche.org/books/bible2/ |
| http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.cafeconleche.org/ |
+----------------------------------+---------------------------------+