[jdom-interest] non-ascii characters in xml document
Dave Neuendorf
dneuendorf at earthlink.net
Fri Nov 30 14:16:02 PST 2001
I don't know what it will do to performance, but I found that it works to use URLEncoder and
URLDecoder for this purpose. Thanks for everyone's help.
Dave Neuendorf
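
For anyone reading this thread later, the URLEncoder/URLDecoder approach mentioned above might look roughly like the sketch below. The class name UrlEncodeRoundTrip, the helper names, and the sample string are made up for illustration and are not from the original application. The idea is that the percent-encoded form contains only ASCII, so the file's encoding no longer matters for that text.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper class; names are illustrative only.
public class UrlEncodeRoundTrip {

    // Percent-encode the text so that only ASCII characters end up in the XML.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Reverse the encoding when the text is read back out of the document.
    static String decode(String stored) {
        try {
            return URLDecoder.decode(stored, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u2019 is the curly apostrophe, \u201C and \u201D the curly double quotes.
        String original = "It\u2019s a \u201Ccurly\u201D quote";
        String stored = encode(original);
        System.out.println(stored);   // It%E2%80%99s+a+%E2%80%9Ccurly%E2%80%9D+quote
        System.out.println(decode(stored).equals(original));   // true
    }
}
```

The encoded value would then be passed to something like Element.setText() and decoded after getText(). The cost is that the stored text is no longer readable in the raw file, and each non-ASCII character grows to several percent-escapes.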
Ian Lea wrote:
> You might also like to look at the Javaworld article
> "Java Tip 117: Transfer binary data in an XML document"
> at http://www.javaworld.com/javaworld/javatips/jw-javatip117.html
>
> --
> Ian.
> ian.lea at blackwell.co.uk
>
> "John L. Webber - Jentro AG" wrote:
> >
> > Dave,
> >
> > This solution is pretty inelegant and may seem like overkill, but it
> > works pretty well (as long as we're talking about attribute values or
> > text content): try Base64-encoding the "suspect" strings before
> > inserting them, and simply decode them when you need to use the text. We
> > use that method frequently for handling things like encrypted passwords
> > in files, and I've even sent rather large (7000+ lines) files completely
> > Base64-encoded. The performance loss is small, as long as the operations
> > are not too frequent.
> >
> > Regards,
> >
> > John
> >
> > Dave Neuendorf wrote:
> > >
> > > To look at a simpler test case, I commented out my code that saves xml in gzip format,
> > > and just used straight UTF-8 xml to and from a file. The "curly" single and double
> > > quote characters give me exceptions like this:
> > >
> > > [java] org.jdom.JDOMException: Error on line 1 of document
> > > file:/C:/Development/Projects/HierarchicalPIM/default.xml: Character
> > > conversion error: "Unconvertible UTF-8 character beginning with
> > > 0x92" (line number may be too low).
> > > [java] at org.jdom.input.SAXBuilder.build(SAXBuilder.java:296)
> > >
> > > It sees the single and double quote chars as 0x92 and 0x93, respectively. Maybe these
> > > characters aren't Unicode. Could they be Windows-specific character codes, since the
> > > text is being pasted from a Windows application into a Java app?
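
On the 0x92/0x93 question: those byte values match the Windows-1252 (cp1252) codes for the right single and left double quotation marks. Both characters do exist in Unicode, as U+2019 and U+201C, but a lone 0x92 or 0x93 byte cannot begin a character in UTF-8, which fits the parser's complaint. The sketch below only illustrates that mapping. It assumes the file was written in the Windows default encoding while being parsed as UTF-8, which the thread does not confirm, and the class name Cp1252Quotes is made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of the original application.
public class Cp1252Quotes {

    // Assumes the JVM ships the windows-1252 charset, as common JDKs do.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Interpret raw bytes as Windows-1252 text.
    static String decodeCp1252(byte[] raw) {
        return new String(raw, CP1252);
    }

    // Produce the bytes a UTF-8 XML file should contain for the same text.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0x92, (byte) 0x93 };
        String text = decodeCp1252(raw);
        for (char c : text.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);   // U+2019, then U+201C
        }
        // Each of these characters takes three bytes in UTF-8.
        System.out.println(toUtf8(text).length);      // 6
    }
}
```

If that assumption holds, writing the file through an OutputStreamWriter constructed with "UTF-8" (rather than a FileWriter, which uses the platform default encoding) would be one way to keep the bytes consistent with the XML declaration.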