[jdom-interest] Encoding issue

Michael Kay mike at saxonica.com
Fri Oct 24 08:08:00 PDT 2008


It looks to me as if SAXBuilder().build() doesn't realize that the data is
in UTF-8 and thinks it is in iso-8859-1. So there's something wrong in the
way data is being passed from the Base64 decoding step to the XML parsing
step. Nothing to do with JDOM.
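
To make that concrete, here is a minimal sketch of the two paths, assuming the Base64 payload is available as a String. The class and method names (EncodingSketch, childTextFromBytes, misdecoded) are made up for illustration, java.util.Base64 needs Java 8 or later, and the JDK's own DOM parser stands in for JDOM so the example runs without extra jars. With JDOM the same byte stream would go to SAXBuilder.build(InputStream). I have not seen the poster's code, so the "wrong" path is only a guess at what might be happening.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EncodingSketch {

    // Safe path: keep the decoded data as bytes and hand the parser a byte
    // stream, so it can work out the encoding from the XML declaration.
    static String childTextFromBytes(String base64, String childName) throws Exception {
        byte[] xmlBytes = Base64.getDecoder().decode(base64);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getElementsByTagName(childName).item(0).getTextContent();
    }

    // Suspected bug (an assumption): the bytes are turned into a String
    // using a single-byte charset before the parser ever sees them.
    static String misdecoded(String base64) {
        byte[] xmlBytes = Base64.getDecoder().decode(base64);
        return new String(xmlBytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws Exception {
        // \u00fc is "ü"; the document is encoded as UTF-8 before Base64.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<root><bla>\u00fc</bla></root>";
        String base64 = Base64.getEncoder()
                .encodeToString(xml.getBytes(StandardCharsets.UTF_8));

        // Parsed from bytes, the child text is the single character U+00FC.
        System.out.println(childTextFromBytes(base64, "bla").equals("\u00fc"));

        // Decoded as ISO-8859-1, the two UTF-8 bytes C3 BC become two characters.
        System.out.println(misdecoded(base64).contains("\u00c3\u00bc"));
    }
}
```

The point of the sketch is that no String conversion should sit between the Base64 decoder and the parser. If a Reader or String is unavoidable, it has to be created with the charset the document was really written in.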

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: jdom-interest-bounces at jdom.org 
> [mailto:jdom-interest-bounces at jdom.org] On Behalf Of Piller Sébastien
> Sent: 24 October 2008 11:07
> To: jdom-interest at jdom.org
> Subject: [jdom-interest] Encoding issue
> 
> I've a problem with JDom in one of my webapps.
> 
> It runs under Linux (CentOS), Tomcat 5.5.27, JDOM v1.1, etc.
> 
> My customer sends me a file which is created like this:
> 
> - exported to XML UTF8
> - converted to Base64
> - POSTed to my webapp. (headers are set to the correct encoding)
> 
> I decode it like this:
> - get the data
> - convert it back from base64
> - parse the data with new SAXBuilder().build(...)
> 
> After that, when I get strings using 
> mynode.getChildText("bla"), the text is mis-encoded, i.e. "ü" comes out as "ä".
> 
> I thought JDOM would handle all the necessary conversions 
> itself. I really don't want to convert the extracted strings 
> using Charset.forName().encode or the like...
> 
> Any idea what I am doing wrong?
> 
> Thank you very much ;)



