[jdom-interest] XMLOutputter and utf-8

Jason Hunter jhunter at xquery.com
Fri May 20 00:21:23 PDT 2005


You're not actually outputting the file to a byte stream.  You're 
outputting it to a String, then printing the string using 
System.out.println().  System.out is a PrintStream and per the 
PrintStream Javadocs, "All characters printed by a PrintStream are 
converted into bytes using the platform's default character encoding."

Try this: out.output(doc, System.out);

That way JDOM gets to control the bytes being output.
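
To see the effect in isolation, here is a minimal sketch that needs only 
the JDK, not JDOM.  The class and method names (PrintStreamEncodingDemo, 
printedBytes, hex) are made up for illustration, and ISO-8859-1 is only 
an assumed stand-in for a platform default encoding; your default may 
differ.  It prints the same E-acute through two PrintStreams and shows 
which bytes reach the underlying stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class PrintStreamEncodingDemo {

    // Print text through a PrintStream that uses the named charset and
    // return the bytes that actually reached the underlying stream.
    static byte[] printedBytes(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream ps = new PrintStream(sink, true, charsetName);
            ps.print(text);
            ps.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    // Format bytes as space-separated two-digit hex, e.g. "c3 89".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00c9"; // E-acute, written as an escape to avoid source-encoding issues

        // Assumed stand-in for a Latin-1 style platform default: one byte.
        System.out.println("ISO-8859-1: " + hex(printedBytes(eAcute, "ISO-8859-1"))); // c9

        // Explicit UTF-8: the two-byte sequence the original poster expected.
        System.out.println("UTF-8:      " + hex(printedBytes(eAcute, "UTF-8")));      // c3 89
    }
}
```

The String itself has no byte encoding; the bytes are chosen by whatever 
finally writes it.  With out.output(doc, System.out), System.out is 
treated as a plain OutputStream, so the encoding set on the Format should 
decide the bytes instead of PrintStream's default.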

-jh-

Chris Curvey wrote:

> Hi all,
> 
> I'm having a little trouble figuring out UTF-8 encoding with JDOM.  The 
> output from this sample program contains a single byte, \xc9, 
> for an E-acute, but according to this page 
> http://www.fileformat.info/info/unicode/char/00c9/index.htm, the UTF-8 
> encoding for E-acute should be the byte pair \xc3 \x89.  (\xc9 appears 
> to be the right value for UTF-16.)
> 
> Any idea what I'm doing wrong?  Or am I just misinterpreting something?
> 
> import org.jdom.Document;
> import org.jdom.Element;
> import org.jdom.output.XMLOutputter;
> import org.jdom.output.Format;
> 
> class JdomTest
> {
>     public static void main (String[] argv)
>     {
>         Document doc = new Document();
>         Element element = new Element("foobar");
>         element.setText("CLOISONNÉ");
>         doc.addContent(element);
> 
>         Format format = Format.getPrettyFormat();
>         format.setEncoding("UTF-8");
>         XMLOutputter out = new XMLOutputter(format);
>         System.out.println(out.outputString(doc));
>     }
> }