[jdom-interest] XMLOutputter and utf-8

Jason Hunter jhunter at xquery.com
Fri May 20 11:27:03 PDT 2005


Well, just be aware that the renderedDoc string there is a String of 
characters, not a byte stream, so you can't inspect it to diagnose how 
the encoding is going.
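
To make the character/byte distinction concrete, here is a minimal 
sketch in plain Java (no JDOM involved).  The class name EncodingBytes 
and the hex() helper are made up for illustration, and StandardCharsets 
assumes a JDK much newer than anything that shipped alongside beta7.  
It encodes a single E-acute under a few charsets and prints the 
resulting bytes:

```java
import java.nio.charset.StandardCharsets;

class EncodingBytes {
    // Hypothetical helper: render a byte array as space-separated hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // E-acute written as an escape so the source file's own encoding is irrelevant.
        String eAcute = "\u00c9";

        // A String only knows about characters; it has no encoding to inspect.
        System.out.println("chars:      " + eAcute.length());

        // Bytes only exist once a charset is applied.
        System.out.println("ISO-8859-1: " + hex(eAcute.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8:      " + hex(eAcute.getBytes(StandardCharsets.UTF_8)));
        System.out.println("UTF-16BE:   " + hex(eAcute.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

The single byte c9 is what ISO-8859-1 produces, the pair c3 89 is the 
UTF-8 form, and UTF-16BE gives 00 c9.  Which of these reaches the wire 
depends on whoever turns the characters into bytes, not on the String.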

The out.output(doc, output) looks like the proper way to send the 
document as UTF-8 bytes.  I don't recall if there were any issues with 
beta7 about this.  Beta7 was long, long ago.  You may also want to 
specify the encoding in the HTTP headers you're sending (for example, 
the charset parameter of Content-Type) so the receiver will know how to 
parse the bytes.
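
One related point, sketched under assumptions: the quoted code passes 
renderedDoc.length() to setupHeaders().  That method isn't shown, but if 
the value ends up as Content-Length it would count characters rather 
than bytes, and the two differ as soon as the document contains 
non-ASCII text.  The plain-Java sketch below (no JDOM; the class 
RequestHeaders, the helper forXmlBody, and the text/xml media type are 
all hypothetical choices for illustration) derives both headers from the 
encoded bytes instead:

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class RequestHeaders {
    // Hypothetical helper: build headers for a body that is already encoded to bytes.
    // The media type "text/xml" is an assumption; use whatever the receiver expects.
    static Map<String, String> forXmlBody(byte[] body, String charsetName) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "text/xml; charset=" + charsetName);
        // Content-Length is a byte count, so take it from the byte array.
        headers.put("Content-Length", String.valueOf(body.length));
        return headers;
    }

    public static void main(String[] args) {
        // Stand-in for the rendered document; E-acute is written as an escape.
        String rendered = "<foobar>CLOISONN\u00c9</foobar>";
        byte[] body = rendered.getBytes(StandardCharsets.UTF_8);

        System.out.println("String.length(): " + rendered.length());
        System.out.println("UTF-8 bytes:     " + body.length);
        System.out.println(forXmlBody(body, "UTF-8"));
    }
}
```

With JDOM itself, one way to get such a byte array would be to call 
out.output(doc, ...) on a java.io.ByteArrayOutputStream, take its 
toByteArray(), and write those same bytes to the connection's stream, so 
the header and the body are guaranteed to agree.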

-jh-

Chris Curvey wrote:

> Thanks to Jason & Paul for their responses.  I tried Jason's suggestion 
> for my example, and it works great.  (And I realize that this question 
> is increasingly off-topic, please forgive me.)
> 
> In my real-world problem, I'm not writing to System.out, I'm writing to 
> an output stream returned from an HttpsURLConnection.  So I tried this:
> 
>     Document doc = getXML();
>     XMLOutputter out = new XMLOutputter();
>     out.setEncoding("UTF-8");
>     String renderedDoc = out.outputString(doc);
> 
>     // Construct the request headers
>     setupHeaders(theConnection, renderedDoc.length());
> 
>     // Send the request
>     OutputStream output = theConnection.getOutputStream();
>     out.output(doc, output);
> 
> I don't have access to the server on the other end of that connection, 
> and the connection is encrypted, so I can't just put in a proxy server 
> to capture the stream to see what's really being sent.
> 
> One more data point, which may or may not be important.  I have to use 
> the Beta-7 version of JDOM, because it's distributed as part of my app 
> server, and putting jdom 1.0 earlier in the classpath causes the app 
> server to choke. 
> 
> Many, many thanks for any help.
> 
> -Chris
> 
> On 5/20/05, Jason Hunter <jhunter at xquery.com> wrote:
> 
>     You're not actually outputting the file to a byte stream.  You're
>     outputting it to a String, then printing the string using
>     System.out.println().  System.out is a PrintStream and per the
>     PrintStream Javadocs, "All characters printed by a PrintStream are
>     converted into bytes using the platform's default character encoding."
> 
>     Try this: out.output(doc, System.out);
> 
>     That way JDOM gets to control the bytes being output.
> 
>     -jh-
> 
>     Chris Curvey wrote:
> 
>      > Hi all,
>      >
>      > I'm having a little trouble figuring out utf-8 encoding with JDom.  The
>      > output from this sample program is returning a single hex value, \xc9
>      > for an E-acute, but according to this page
>      > http://www.fileformat.info/info/unicode/char/00c9/index.htm, the UTF-8
>      > encoding for E-acute should be a hex pair \xc3 and \x89.  (\xc9 appears
>      > to be the right value for UTF-16.)
>      >
>      > Any idea what I'm doing wrong?  Or am I just misinterpreting something?
>      >
>      > import org.jdom.Document;
>      > import org.jdom.Element;
>      > import org.jdom.output.XMLOutputter;
>      > import org.jdom.output.Format;
>      >
>      > class JdomTest
>      > {
>      >     public static void main (String[] argv)
>      >     {
>      >         Document doc = new Document();
>      >         Element element = new Element("foobar");
>      >         element.setText("CLOISONNÉ");
>      >         doc.addContent(element);
>      >
>      >         Format format = Format.getPrettyFormat();
>      >         format.setEncoding("UTF-8");
>      >         XMLOutputter out = new XMLOutputter(format);
>      >         System.out.println(out.outputString(doc));
>      >     }
>      > }
>      >
>      > _______________________________________________
>      > To control your jdom-interest membership:
>      > http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com

