<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:times new roman,new york,times,serif;font-size:12pt"><div>It does seem that it was me being a bit of a dummy!<br><br>I should have been using a ByteArrayOutputStream to get the actual bytes.<br><br>Many thanks for the hint.<br></div><div style="font-family: times new roman,new york,times,serif; font-size: 12pt;"><br><div style="font-family: times new roman,new york,times,serif; font-size: 12pt;"><font size="2" face="Tahoma"><hr size="1"><b><span style="font-weight: bold;">From:</span></b> Michael Kay &lt;mike@saxonica.com&gt;<br><b><span style="font-weight: bold;">To:</span></b> Mike Kyle &lt;m_t_k_nospam@yahoo.co.uk&gt;; jdom-interest@jdom.org<br><b><span style="font-weight: bold;">Sent:</span></b> Wednesday, 22 October, 2008 13:47:42<br><b><span style="font-weight: bold;">Subject:</span></b> RE: [jdom-interest] Format problem?<br></font><br>
<div dir="ltr" align="left"><span class="046324412-22102008"><font size="2" color="#0000ff" face="Arial">I think it's more likely that System.out is not displaying
the Unicode string correctly - generally my experience is that the operating
system console is not capable of handling full Unicode, though it no doubt
depends on the operating system and its configuration.</font></span></div>
<div dir="ltr" align="left"><span class="046324412-22102008"><font size="2" color="#0000ff" face="Arial"></font></span> </div>
<div dir="ltr" align="left"><span class="046324412-22102008"><font size="2" color="#0000ff" face="Arial">I'm not sure why you would expect to see UTF-8 (as distinct
from other representations of Unicode).</font></span></div>
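<div>To make the distinction between characters and their byte encodings concrete, here is a minimal sketch that uses only the Java standard library (no JDOM). The class name EncodingDemo and the helper toHex are illustrative names, not something from this thread. It takes the same two characters as the original post and shows that the byte sequence depends entirely on which encoding is requested.</div>

```java
import java.nio.charset.StandardCharsets;

final class EncodingDemo {

    // Formats bytes as space-separated upper-case hex, e.g. "E4 B8 AD".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() != 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two characters used in the original post.
        String text = "\u4E2D\u6587";

        // A String is a sequence of UTF-16 code units; no byte encoding is chosen yet.
        System.out.println("chars:    " + text.length());

        // Bytes only exist once an encoding is picked, and they differ per encoding.
        System.out.println("UTF-8:    " + toHex(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println("UTF-16BE: " + toHex(text.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

<div>The UTF-8 line should read E4 B8 AD E6 96 87 and the UTF-16BE line 4E 2D 65 87. Printing the hex rather than the text itself sidesteps whatever the console can or cannot render.</div>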
<div dir="ltr" align="left"><span class="046324412-22102008"><font size="2" color="#0000ff" face="Arial"></font></span> </div>
<div dir="ltr" align="left"><span class="046324412-22102008"><font size="2" color="#0000ff" face="Arial">Michael Kay</font></span></div>
<div dir="ltr" align="left"><span class="046324412-22102008"><font size="2" color="#0000ff" face="Arial"><a rel="nofollow" target="_blank" href="http://www.saxonica.com/">http://www.saxonica.com/</a></font></span></div><br>
<blockquote style="border-left: 2px solid rgb(0, 0, 255); padding-left: 5px; margin-left: 5px; margin-right: 0px;">
<div class="OutlookMessageHeader" dir="ltr" align="left" lang="en-us">
<hr tabindex="-1">
<font size="2" face="Tahoma"><b>From:</b> jdom-interest-bounces@jdom.org
[mailto:jdom-interest-bounces@jdom.org] <b>On Behalf Of </b>Mike
Kyle<br><b>Sent:</b> 22 October 2008 11:35<br><b>To:</b>
jdom-interest@jdom.org<br><b>Subject:</b> [jdom-interest] Format
problem?<br></font><br></div>
<div></div>
<div style="font-size: 12pt; font-family: times new roman,new york,times,serif;">
<div>The following code does NOT produce the UTF-8 that I had expected. As far
as I can tell the Text element only seems to work with ASCII text. I would
have expected it to work with non-ASCII text. Or am I doing something
dumb?<br><br>
private void jdomTest() throws IOException<br>
{<br>
&nbsp;&nbsp;&nbsp;&nbsp;Element element = new Element("doc");<br>
&nbsp;&nbsp;&nbsp;&nbsp;element.addContent(new Text("\u4E2D\u6587"));<br>
&nbsp;&nbsp;&nbsp;&nbsp;Document document = new Document(element);<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;StringWriter out = new StringWriter();<br>
&nbsp;&nbsp;&nbsp;&nbsp;Format f = Format.getPrettyFormat();<br>
&nbsp;&nbsp;&nbsp;&nbsp;new XMLOutputter(f).output(document, out);<br>
&nbsp;&nbsp;&nbsp;&nbsp;System.out.println("XML: "+out);<br>
}<br><br></div></div><br></blockquote></div></div></div><br>
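<div>The ByteArrayOutputStream approach mentioned in the reply at the top of this message can be sketched roughly as follows. This is a standard-library-only stand-in, not the poster's actual fix: the class ByteCaptureDemo and the method encodeUtf8 are hypothetical names. With JDOM itself, the equivalent step would presumably be to pass the ByteArrayOutputStream to XMLOutputter.output(Document, OutputStream) instead of passing a StringWriter.</div>

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class ByteCaptureDemo {

    // Writes the text through a UTF-8 encoder and returns the captured raw bytes.
    static byte[] encodeUtf8(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream, but the Writer API declares it.
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and therefore flushed) before the bytes are read.
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-in for the serialized output; same characters as the original post.
        String text = "\u4E2D\u6587";
        byte[] utf8 = encodeUtf8(text);

        // Two characters become six bytes in UTF-8 (three per character here).
        System.out.println("chars: " + text.length() + ", bytes: " + utf8.length);

        // Inspect the bytes directly instead of trusting the console rendering.
        StringBuilder hex = new StringBuilder();
        for (byte b : utf8) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());

        // Decoding with the same charset restores the original text.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(text));
    }
}
```

<div>A StringWriter only ever holds characters, so there is nothing encoded to look at until the output goes through a byte stream like this one.</div>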
</body></html>