[jdom-interest] Kana symbols and UTF-8? (was Re: Kana characters?)
Alan Deikman
Alan.Deikman at znyx.com
Tue May 22 09:43:55 PDT 2007
OK, now I'm a little confused. I guess this is an XML question and not
really a JDOM question, but perhaps someone can explain it.
Angela Amoateng wrote:
>
> This is the code in my XML document (by the way, romaji is romanised
> Japanese):
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <dictionary>
> <word>
> <noun>
> <english>book</english>
> <romaji>hon</romaji>
> <hiraganaSym>ほん</hiraganaSym>
> <hiraganaNum>ほん</hiraganaNum>
> </noun>
Where I get lost is in the <hiraganaSym> tag. Those characters inside
are not part of any 8-bit code (ASCII, UTF-8 or whatever). Java has no
problem with them because all String objects are built on Unicode, but
what does the _encoding="UTF-8"_ mean in the header if these symbols can
show up in the document?
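
One way to check this is a small Java sketch like the one below. The class
name Utf8Demo and the helper utf8Hex are made up for illustration and are not
part of JDOM. It relies on UTF-8 being a variable-length encoding of Unicode
(one to four 8-bit bytes per character) rather than an 8-bit character set,
so the encoding declaration would only say how the bytes in the file map to
characters.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Demo {
    // Hypothetical helper: encode a String as UTF-8 and show the bytes in hex.
    public static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative byte values print as two hex digits.
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The two hiragana from <hiraganaSym>, written as Unicode escapes
        // so the source file's own encoding does not matter.
        String hon = "\u307B\u3093";
        System.out.println(hon.length());  // prints 2 (two UTF-16 code units)
        System.out.println(utf8Hex(hon));  // prints E3 81 BB E3 82 93
    }
}
```

If that is right, each of the two characters is stored as three bytes in a
UTF-8 file, which would explain how they can appear in a document declared
as encoding="UTF-8".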
--
Alan Deikman
ZNYX Networks