[jdom-interest] Fwd: Re: Kana symbols and UTF-8? (was Re: Kana
characters?)
Angela Amoateng
angela.amoateng at kcl.ac.uk
Tue May 22 13:44:26 PDT 2007
----- Forwarded message from angela.amoateng at kcl.ac.uk -----
Date: Tue, 22 May 2007 21:42:42 +0100
From: Angela Amoateng <angela.amoateng at kcl.ac.uk>
Reply-To: Angela Amoateng <angela.amoateng at kcl.ac.uk>
Subject: Re: Kana symbols and UTF-8? (was Re: Kana characters?)
To: Alan Deikman <Alan.Deikman at znyx.com>
Hi Alan,
To clear up the confusion: the symbols in the <hiraganaSym> tags are the
actual characters that the hexadecimal values stand for, stored in the
file as UTF-8. As long as they are real text characters and not an image,
writing them is exactly the same as putting the hexadecimal value in
their place. I tried both the literal symbols (which is what the encoding
produces) and the hexadecimal values, and when I viewed the document in
the browser, both came up as the symbols (check the other message for
the output).
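A minimal sketch of that equivalence, assuming the JDK's built-in DOM
parser rather than JDOM so that it runs without extra jars. The class
name KanaEquivalence and the helper rootText are made up for
illustration, and the kana are written as Java \u escapes only to keep
the source file plain ASCII:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class KanaEquivalence {

    // Hypothetical helper: encodes the XML string as UTF-8 bytes, parses
    // those bytes, and returns the text content of the root element.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        Document doc = builder.parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String header = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";

        // Literal kana characters inside the element.
        String literal = header + "<hiraganaSym>\u307B\u3093</hiraganaSym>";

        // The same two characters as hexadecimal character references.
        String numeric = header + "<hiraganaNum>&#x307B;&#x3093;</hiraganaNum>";

        // Both forms should yield the same two-character string.
        System.out.println(rootText(literal).equals(rootText(numeric)));
    }
}
```

If the parser treats the two forms the same way, this prints true, which
matches what the browser showed.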
I hope that clears things up!
Angela
Quoting Alan Deikman <Alan.Deikman at znyx.com>:
> OK, now I'm a little confused. I guess this is an XML question and
> not really a JDOM question, but perhaps someone can explain it.
>
> Angela Amoateng wrote:
>>
>> This is the code in my XML document (by the way, romaji is romanised
>> Japanese):
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> <dictionary>
>> <word>
>> <noun>
>> <english>book</english>
>> <romaji>hon</romaji>
>> <hiraganaSym>ほん</hiraganaSym>
>> <hiraganaNum>&#x307B;&#x3093;</hiraganaNum>
>> </noun>
>
> Where I get lost is in the <hiraganaSym> tag. Those characters
> inside are not part of any 8-bit code (ASCII, UTF-8 or whatever).
> Java has no problem with it because all String objects are built on
> unicode, but what does the _encoding="UTF-8"_ mean in the header if
> these symbols can show up in the document?
>
> --
> Alan Deikman
> ZNYX Networks
>
>
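On the encoding question quoted above: UTF-8 is not a one-byte-per-
character code. It is a variable-length encoding of Unicode that uses one
byte for ASCII characters and two to four bytes for everything else, so
encoding="UTF-8" in the header tells the parser how to turn the bytes of
the file back into characters. A rough sketch that prints those bytes
(the class name KanaBytes and the helper utf8Hex are hypothetical):

```java
import java.nio.charset.StandardCharsets;

public class KanaBytes {

    // Hypothetical helper: returns the UTF-8 encoding of a string as
    // space-separated, two-digit uppercase hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative byte values format as two digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ASCII letters take one byte each.
        System.out.println(utf8Hex("hon"));          // 68 6F 6E
        // Each hiragana character (U+307B, U+3093) takes three bytes.
        System.out.println(utf8Hex("\u307B\u3093")); // E3 81 BB E3 82 93
    }
}
```

So the two kana in <hiraganaSym> occupy six bytes in the file, and the
parser reassembles them into two Unicode characters before Java ever
sees a String.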
--
Angela Amoateng
angela.amoateng at kcl.ac.uk
----- End forwarded message -----