[jdom-interest] encoding="MS950"
Stuart
stuart at truetel.com
Tue Nov 23 02:41:31 PST 2004
All,
Regarding the encoding problem: I initially thought I might need to install
a Chinese version of Windows (I may still try this at some point). However,
with both jdk1.4 and jdk1.3 I am able to use the MS950 encoding as follows:
String test = "hello"; //hack
byte[] bytes = test.getBytes("MS950"); //hack
If I make up some unknown encoding it will fail (as expected):
String test = "hello"; //hack
byte[] bytes = test.getBytes("dhhfg"); //hack
java.io.UnsupportedEncodingException: dhhfg
at sun.io.Converters.getConverterClass(Converters.java:125)
at sun.io.Converters.newConverter(Converters.java:156)
at sun.io.CharToByteConverter.getConverter(CharToByteConverter.java:64)
at java.lang.StringCoding.encode(StringCoding.java:368)
at java.lang.String.getBytes(String.java:591)
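For reference, the same check can be made without relying on an exception. This is only a minimal sketch, assuming jdk1.4 or later (java.nio.charset does not exist in jdk1.3); the class and method names are made up for illustration. Note that it only reports what the JDK itself supports, and an XML parser may keep its own list of recognised encoding names.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper class, not part of JDOM or the JDK.
class EncodingCheck {

    // Returns true if the running JVM has a charset registered under this name.
    // Syntactically illegal names are reported as unsupported instead of throwing.
    static boolean isEncodingSupported(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The names from this thread; the results depend on the installed JRE.
        String[] names = {"MS950", "x-windows-950", "dhhfg"};
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " -> " + isEncodingSupported(names[i]));
        }
    }
}
```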
What is different about JDOM or the SAXBuilder? The XML (VXML) document I
am testing with is shown below. (Note: I added the encoding attribute
myself; the document was not created using any Chinese input and does not
need the encoding attribute. The 'real' XML documents I am parsing are
much more complicated, but the problem is the same.)
<?xml version="1.0" encoding="MS950"?>
<vxml version="1.0">
<form id="hello">
<block>Hello World!</block>
</form>
</vxml>
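One thing that might be worth trying is to hand the parser a character stream instead of bytes. The SAX InputSource documentation says that when a character stream is supplied, the parser reads it directly and disregards the encoding declaration. The sketch below is untested against the actual failure and uses the JDK's own JAXP parser so that it is self-contained; the class and method names are made up. JDOM's SAXBuilder also has a build(Reader) overload that could presumably be used the same way, with the Reader created using whatever encoding the file is really in.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch class, not part of JDOM.
class ReaderParseSketch {

    // The test document from this message, held as characters rather than bytes.
    static final String VXML =
        "<?xml version=\"1.0\" encoding=\"MS950\"?>\n"
        + "<vxml version=\"1.0\">\n"
        + "<form id=\"hello\">\n"
        + "<block>Hello World!</block>\n"
        + "</form>\n"
        + "</vxml>\n";

    // Parses the XML from a Reader and returns the text of the first <block>.
    // Because the input is already decoded, the parser has no bytes to convert.
    static String parseBlockText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("block").item(0).getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseBlockText(VXML));
    }
}
```

For a file on disk, the Reader would be something like an InputStreamReader constructed with the encoding name the JDK accepts.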
Any help will be much appreciated.
Regards,
Stuart
-----Original Message-----
From: Stuart [mailto:stuart at truetel.com]
Sent: Tuesday, November 23, 2004 12:36 AM
To: jdom-interest at jdom.org
Subject: RE: [jdom-interest] encoding="MS950"
All,
Sorry for the multiple postings but I think I was wrong about MS950 not
being supported in jdk1.4. I also discovered the following entry in the
jdk1.4 information:
x-windows-950 MS950 Windows Traditional Chinese
Not sure what I am doing wrong. *8-(
Regards,
Stuart
-----Original Message-----
From: Stuart [mailto:stuart at truetel.com]
Sent: Monday, November 22, 2004 10:51 PM
To: Elliotte Harold
Cc: jdom-interest at jdom.org
Subject: RE: [jdom-interest] encoding="MS950"
All,
I originally posted a question about the SAXBuilder supporting the encoding
format MS950. I received a reply stating that encoding support is
determined by the JDK (not the parser). I also found that MS950 no longer
appears to be supported under jdk1.4, BUT in jdk1.3 it seems to be supported
(http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html). I
downloaded the international JRE for jdk1.3.1_13 but I still get the
encoding not supported error:
STUART$java -version
java version "1.3.1_13"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_13-b03)
Java HotSpot(TM) Client VM (build 1.3.1_13-b03, mixed mode)
Here is the error I am getting:
org.jdom.input.JDOMParseException: Error on line 0: The encoding "MS950" is not supported.
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:468)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:810)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:789)
...
Do I need to do something in order to 'enable' the international support? I
opened the i18n.jar and inside could see a class called
CharToByteMS950.class.
Also, is there a way of disabling the encoding check (basically just ignore
this field and parse the rest of the document)?
Regards,
Stuart