[jdom-interest] special characters breaking parse??

markus_tripp at sonynetservices.com markus_tripp at sonynetservices.com
Fri Jan 26 00:55:15 PST 2001


Note: The output method of XMLOutputter uses the character encodings
implemented by the Java platform. They vary between different
implementations. For example, the US-only version of Sun's Java 2 SE v 1.3
supports the following encodings:

ASCII
Cp1252 (Windows Latin-1)
ISO8859_1
UTF8
UTF-16
UnicodeBig
UnicodeBigUnmarked
UnicodeLittle
UnicodeLittleUnmarked

For more information see:
http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html

Markus

PS: On my Windows computer (with Java 2 SE v 1.3) the default Java encoding
is "Cp1252".
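
To check what a particular JVM reports, something like the following
sketch may help. The class name EncodingProbe and its two helper methods
are made up for illustration; the only platform calls used are
OutputStreamWriter.getEncoding() and String.getBytes(String). Which of the
names above are accepted depends on the JVM in use.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class, not part of JDOM or the JDK.
class EncodingProbe {

    /** Encoding name a writer picks when none is given (the platform default). */
    static String defaultEncoding() {
        return new OutputStreamWriter(new ByteArrayOutputStream()).getEncoding();
    }

    /** True if this JVM can convert text to bytes in the named encoding. */
    static boolean isSupported(String encoding) {
        try {
            "x".getBytes(encoding);
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("default encoding: " + defaultEncoding());
        String[] names = {
            "ASCII", "Cp1252", "ISO8859_1", "UTF8", "UTF-16",
            "UnicodeBig", "UnicodeBigUnmarked",
            "UnicodeLittle", "UnicodeLittleUnmarked"
        };
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + isSupported(names[i]));
        }
    }
}
```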





Jason Hunter <jhunter at collab.net>@jdom.org on 26.01.2001 05:58:24

Sent by:  jdom-interest-admin at jdom.org


To:   matt at xmlglobal.com
Cc:    jdom-interest at jdom.org
Subject:    Re: [jdom-interest] special characters breaking parse??


It works OK if you specify in the decl:

<?xml version="1.0" encoding="ISO-8859-1"?>

When files look ASCII, I believe the parser defaults to UTF-8 unless an
encoding declaration says otherwise.  See
http://www.w3.org/TR/REC-xml#sec-guessing.
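
A small way to see the effect of the declaration, as a sketch only: it uses
the JAXP DOM parser that ships with current JDKs rather than JDOM's
SAXBuilder, and the class name EncodingDeclDemo and its helpers are made up
for illustration. The same ISO-8859-1 bytes are parsed once without and
once with the declaration. Without it, the lone 0xE4 byte for "ä" is not
valid UTF-8, so the parser should reject the input.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class; JAXP stands in for JDOM's SAXBuilder here.
class EncodingDeclDemo {
    static final String DECL = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>";
    // \u00e4 is "a" with umlaut, written as an escape so the source stays ASCII.
    static final String BODY = "<TITLE>Tannh\u00e4user</TITLE>";

    /** Encodes the text as ISO-8859-1 bytes, one byte per character. */
    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Returns the root element's text, or null if the parser rejects the bytes. */
    static String rootText(byte[] xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml))
                    .getDocumentElement().getTextContent();
        } catch (SAXException | IOException e) {
            // Malformed input: not well-formed XML or not decodable.
            return null;
        }
        catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without declaration: " + rootText(latin1(BODY)));
        System.out.println("with declaration:    " + rootText(latin1(DECL + BODY)));
    }
}
```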

For the record, I saw the same error with DOMBuilder (why are you using
DOMBuilder?).  In SAXBuilder you get a better description:

org.jdom.JDOMException: Error on line 3: An invalid XML character
(Unicode: 0x84) was found in the element content of the document.
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:348)

BTW, make sure you call outputter.setEncoding("ISO-8859-1") on output.
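
The point on output is that the bytes written and the encoding named in
the declaration have to agree. The sketch below shows that idea with the
JDK's own javax.xml.transform serializer and OutputKeys.ENCODING, as a
stand-in for XMLOutputter so that it runs without JDOM on the classpath.
The class name OutputEncodingDemo and its helpers are made up for
illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; the JDK serializer stands in for XMLOutputter.
class OutputEncodingDemo {

    private static DocumentBuilder builder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    /** Builds <TITLE>title</TITLE> and serializes it in the requested encoding. */
    static byte[] serialize(String title, String encoding) {
        try {
            Document doc = builder().newDocument();
            Element el = doc.createElement("TITLE");
            el.setTextContent(title);
            doc.appendChild(el);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Sets both the byte encoding and the name in the XML declaration.
            t.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses serialized bytes back and returns the root element's text. */
    static String parseTitle(byte[] xml) {
        try {
            return builder().parse(new ByteArrayInputStream(xml))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = serialize("Tannh\u00e4user", "ISO-8859-1");
        System.out.println(new String(xml, "ISO-8859-1"));
        System.out.println("round trip: " + parseTitle(xml));
    }
}
```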

-jh-


Matthew MacKenzie wrote:
>
> Hello,
>
> I am parsing an XML file, and when characters with accents and such are
> encountered, the following stack trace is thrown.  I tried changing the
> encoding to UTF-8, but that didn't work.
>
> Has anyone else had this problem?
>
> <stackTrace>
>
> org.jdom.JDOMException: The element type "TITLE" must be terminated by the
> matching end-tag "</TITLE>".: Error on line 180: The element type "TITLE"
> must be terminated by the matching end-tag "</TITLE>".
>         at org.jdom.input.SAXBuilder.build(SAXBuilder.java:315)
>         at org.jdom.input.SAXBuilder.build(SAXBuilder.java:337)
> </stackTrace>
>
> Relevant Data:
>
> 169      <TRACK>
> 170       <TRACKID>41676</TRACKID>
> 171        <TITLE>Tannhäuser / Derivè</TITLE>
> 172        <ALBUM>The Shape Of Punk To Come</ALBUM>
> 173       <ARTIST>Refused</ARTIST>
> 174       <GENRE></GENRE>
> 175
> 176       <FILENAME>Refused-The_Shape_Of_Punk_To_Come-11-Tannhäuser_Derivè.mp3</FILENAME>
> 177        <SIZE>7797864</SIZE>
> 178       <FORMAT>.mp3</FORMAT>
> 179       <QUALITY>128000</QUALITY>
> 180        <CHANNELS>2</CHANNELS>
> 181        <DURATION>489</DURATION>
> 182      </TRACK>
>
> --
> Matthew MacKenzie
>
> _______________________________________________
> To control your jdom-interest membership:
>
> http://lists.denveronline.net/mailman/options/jdom-interest/youraddr@yourhost.com
