[jdom-interest] SaxBuilder.build(url) and encoding
Rodrigo Alvarez
ralvarez at dybox.cl
Thu Dec 12 07:14:46 PST 2002
Hi,
I have now done further testing, and it cannot be the file contents, because
my workaround for the problem is to open the URL, read the stream contents
into a StringBuffer, and then use the SAXBuilder.build(String) method to
parse the XML. This works fine.
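For reference, the workaround could be sketched roughly as below. This is only an illustration under some assumptions: it uses the JAXP DOM parser from the JDK instead of JDOM so that it runs without extra jars, the class and method names (ReadThenParse, readAll, parse) are made up for this example, and the charset is chosen by the caller rather than detected.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JDOM or Xerces.
class ReadThenParse {

    // Reads the whole stream into memory and decodes it with the charset
    // the caller believes is correct (an assumption, nothing is detected).
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), charset);
    }

    // Parses text that is already decoded. Because the parser is handed a
    // character stream, it does not apply the encoding declaration itself.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

With a real URL the input would be something like readAll(url.openStream(), StandardCharsets.ISO_8859_1). In JDOM, as far as I know, the decoded text would go to SAXBuilder.build(Reader) wrapped in a StringReader, since build(String) treats its argument as a system ID (a URI) rather than as XML text.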
I use JDOM with Xerces and Xalan. Does Xerces get the encoding part right?
Does anyone know?
/Rodrigo
At 09:31 12-12-2002 -0500, you wrote:
>At 9:34 PM -0800 12/11/02, Jason Hunter wrote:
>>When you use a URL the underlying parser determines the encoding,
>>typically by looking at the declaration.
>
>Not necessarily. In an HTTP environment, the encoding specified by the
>MIME type takes precedence over the encoding specified by the XML document
>(though not all parsers get this right). If the HTTP header says the
>document is UTF-8 and the encoding declaration says ISO 8859-1, then the
>parser uses UTF-8. I have to double check this, but I also think that if
>the HTTP header says the document is text/xml without any encoding, then
>the parser picks US-ASCII regardless of what the encoding declaration
>says. Again, only some parsers correctly implement the spec here.
>--
>
>+-----------------------+------------------------+-------------------+
>| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
>+-----------------------+------------------------+-------------------+
>| XML in a Nutshell, 2nd Edition (O'Reilly, 2002) |
>| http://www.cafeconleche.org/books/xian2/ |
>| http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/ |
>+----------------------------------+---------------------------------+
>| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
>| Read Cafe con Leche for XML News: http://www.cafeconleche.org/ |
>+----------------------------------+---------------------------------+
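The precedence rule described in the quoted reply can be written down as a small function. This is a sketch, not what Xerces or any other parser actually does: the class and method names are invented, the text/xml default of US-ASCII follows RFC 3023, and the fallback to UTF-8 when nothing is declared is a simplification that ignores byte order marks and UTF-16 detection.

```java
import java.util.Locale;

// Hypothetical illustration of the HTTP-over-declaration precedence rule.
class XmlCharsetRule {

    // contentType: value of the HTTP Content-Type header, or null if absent.
    // declaredEncoding: encoding from the XML declaration, or null if absent.
    static String effectiveCharset(String contentType, String declaredEncoding) {
        if (contentType != null) {
            String[] parts = contentType.split(";");
            String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);

            // An explicit charset parameter in the header wins outright.
            for (int i = 1; i < parts.length; i++) {
                String p = parts[i].trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String v = p.substring("charset=".length()).trim();
                    if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
                        v = v.substring(1, v.length() - 1); // strip quotes
                    }
                    if (!v.isEmpty()) {
                        return v;
                    }
                }
            }

            // text/xml with no charset parameter defaults to US-ASCII.
            if (mediaType.equals("text/xml")) {
                return "US-ASCII";
            }
        }
        // Otherwise use the document's own declaration; the UTF-8 default
        // here is a simplification (no BOM or UTF-16 sniffing).
        return declaredEncoding != null ? declaredEncoding : "UTF-8";
    }
}
```

For the case in the reply, effectiveCharset("text/xml; charset=UTF-8", "ISO-8859-1") yields UTF-8, while the same document served as plain "text/xml" yields US-ASCII.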
Rodrigo Alvarez
DyBOX Consulting and Development
Hernando de Aguirre 906 Providencia.
Santiago, Chile.
(562) 231 7840