[jdom-interest] SaxBuilder.build(url) and encoding
Elliotte Rusty Harold
elharo at metalab.unc.edu
Thu Dec 12 06:31:12 PST 2002
At 9:34 PM -0800 12/11/02, Jason Hunter wrote:
>When you use a URL the underlying parser determines the encoding,
>typically by looking at the declaration.
Not necessarily. In an HTTP environment, the encoding specified by
the MIME type (the charset parameter of the Content-Type header)
takes precedence over the encoding specified by the XML document,
though not all parsers get this right. If the HTTP header says the
document is UTF-8 and the encoding declaration says ISO-8859-1, then
the parser uses UTF-8. I have to double-check this, but I also think
that if the HTTP header says the document is text/xml without any
charset, then the parser picks US-ASCII regardless of what the
encoding declaration says. Again, only some parsers correctly
implement the spec here.
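
To make the precedence concrete, here is a minimal Java sketch of the
rules as described above. It is an illustration, not what any
particular parser does internally. The class name HttpXmlEncoding and
the method encodingFromContentType are hypothetical, and the US-ASCII
default for text/xml is the rule stated above, which still needs to be
double-checked against the spec.

```java
import java.util.Locale;

public class HttpXmlEncoding {

    /**
     * Returns the encoding dictated by an HTTP Content-Type header value,
     * or null when the header does not settle it and the parser should
     * fall back to the XML declaration. Hypothetical helper for illustration.
     */
    public static String encodingFromContentType(String contentType) {
        if (contentType == null) {
            return null; // no header at all: leave detection to the parser
        }
        String[] parts = contentType.split(";");
        String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);

        // An explicit charset parameter overrides the encoding declaration.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                // Drop optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }

        // Assumed rule from the text above: text/xml with no charset
        // means US-ASCII, whatever the declaration says.
        if (mediaType.equals("text/xml")) {
            return "US-ASCII";
        }
        return null; // other media types: defer to the document itself
    }

    public static void main(String[] args) {
        System.out.println(encodingFromContentType("text/xml; charset=UTF-8"));
        System.out.println(encodingFromContentType("text/xml"));
        System.out.println(encodingFromContentType("application/xml"));
    }
}
```

If the underlying parser does not apply these rules itself, one
possible workaround is to open the connection yourself, wrap the
stream in an org.xml.sax.InputSource, call setEncoding() with a
non-null result, and pass that to SAXBuilder.build(InputSource)
instead of using build(url).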