[jdom-interest]   not getting converted to
Per Norrman
pernorrman at telia.com
Fri Apr 23 08:43:20 PDT 2004
Hi,
you're not doing anything wrong. Note that &nbsp; is an entity that
is declared somewhere in the html dtd (here,
http://www.w3.org/TR/html4/sgml/entities.html, for html 4.01). Most
(all?) browsers can handle html character entities even if the
html file doesn't explicitly refer to the dtd.
But you are generating XML, and XML does not recognize &nbsp;. XML 1.0
defines five predefined entities: lt, gt, amp, apos and quot.
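To make the point above concrete, here is a minimal sketch using only the XML parser that ships with the JDK (no JDOM involved). The class name NbspEntityDemo and the helper rootText are made up for this illustration. It shows that a bare &nbsp; is rejected as not well-formed, while the numeric reference &#160; and an entity declared in the internal DTD subset are both accepted.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of JDOM or Xalan.
class NbspEntityDemo {

    /** Returns the root element's text, or null if the input is not well-formed XML. */
    static String rootText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement()
                          .getTextContent();
        } catch (SAXException e) {
            return null; // e.g. a reference to an undeclared entity
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Undeclared entity: the parser refuses the document.
        System.out.println(rootText("<p>a&nbsp;b</p>") == null);
        // Numeric character reference: always allowed.
        System.out.println("a\u00A0b".equals(rootText("<p>a&#160;b</p>")));
        // Entity declared in the internal subset: allowed.
        System.out.println("a\u00A0b".equals(
            rootText("<!DOCTYPE p [<!ENTITY nbsp '&#160;'>]><p>a&nbsp;b</p>")));
    }
}
```

In all three cases the parsed text, when there is one, contains the single character U+00A0; only the spelling in the serialized file differs.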
You can extend XMLOutputter and override escapeElementEntities to
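produce &nbsp;, but then you must also declare this entity in a DTD
(for example, in the DOCTYPE's internal subset), or XML parsers will
reject the output.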
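As a rough sketch of what the body of such an override might do, the stand-alone helper below escapes the usual markup characters and writes U+00A0 as &nbsp;. It deliberately avoids the JDOM dependency: the class NbspEscaper, its escape method and the DOCTYPE constant are invented for this example, and hooking the logic into an XMLOutputter subclass is left as an assumption rather than shown.

```java
// Hypothetical helper; stands in for the logic of an overridden
// escapeElementEntities, but does not use JDOM itself.
class NbspEscaper {

    /** Internal DTD subset that declares nbsp, so XML parsers accept the output. */
    static final String DOCTYPE = "<!DOCTYPE html [<!ENTITY nbsp \"&#160;\">]>";

    /** Escapes markup characters and writes U+00A0 as the named entity. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':      out.append("&lt;");   break;
                case '>':      out.append("&gt;");   break;
                case '&':      out.append("&amp;");  break;
                case '\u00A0': out.append("&nbsp;"); break;
                default:       out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The output is pure ASCII, so it displays the same in any console.
        System.out.println(DOCTYPE);
        System.out.println("<html><body>"
            + escape("This\u00A0is\u00A0some\u00A0text")
            + "</body></html>");
    }
}
```

Without the DOCTYPE line, the result would still display in a browser but would no longer be well-formed XML.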
/pmn
Robert Taylor wrote:
> Thanks for the reply Jason,
>
> I must still be doing something wrong.
>
> Here is the relevant snippet of my code:
>
> SAXBuilder builder = new SAXBuilder();
> Document doc = builder.build(docname);
>
> XSLTransformer transformer = new XSLTransformer(sheetname);
> Document doc2 = transformer.transform(doc);
> XMLOutputter outp = new XMLOutputter();
> outp.output(doc2, System.out);
>
> XML:
> <?xml version="1.0"?>
> <data>123456</data>
>
> XSL:
> <?xml version="1.0" ?>
> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
> <xsl:output method="html" indent="yes" encoding="ASCII"/>
>
> <xsl:template match="/">
> <html>
> <head></head>
> <body>
> <xsl:apply-templates/>
> </body>
> </html>
> </xsl:template>
>
> <xsl:template match="data">
> This   is   
> some   text    <xsl:value-of select="."/>
> </xsl:template>
>
> </xsl:stylesheet>
>
> JDOM output in the Windows DOS console with encoding="ASCII":
> <?xml version="1.0" encoding="UTF-8"?>
> <html><head /><body>
> This?á?á?áis?á?á?ásome?á?á?átext?á?á?á 123456</body></html>
>
> Xalan output in the Windows DOS console with encoding="ASCII":
> <html>
> <head>
> <META http-equiv="Content-Type" content="text/html; charset=ASCII">
> </head>
> <body>
> This is some text 123456</body>
> </html>
>
> If I change the code such that I manually set the Format of the XMLOutputter
> (XMLOutputter seems to ignore any formatting information in the XSL document):
> Format format = Format.getPrettyFormat();
> format.setEncoding("ASCII");
> XMLOutputter outp = new XMLOutputter(format);
>
> JDOM output:
> <html>
> <head />
> <body>This   is   some   text    123456</body>
> </html>
>
>
> So the question is, how do I set up the JDOM XMLOutputter to
> convert the   such that when I view them in the Windows DOS console
> they are rendered as characters (like Xalan does)?
>
>
> robert
>
>
>
>
>>-----Original Message-----
>>From: jdom-interest-admin at jdom.org
>>[mailto:jdom-interest-admin at jdom.org]On Behalf Of Jason Hunter
>>Sent: Thursday, April 22, 2004 5:18 PM
>>To: Robert Taylor
>>Cc: jdom-interest at jdom.org
>>Subject: Re: [jdom-interest]   not getting converted to
>>
>>
>>The output you see contains the direct UTF-8 character for a
>>non-breaking space. It shows up like a funny character because the
>>environment in which you're viewing the file probably isn't UTF-8 aware.
>> Semantically though the files are identical. The JDOM one uses one
>>char where the others use six. If you want ASCII encoding, set the
>>outputter to use ASCII. It'll then automatically encode chars that
>>can't be represented within ASCII. You can also just set an escape
>>strategy on the outputter directly if you want UTF-8 but want to encode
>>characters that wouldn't ordinarily need to be encoded.
>>
>>-jh-
>>
>>Robert Taylor wrote:
>>
>>
>>>Greetings, I'm using JDOMBeta10 and am trying to transform an XML document into an HTML document.
>>>I've chosen Xalan-Java v2.6.0 for transformation and have set the system property
>>>javax.xml.transform.TransformerFactory with org.apache.xalan.processor.TransformerFactoryImpl as
>>>discussed here:
>>>
>>>http://www.jdom.org/docs/apidocs/org/jdom/transform/XSLTransformer.html
>>>
>>>based on this documentation:
>>>
>>>http://www.dpawson.co.uk/xsl/sect2/nbsp.html#d6353e246
>>>
>>>it appears that there is an encoding issue.
>>>
>>>I can use the same xml document and style sheet with "pure" Xalan classes
>>>and the document is transformed as expected.
>>>
>>>XML:
>>><?xml version="1.0"?>
>>><data>123456</data>
>>>
>>>XSL:
>>>
>>><?xml version="1.0" ?>
>>><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
>>> <xsl:output method="html" indent="yes"/>
>>>
>>><xsl:template match="/">
>>><html>
>>><head></head>
>>><body>
>>><xsl:apply-templates/>
>>></body>
>>></html>
>>></xsl:template>
>>>
>>><xsl:template match="data">
>>>This is some text  <xsl:value-of select="."/>
>>></xsl:template>
>>>
>>></xsl:stylesheet>
>>>
>>>
>>>JDOM output:
>>><?xml version="1.0" encoding="UTF-8"?>
>>><html><head /><body>
>>>This is some text 123456</body></html>
>>>
>>>Xalan output:
>>><html>
>>><head>
>>><META http-equiv="Content-Type" content="text/html; charset=UTF-8">
>>></head>
>>><body>
>>>This is some text 123456</body>
>>></html>
>>>
>>>Any ideas?
>>>
>>>robert
>>>
>>>_______________________________________________
>>>To control your jdom-interest membership:
>>>http://lists.denveronline.net/mailman/options/jdom-interest/youraddr@yourhost.com
>>>