[jdom-interest] JDOMSource problems with Xalan 2.5.2: vanishing
DOCTYPE white spaces
Laurent Bihanic
laurent.bihanic at atosorigin.com
Wed May 5 06:31:09 PDT 2004
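Indeed, this is a bug in SAXOutputter: when a DocType has both a public ID
and a system ID, the two quoted identifiers are written with no space
between them, which is what the parser complains about.
To fix it, look for the method dtdEvents(Document document). In the following
code: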
    // No internal subset defined => Try to parse original DTD
    if ((publicID != null) || (systemID != null)) {
        if (publicID != null) {
            buf.append(" PUBLIC ");
            buf.append('\"').append(publicID).append('\"');
        }
        else {
            buf.append(" SYSTEM ");
        }
        buf.append('\"').append(systemID).append('\"');
    }
    else {
        // Doctype is totally empty! => Skip parsing
        buf.setLength(0);
    }
Replace the lines:
            buf.append(" SYSTEM ");
        }
        buf.append('\"').append(systemID).append('\"');
by:
            buf.append(" SYSTEM");
        }
        buf.append(" \"").append(systemID).append('\"');
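To see the effect of the change without rebuilding JDOM, here is a minimal standalone sketch. The class DoctypeSpacing, the method externalId, the patched flag and the /tmp system ID are made up for illustration and are not part of JDOM; only the append calls are taken from the fragment above.

```java
// Standalone sketch; DoctypeSpacing and externalId are hypothetical names,
// not part of JDOM. The body mirrors the quoted fragment of dtdEvents.
class DoctypeSpacing {

    // Builds the external-ID part of a DOCTYPE declaration.
    // patched == false reproduces the original appends, true the corrected ones.
    static String externalId(String publicID, String systemID, boolean patched) {
        StringBuffer buf = new StringBuffer();
        if ((publicID != null) || (systemID != null)) {
            if (publicID != null) {
                buf.append(" PUBLIC ");
                buf.append('\"').append(publicID).append('\"');
            }
            else {
                buf.append(patched ? " SYSTEM" : " SYSTEM ");
            }
            // The patched version always puts a space before the system ID.
            buf.append(patched ? " \"" : "\"").append(systemID).append('\"');
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        String publicID = "-//TSN//DTD Leader 1.0/EN";
        String systemID = "file:///tmp/example.dtd"; // made-up path for illustration
        System.out.println("original:" + externalId(publicID, systemID, false));
        System.out.println("patched: " + externalId(publicID, systemID, true));
    }
}
```

With both identifiers set, the original appends leave the two quoted strings touching, while the patched ones separate them with a single space. With only a system ID, both variants give the same output.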
Hope this helps,
Laurent
Gary Lawrence Murphy wrote:
> I have a legacy application that worked fine with jdom-b8 and
> Xalan 2.0, but for other reasons we have to upgrade to Xalan 2.5.2
> which we've tried with both the XercesImpl shipped with it and
> with Xerces 2.6.0 with the same results:
>
> 1) First, we create the jdom.Document dom ...
>
> SAXBuilder builder = new SAXBuilder(getSaxClass(), false);
>
> builder.setFeature(
> "http://apache.org/xml/features/nonvalidating/load-external-dtd",
> false);
>
> dom = builder.build(new StringReader(doc.toString()));
>
> 2) Then create the JDOMSource and transformer
>
> JDOMSource jds = new JDOMSource(dom);
>
> TransformerFactory tfact = TransformerFactory.newInstance();
> // set tfact optimize on and incremental off
>
> Transformer xslt = tfact.newTransformer(new StreamSource(xsl));
> // xsl is a File object to our transform
>
> 3) I then apply the transform ...
>
> JDOMResult newdom = new JDOMResult();
> xslt.transform( jds, newdom);
>
> and get an untrappable error on stderr:
>
> [Fatal Error] :1:53: White spaces are required between publicId and systemId.
>
> "1:53" does indeed refer to the space between the public and system
> identifiers in the input files, but those identifiers are clearly
> separated by a space, a real ASCII space.
>
> when I check the input documents ...
>
> DocType dtd = dom.getDocType();
>
> cat.debug( "Process " + doc.getTag()
> + "\n PublicID = " + dtd.getPublicID()
> + "\n SystemID = " + dtd.getSystemID());
>
> cat.debug( dom.toString());
>
> I get data that looks just fine ...
>
> 2004-01-16 15:19:07,396 [Thread-2] DEBUG
> ca.cbc.sportwire.dochandler.ToNewsMLFilter -
> Process AutoRacingDriverProfile.dtd #1167699
> PublicID = -//TSN//DTD Leader 1.0/EN
> SystemID = file:///home/ticker/ticker/fantasysports/tsn/AutoRacingDriverProfile.dtd
>
> 2004-01-16 15:19:07,396 [Thread-2] DEBUG
> ca.cbc.sportwire.dochandler.ToNewsMLFilter -
> [Document: [DocType: <!DOCTYPE message PUBLIC "-//TSN//DTD Leader 1.0/EN" "file:///home/ticker/ticker/fantasysports/tsn/AutoRacingDriverProfile.dtd">], Root is [Element: <message/>]]
>
> This suggests a problem somewhere between the jdom.Document and the
> Transformer, as if the JDOMSource is somehow trimming this space.
>
> This code worked fine with the older Xalan and I have upgraded the
> following jars to match the Xalan release:
>
> xalan-2.5.2.jar
> xercesImpl-2.6.0.jar
> xml-apis-2.6.0.jar
>
> and the associated JDOM jar files ...
>
> 28404 Jan 15 16:35 jaxp.jar
> 160967 Jan 15 15:14 jaxen-core.jar
> 5949 Jan 15 15:14 jaxen-jdom.jar
> 135363 Jan 15 15:14 jdom.jar
> 23563 Jan 15 15:14 saxpath.jar
>
> I am using Linux 2.4.21-0.13mdk i686 and java version "1.4.1_03"
>
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_03-b02)
> Java HotSpot(TM) Client VM (build 1.4.1_03-b02, mixed mode)
>
> I've verified the same results using 1.4.2 both on the Sun and the
> Blackdown ports.
>
> What's worse, while we get this "Fatal Error" on every document, it
> does not appear to be fatal at all; processing continues on after the
> transform, and the transformed doc is correctly generated!
>
> Any and all insights, ideas or probable cause theories are most
> welcome.