[jdom-interest] possible bug handling JTidy outputted Document
Jon Garfunkel
yochanan at coopdata.net
Mon Oct 1 12:14:06 PDT 2001
I've joined this list to submit a note about this buggy behavior. I have
not done an exhaustive search to see whether this has been reported
before, though I did find that interoperability with JTidy is on the TODO
list.
Here's some incompatible behavior between JDOM-beta7 and JTidy 04aug2000r7-dev.
/***/
FileInputStream input = new FileInputStream(HTMLfile);
FileOutputStream output = new FileOutputStream(XHTMLfile);
Tidy tidy = new Tidy();
tidy.setXHTML(true);
org.w3c.dom.Document myDOM = tidy.parseDOM(input,output);
// this fails -- see explanation
org.jdom.Document myDoc = new DOMBuilder().build(myDOM);
// at best we can use a SAXBuilder or DOMBuilder to build from the file
SAXBuilder db = new SAXBuilder();
org.jdom.Document doc = db.build(XHTMLfile);
/***/
The exception raised in JDOM is an IllegalAddException when adding the XHTML
namespace. Apparently JTidy (1) sets this namespace [in
org.w3c.tidy.Lexer.setXHTMLDocType()], and (2) furthermore gives it as an
xmlns attribute to the html element. When JDOM finds the xmlns attribute,
it attempts to add it [using org.jdom.Element.addNamespaceDeclaration()],
but finds that it collides with the existing namespace added at (1).
My guess is that JTidy produces a legal DOM, but I don't know DOM super
well.
A brute workaround is simply to comment out the parts of
addNamespaceDeclaration() that do the checking-- I do not have any
documents which may fail here, because any XML documents I plan to work
with that don't come from JTidy will come from my XML editor.
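A less invasive option than patching JDOM might be to strip the redundant
xmlns attribute from the W3C DOM before handing it to JDOM. The sketch below
is only an illustration under assumptions: the class XmlnsStripper and the
method stripDefaultXmlns() are hypothetical names, it uses only the JDK's DOM
classes, and it has not been tested against JTidy's DOM implementation or
JDOM-beta7.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not part of JDOM or JTidy.
public class XmlnsStripper {

    /**
     * Removes the plain "xmlns" attribute from the given element and from
     * all of its descendant elements. Returns how many were removed.
     * Uses only DOM Level 1 calls (getAttributeNode, removeAttribute).
     */
    public static int stripDefaultXmlns(Element element) {
        int removed = 0;
        if (element.getAttributeNode("xmlns") != null) {
            element.removeAttribute("xmlns");
            removed++;
        }
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                removed += stripDefaultXmlns((Element) child);
            }
        }
        return removed;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the DOM that JTidy would return from parseDOM().
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element html = dom.createElement("html");
        html.setAttribute("xmlns", "http://www.w3.org/1999/xhtml");
        dom.appendChild(html);
        html.appendChild(dom.createElement("body"));

        int removed = stripDefaultXmlns(html);
        System.out.println("xmlns attributes removed: " + removed);
    }
}
```

With JTidy, the call would presumably be
stripDefaultXmlns(myDOM.getDocumentElement()) just before the JDOM
conversion. One caveat: if the DOM nodes carry no namespace URI of their own,
the resulting JDOM elements may end up in no namespace at all, so the XHTML
namespace might need to be reapplied afterwards.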
Keep up the good work with JDOM. I am using it in many places for my
software and look forward to introducing my work to you.
Jon