[jdom-interest] possible bug handling JTidy outputted Document

Jon Garfunkel yochanan at coopdata.net
Mon Oct 1 12:14:06 PDT 2001


I've joined this list to submit a note about this buggy behavior. I have
not done an exhaustive search to see whether this has been reported
before, though I did find that interoperability with JTidy is on the TODO
list.

Here's some incompatible behavior between JDOM-beta7 and JTidy 04aug2000r7-dev.

/***/
FileInputStream input = new FileInputStream(HTMLfile);
FileOutputStream output = new FileOutputStream(XHTMLfile);
Tidy tidy = new Tidy();
tidy.setXHTML(true);
org.w3c.dom.Document myDOM = tidy.parseDOM(input,output);

// this fails -- see explanation
org.jdom.Document myDoc = new DOMBuilder().build(myDOM);

// at best we can use a SAXBuilder or DOMBuilder to build from the file
SAXBuilder sb = new SAXBuilder();
org.jdom.Document doc = sb.build(XHTMLfile);
/***/

The exception raised in JDOM is an IllegalAddException when adding the XHTML
namespace. Apparently JTidy (1) sets this namespace [in
org.w3c.tidy.Lexer.setXHTMLDocType()], and (2) furthermore gives it as an
xmlns attribute to the html element.  When JDOM finds the xmlns attribute,
it attempts to add it [using org.jdom.Element.addNamespaceDeclaration()],
but finds that it collides with the existing namespace added at (1).

My guess is that JTidy produces a legal DOM, but I don't know DOM super
well.

A brute workaround is simply to comment out the parts of
addNamespaceDeclaration() that do the checking-- I do not have any
documents which may fail here, because any XML documents I plan to work
with that don't come from JTidy will come from my XML editor.
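
A less invasive workaround than patching JDOM might be to drop the xmlns
attribute from the root of the W3C DOM before handing it to JDOM. The sketch
below is untested against JDOM-beta7 and JTidy. It uses only DOM Level 1
calls from org.w3c.dom, and the class name StripDefaultXmlns and its helper
sampleDom() are made up for illustration. It assumes JTidy's DOM
implementation supports getAttributeNode() and removeAttribute(), and that
removing the attribute is enough to avoid the collision. It may also leave
the resulting JDOM elements without the XHTML namespace.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class StripDefaultXmlns {

    /**
     * Removes a plain "xmlns" attribute from the root element, if present.
     * Returns true when an attribute was actually removed.
     */
    public static boolean stripDefaultXmlns(Document dom) {
        Element root = dom.getDocumentElement();
        if (root != null && root.getAttributeNode("xmlns") != null) {
            root.removeAttribute("xmlns");
            return true;
        }
        return false;
    }

    /** Hypothetical stand-in for the JTidy output: an html root with xmlns set. */
    public static Document sampleDom() throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element html = dom.createElement("html");
        html.setAttribute("xmlns", "http://www.w3.org/1999/xhtml");
        dom.appendChild(html);
        return dom;
    }

    public static void main(String[] args) throws Exception {
        Document dom = sampleDom();
        System.out.println("removed: " + stripDefaultXmlns(dom));
        System.out.println("xmlns attribute node: "
                + dom.getDocumentElement().getAttributeNode("xmlns"));
    }
}
```

In the snippet from the start of this note, the call would go between
tidy.parseDOM(input, output) and the conversion to an org.jdom.Document.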

Keep up the good work with JDOM. I am using it in many places for my
software and look forward to introducing my work to you.

Jon





