[jdom-interest] Content missing after conversion from W3C Element to JDOM2 Element
Larsen
larsen007 at web.de
Thu Nov 8 01:20:52 PST 2012
Hi Rolf,
first of all, thanks for your extensive help!
> The Java API documentation is a mess in this area.... JDK 1.5 package
> information indicates that the org.w3c.dom API supports DOM Level 2:
> http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/package-summary.html
That's nice to hear. I was already wondering whether my English is too bad
or whether the Javadoc is so crudely written that I can't understand it.
> What would be useful is if you could determine the library that you are
> using. Since you have already 'hacked' the code, why don't you
> temporarily add the line: System.out.println(text.getClass()); to the
> method. This will tell you the concrete implementation of DOM that's
> broken.
It's "org.w3c.tidy.DOMTextImpl". I use JTidy to turn HTML code that I obtain
from a customer's database into Java objects.
So, should I file a bug against JTidy?
My code in that area in case it helps:
    private org.w3c.dom.Document getDocFromTidy(String html) {
        Tidy tidy = new Tidy();
        tidy.setShowWarnings(false);
        tidy.setQuiet(true);
        tidy.setXHTML(true);
        tidy.setDocType("omit");

        // convert text representation to Document
        InputStream bais = new ByteArrayInputStream(html.getBytes());
        try {
            bais.close();
        } catch (IOException e) {
            log.error("Exception on closing the InputStream", e);
        }
        return tidy.parseDOM(bais, null);
    }
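
For reference, here is a minimal sketch of the check Rolf suggested, written so that it works on any org.w3c.dom.Document rather than only inside the patched method. The class name DomImplProbe and both helper methods are hypothetical names made up for this illustration. The sketch parses with the JDK's built-in parser only so that it runs without JTidy on the classpath; passing in the Document returned by getDocFromTidy should report the "org.w3c.tidy.DOMTextImpl" class mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of JDOM or JTidy.
public class DomImplProbe {

    // Walks the tree depth-first and returns the concrete class name of the
    // first text node it finds, or null if the tree contains no text node.
    static String firstTextNodeClass(Node node) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            return node.getClass().getName();
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            String found = firstTextNodeClass(child);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    // Stand-in for getDocFromTidy: parses with whatever DOM implementation
    // the JDK provides, so the sketch has no external dependencies.
    static Document parseWithJdk(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseWithJdk("<p>hello</p>");
        // Prints the implementation class of the text node; the exact name
        // depends on which DOM implementation produced the document.
        System.out.println(firstTextNodeClass(doc));
    }
}
```

The walk uses only getNodeType, getFirstChild and getNextSibling, which are basic org.w3c.dom.Node methods, so it should not depend on the newer DOM methods that are in question here.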
Lars