[jdom-interest] don't validate comments
Christian Peter
cpeter at rostock.igd.fhg.de
Fri Dec 6 00:51:03 PST 2002
Okay, you've convinced me. Since I use NekoHTML as my HTML scanner
(which also does tag balancing), I think I will set up a filter
that removes all comments, and then it should be fine.
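
A minimal sketch of the "drop all comments" idea, assuming a plain SAX pipeline rather than the NekoHTML filter chain itself (which is not shown in this thread). It uses only JDK classes; the names CommentDroppingHandler, CommentFilterDemo and countForwardedComments are hypothetical and exist only for illustration. The wrapper forwards every lexical event except comment(), so a downstream builder never sees comment text at all.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;
import org.xml.sax.ext.LexicalHandler;

/** Hypothetical wrapper: forwards every lexical event except comments. */
class CommentDroppingHandler implements LexicalHandler {
    private final LexicalHandler downstream;

    CommentDroppingHandler(LexicalHandler downstream) {
        this.downstream = downstream;
    }

    @Override
    public void comment(char[] ch, int start, int length) {
        // Intentionally empty: comment text never reaches the downstream handler.
    }

    @Override
    public void startDTD(String name, String publicId, String systemId) throws SAXException {
        downstream.startDTD(name, publicId, systemId);
    }

    @Override
    public void endDTD() throws SAXException {
        downstream.endDTD();
    }

    @Override
    public void startEntity(String name) throws SAXException {
        downstream.startEntity(name);
    }

    @Override
    public void endEntity(String name) throws SAXException {
        downstream.endEntity(name);
    }

    @Override
    public void startCDATA() throws SAXException {
        downstream.startCDATA();
    }

    @Override
    public void endCDATA() throws SAXException {
        downstream.endCDATA();
    }
}

/** Hypothetical demo: counts the comments that reach the downstream handler. */
class CommentFilterDemo {
    static int countForwardedComments(String xml, boolean dropComments) {
        final int[] seen = {0};
        // Stand-in for a document builder; it only counts comment events.
        DefaultHandler2 counter = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                seen[0]++;
            }
        };
        LexicalHandler lexical = dropComments ? new CommentDroppingHandler(counter) : counter;
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(counter);
            // Standard SAX2 property for registering a LexicalHandler.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", lexical);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return seen[0];
    }
}
```

Calling CommentFilterDemo.countForwardedComments(xml, true) on a small document with one comment should report that no comments were forwarded. Note that a conforming XML parser rejects "--" inside a comment before any handler is called, so a filter like this only helps behind a lenient scanner such as NekoHTML, where the equivalent step would sit in the scanner's own filter chain.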
Thanks for your answers,
Christian
--
Bradley S. Huffman wrote:
> "Christian Peter" writes:
>
>
>>I get an org.jdom.IllegalDataException telling me that "Comments
>>cannot contain double hyphens (--)", which doubtlessly is true
>>(if you are interested, http://www.nasa.org causes the exception).
>>
>>However, I need to parse this document, and since I'm not interested
>>in the comments, I would like JDOM to simply ignore the content of a
>>comment. I thought I could achieve this by setting
>>DOMBuilder.setValidation to false, but I still get this exception.
>>
>>Now I wonder: is it just me, and I have to set another option on
>>another component? Is it a bug? Or do I have to change the
>>org.jdom.Verifier class myself? Or is it already done in CVS? I use
>>the 0.7 beta-8 version.
>
>
> Since DOM is supposed to represent well-formed XML, DOMBuilder shouldn't be
> throwing any type of well-formedness exception. And if that's a typo and
> SAXBuilder is really what's being used, then the SAX parser shouldn't be
> reporting "--" through SAX's LexicalHandler.comment. Hopefully SAX is
> throwing a SAXException and JDOM's SAXBuilder is just wrapping an
> IllegalDataException around it.
>
> Either way JDOM should *never* see a string with "--" in it as the text of
> a comment from a SAX or DOM source, so there is really nothing JDOM can do.
>
> Brad