[jdom-interest] Determining malformed chars during conversion

Sat May 4 14:28:56 PDT 2002

Hey guys,

I was wondering if anyone could give me a tip on how to determine which
char(s) are causing SAXBuilder to puke while reading a document. I have
been looking through the archives about "Malformed UTF-8" chars, but I do
not believe I have any special characters that would trigger the exception.
Here is the error I am receiving:

org.jdom.JDOMException: Error on line 1425 of document
file:/C:/cygwin/home/Dana/vulscan/xmlPrepPlugins/vsAuditPlugins.xml:
Character conversion error: "Malformed UTF-8 char -- is an XML encoding
declaration missing?" (line number may be too low).

Tracing this down, the error shows up when I call doc.getRootElement().
I have looked at the "offending" line in the XML file, and it seems fine to
me. This is an exact cut-and-paste of the line:

<Name>OpenSSH Channel Code Off by 1</Name>

What I find weird is that if I structure the XML file differently (i.e.,
nuke 50 lines before and after the offending line), it still fails on the
same line.

More interesting is that if I do something weird like move all the
elements and treat them as attributes, I get the same error, but on
DIFFERENT lines. I am unsure just how to determine which char(s) are
causing the exception.

Is there any way I can get more information on what is causing the
exception? Can I find out the actual offending character? Can anyone give me
a debugging tip on how to track this down, or point me to some documentation
which I may have missed? I am hoping I am doing something stupid that you
guys have encountered before, so I don't have to check out the code from
CVS and track down the offending chars by altering JDOM to spew forth this
information.
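
To show the kind of check I mean, here is a rough sketch that sidesteps
JDOM entirely: it decodes the raw bytes of the file strictly as UTF-8 and
reports where the first malformed sequence starts. The class name Utf8Scan
and its methods are made up for illustration (they are not part of JDOM or
any parser), and the sketch assumes a JDK that has
java.nio.charset.StandardCharsets. It also assumes the file really is meant
to be UTF-8, which is what a parser falls back to when there is no encoding
declaration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of JDOM: finds the first byte sequence
// in a file that is not valid UTF-8.
class Utf8Scan {

    /** Returns the byte offset of the first malformed sequence, or -1 if none. */
    static long firstMalformedOffset(byte[] data) {
        // REPORT makes the decoder stop at bad input instead of replacing it.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(4096);
        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                // On error the input position sits at the start of the bad bytes.
                return in.position();
            }
            if (result.isUnderflow()) {
                return -1; // all input consumed without complaint
            }
            out.clear(); // overflow: we only care about validity, drop the chars
        }
    }

    /** Returns the 1-based line number containing the given byte offset. */
    static int lineOf(byte[] data, long offset) {
        int line = 1;
        for (int i = 0; i < offset && i < data.length; i++) {
            if (data[i] == '\n') {
                line++;
            }
        }
        return line;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        long offset = firstMalformedOffset(data);
        if (offset < 0) {
            System.out.println("No malformed UTF-8 found");
        } else {
            System.out.printf("Malformed UTF-8 at byte offset %d (line %d), byte 0x%02X%n",
                    offset, lineOf(data, offset), data[(int) offset] & 0xFF);
        }
    }
}
```

Once compiled, it would be run as something like
"java Utf8Scan vsAuditPlugins.xml". Since it counts newlines in the raw
bytes itself, the line it reports does not depend on the parser's line
number, which the error message already warns "may be too low".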

I'd appreciate any help or tips you could pass on.

---
Regards,
Dana M. Epp