[jdom-interest] DOCTYPE still giving me the worst headache!
Elliotte Rusty Harold
elharo at metalab.unc.edu
Wed Jan 30 04:49:49 PST 2002
At 7:33 PM -0600 1/29/02, Jason Long wrote:
>org.jdom.JDOMException: Error on line 2 of document
>file:/G:/www.che.com/companylistings/10.html: White space is required
>between the public identifier and the system identifier.
>
>This is the line JDOM is complaining about (I am using Beta 7).
>
><!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
>
Your DOCTYPE declaration has a public ID but no system ID. That is
malformed according to the XML 1.0 spec, which requires a system
literal after the public ID.
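To see the rule in isolation, here is a minimal sketch that uses the
standard JAXP parser rather than JDOM. The class name DoctypeCheck and
the helper isWellFormed are made up for illustration. The entity
resolver returns an empty DTD so that nothing is fetched over the
network; that is an assumption of the sketch, not something a real
application would necessarily want.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of JDOM: reports whether a document parses.
public class DoctypeCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Assumption: substitute an empty external DTD so no network access occurs.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            // DefaultHandler rethrows fatal errors and stays silent otherwise.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors surface as SAXException (SAXParseException).
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String publicOnly =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"><html/>";
        String publicAndSystem =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" "
            + "\"http://www.w3.org/TR/html4/loose.dtd\"><html/>";
        System.out.println("public ID only:       " + isWellFormed(publicOnly));
        System.out.println("public and system ID: " + isWellFormed(publicAndSystem));
    }
}
```

The first document should be rejected and the second accepted, since
the only difference between them is the system literal.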
>JDOM had no problem writing this to disk. I cannot understand why it cannot
>read it back again. I would appreciate any help with this matter. I have
>posted this problem before and seen others' postings as well, but I have yet
>to see anything that will help me. In the past I used a regex to strip this
>out of the file before sending to JDOM, but I do not like this approach at
>all.
>
If JDOM allows you to create documents like this, then that's a bug
that needs to be fixed.
A quick look at the code of DocType shows that there's no test
whether the public or system ID is null. This should probably be
fixed.
More importantly, there's a design flaw in this class. The empty
string is a legal public ID. However, we don't distinguish between
the empty string as a public ID and no public ID at all. We should
probably allow null public IDs to indicate that there is only a
system ID. That is, change
    public DocType(String elementName, String systemID) {
        this(elementName, "", systemID);
    }

to

    public DocType(String elementName, String systemID) {
        this(elementName, null, systemID);
    }

The system ID can also legally be the empty string, so we also need to change
    public DocType(String elementName) {
        this(elementName, "", "");
    }

to

    public DocType(String elementName) {
        this(elementName, null, null);
    }

The next step is to change the XMLOutputter and SAXBuilder and
DOMBuilder logic to use null to indicate no public ID and no system
ID instead of the empty string. More on that shortly.
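As a rough sketch of where this leads, here is a standalone class
that treats null as "absent" and the empty string as a real value,
and serializes the declaration accordingly. DocTypeSketch and
toDeclaration are hypothetical names for illustration only; this is
not JDOM's DocType or XMLOutputter code. The check in the constructor
is one possible reading of the missing null test mentioned above. The
sketch also assumes the IDs contain no double quote characters.

```java
// Hypothetical stand-in for illustration; not the actual org.jdom.DocType.
public class DocTypeSketch {
    private final String elementName;
    private final String publicID; // null = no public ID; "" = empty public ID
    private final String systemID; // null = no system ID; "" = empty system ID

    public DocTypeSketch(String elementName, String publicID, String systemID) {
        if (elementName == null) {
            throw new IllegalArgumentException("element name is required");
        }
        // XML 1.0 has no form with a public ID but no system ID.
        if (publicID != null && systemID == null) {
            throw new IllegalArgumentException(
                "a public ID requires a system ID in XML 1.0");
        }
        this.elementName = elementName;
        this.publicID = publicID;
        this.systemID = systemID;
    }

    public DocTypeSketch(String elementName, String systemID) {
        this(elementName, null, systemID);
    }

    public DocTypeSketch(String elementName) {
        this(elementName, null, null);
    }

    // Assumes neither ID contains a double quote character.
    public String toDeclaration() {
        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(elementName);
        if (publicID != null) {
            sb.append(" PUBLIC \"").append(publicID)
              .append("\" \"").append(systemID).append('"');
        } else if (systemID != null) {
            sb.append(" SYSTEM \"").append(systemID).append('"');
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        System.out.println(new DocTypeSketch("html").toDeclaration());
        System.out.println(new DocTypeSketch("html", "").toDeclaration());
        System.out.println(new DocTypeSketch("html", "", "x.dtd").toDeclaration());
    }
}
```

With null meaning "absent", an empty system ID still produces a
SYSTEM clause and an empty public ID still produces a PUBLIC clause,
which is the distinction the current empty-string convention loses.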
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible, 2nd Edition (Hungry Minds, 2001) |
| http://www.ibiblio.org/xml/books/bible2/ |
| http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/ |
+----------------------------------+---------------------------------+