[jdom-interest] Internal DTD subset verification

Alex Rosen arosen at silverstream.com
Tue May 7 08:04:33 PDT 2002


> >It would mean *gasp* that there exists the possibility of a JDOM
> >document living in memory that could not be produced as an XML
> >document.
>
> This possibility is a major flaw in XInclude and a few other
> technologies now. It is causing implementors and users problems
> *today*.

Can you give some examples?

> If we let it into JDOM, it will cause JDOM problems too. We
> really, really do not want to allow this.

I've never figured out what real-world problems would come up more than
rarely if our well-formedness checking were less than 100% perfect.
As I mentioned, that doesn't seem to have slowed down DOM (or Xerces or
Crimson). The user of JDOM must take some responsibility for writing a
correct program. You want to protect them from their own XML ignorance, as
well as protecting you and me from having to deal with the result of that
ignorance. But even if we stamp out malformed documents, the user can still
create invalid documents, or even valid but semantically nonsensical
documents. We can't protect everyone from everything. Why draw this
particular line in the sand? The line most APIs draw is to protect users from
themselves when it's cheap but not when it's expensive, because otherwise
they'll use a different API. That's the line I would draw. If character
verification slows JDOM down by 10% or more in a benchmark, that'll make
some percentage of people not use it, which is just counterproductive.
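
To make the cost argument concrete, here is a minimal sketch of what
per-character verification involves. This is not JDOM's actual code: the
class and method names are hypothetical, and the only thing taken from
elsewhere is the set of ranges in the Char production of the XML 1.0 spec.

```java
// Hypothetical helper for illustration; not JDOM's real verification code.
final class XmlCharCheck {
    private XmlCharCheck() {}

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first illegal character, or -1 if none. */
    static int firstIllegalIndex(String text) {
        for (int i = 0; i < text.length(); ) {
            // codePointAt joins a valid surrogate pair into one code point;
            // a lone surrogate comes back as itself and fails isXmlChar.
            int cp = text.codePointAt(i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalIndex("plain text"));     // prints -1
        System.out.println(firstIllegalIndex("bad\u0001char"));  // prints 3
    }
}
```

A check like this is cheap per character, but if it runs over every text
node and attribute value as a document is built, that is presumably where
a measurable slowdown in a benchmark would come from.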

> A JDOM Document object represents an XML document. The only
> definition of an XML document is syntactic, a sequence of characters
> that adheres to certain constraints. If there is not a 1-1 mapping
> from Document objects onto well-formed character sequences, then the
> Document class does not properly model an XML document. It's
> modelling a superset of XML documents. And sooner or later (probably
> sooner) that's going to cause problems.

Well, the XML declaration is not modeled by a JDOM Document, and the
whitespace between attributes is not modeled by JDOM at all. And this isn't
even our fault; it's SAX's fault. (I'm not sure if this is actually relevant
to the discussion or if I'm just being nit-picky.)
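
A rough way to see the SAX point, using only the JAXP parser that ships
with the JDK (no JDOM involved): parse two documents that differ only in
the XML declaration and in the whitespace between attributes, and compare
the element events the handler receives. The class name SaxLossDemo and
the sample documents are made up for this sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; records what a SAX handler is actually told.
final class SaxLossDemo {
    private SaxLossDemo() {}

    /** Parses the XML and returns the element events reported by SAX. */
    static List<String> events(String xml) {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local,
                                     String qName, Attributes atts) {
                // Sort attributes so the comparison ignores reporting order.
                TreeMap<String, String> sorted = new TreeMap<>();
                for (int i = 0; i < atts.getLength(); i++) {
                    sorted.put(atts.getQName(i), atts.getValue(i));
                }
                out.add("start " + qName + " " + sorted);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                out.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String withDecl = "<?xml version=\"1.0\"?><doc x=\"1\"    y=\"2\"/>";
        String without = "<doc x=\"1\" y=\"2\"/>";
        System.out.println(events(withDecl).equals(events(without)));
    }
}
```

Both inputs produce the same list of events, so anything built purely
from these callbacks has no way to tell the two documents apart.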

Alex



