[jdom-interest] Fast Factory
Elliotte Rusty Harold
elharo at metalab.unc.edu
Sat May 24 04:16:52 PDT 2003
At 10:36 AM -0700 5/23/03, Dennis Sosnoski wrote:
>Ah, the "war on terrorism" as an API design criterion! Original, but
>not something I'd consider persuasive. Your hypothetical "malicious
>developers" have no need for any special API in order to create
>malformed documents - simple println calls are more than sufficient.
Yes, but I'm damned if I'm going to enable this, especially when the
reality is not so much maliciousness as cluelessness. On this list
and others I constantly hear from developers who want to generate and
accept malformed documents. These developers do not understand XML.
They need to be educated. It is good that they experience cognitive
dissonance so that they will ask questions about why they can't parse
their database dump that contains nulls (for example) so we can
explain to them why they shouldn't be doing what they think they need
to do and what they need to do instead.
>It's very broad-minded of you to be willing to accept that a
>sanctioned parser may justify bypassing this. Of course, this change
>in attitude comes only after you've already implemented this in your
>own XOM API, with a claimed substantial speedup.
Historically inaccurate, I'm afraid. I've been on record as being
willing to accept this for some time now, well before I actually
implemented it in XOM a couple of weeks ago.
> It does seem a very lax approach, though. Why not make a list of
>the parsers you consider sufficiently compliant with the XML
>recommendation? That way you could check in the code at runtime to
>see if one of those parsers is being used, and only turn off the
>checks if the parser is on your approved list?
I originally thought it would be OK to allow essentially any SAX
XMLReader. However, within 48 hours of making this change in XOM, one
XOM/JDOM user immediately piped up that he wanted the option to
perform verification on build in order to allow him to use some of
his own non-standard filters.
> Better yet, just refuse to work with any parser not on the approved
>list. Of course, then you open up the issue of malicious
>substitution of sabotaged parser code. :-)
I'm pretty much willing to accept any of the common parsers. The
issue that was raised with XOM is the case of SAX filters, which act
like an XMLReader but can do very weird things like generating
endElement() calls without matching startElement() calls or including
C0 controls in text. The same issue arises in custom readers that
aren't really XML parsers but instead do things like feed a database
query into a SAX ContentHandler to make it appear that the database
is an XML document, even when it really isn't. Not the most common
scenario, certainly, but common enough that we need to think about it.
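To make the filter problem concrete, here is a minimal sketch of the
kind of check a builder would need if it cannot trust its event
source. It covers only the element-balance case. The class name
ElementBalanceGuard is hypothetical and is not part of JDOM, XOM, or
SAX; only DefaultHandler, Attributes, and SAXException come from the
standard org.xml.sax packages.

```java
import java.util.ArrayDeque;
import java.util.Deque;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical guard: rejects endElement() events that do not close
// the most recently opened element. Not part of any real library.
public class ElementBalanceGuard extends DefaultHandler {

    // Qualified names of the elements that are currently open.
    private final Deque<String> open = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // A real parser never produces this; a filter or custom
        // reader might.
        if (open.isEmpty() || !open.pop().equals(qName)) {
            throw new SAXException(
                "endElement(" + qName + ") without matching startElement");
        }
    }

    @Override
    public void endDocument() throws SAXException {
        if (!open.isEmpty()) {
            throw new SAXException("unclosed element: " + open.peek());
        }
    }
}
```

A conforming parser makes every one of these checks redundant, which
is the argument for skipping them; a filter that invents events makes
them necessary again.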
I'm not sure how JDOM deals with it, but in XOM I rely heavily on
assumptions guaranteed by well-formedness, such as every Document has
exactly one root element and an Element's children can only have the
types Element, Comment, Text, or ProcessingInstruction. Thus I don't
have to waste a lot of code checking for the unusual case where
that's not true. I am able to rely on my class invariants. I don't
know if this makes the code faster or smaller, but it absolutely
makes the code cleaner and easier to read. Every public class and API
takes responsibility for verifying preconditions and maintaining
class invariants.
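As an illustration of what relying on a class invariant looks like,
here is a deliberately tiny sketch. SketchDocument is a made-up class,
not XOM's or JDOM's Document, and a plain String stands in for the
root element. The point is only that every public entry point checks
its precondition, so no other code ever has to handle a rootless
document.

```java
// Hypothetical class for illustration only; not the XOM or JDOM API.
public final class SketchDocument {

    // Invariant: never null and never empty once construction succeeds.
    private String rootName;

    public SketchDocument(String rootName) {
        setRoot(rootName);
    }

    // The root can be replaced but never removed, so the invariant
    // "exactly one root" holds after every public call.
    public void setRoot(String rootName) {
        if (rootName == null || rootName.isEmpty()) {
            throw new IllegalArgumentException(
                "a document must have exactly one root element");
        }
        this.rootName = rootName;
    }

    // Callers need no null check here: the invariant guarantees a root.
    public String getRootName() {
        return rootName;
    }
}
```

An unchecked back door into setRoot() would force every caller of
getRootName() to start testing for null again.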
>I have a hard time understanding the emotional weight this issue
>apparently carries for you. I'm not suggesting removing verification
>from JDOM, but only making it optional where practical. That would
>give developers the freedom to turn it off if they want, at the risk
>of shooting themselves in the foot. That's a tradeoff that most APIs
>provide and that Java developers generally seem to like.
The whole point of an API is to keep developers from shooting
themselves in the foot. A developer uses an API because the API
designer knows more about the problem domain than the client
programmer does. I'm glad the java.net APIs remove from me the
responsibility of knowing every last detail of IP packet formats. I
can rely on the API to do the work for me. Similarly in XML we must
not assume the client programmers know as much as we do. They are
looking to JDOM precisely so they don't have to know every last
detail of XML, such as which C0 characters they can and cannot use in
text nodes.
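For reference, the character rule in question is the Char production
of XML 1.0: tab, line feed, and carriage return are the only C0
controls allowed, and surrogates, U+FFFE, and U+FFFF are excluded. The
sketch below shows roughly what a text-node check involves. XmlText
and its methods are hypothetical names, not JDOM's actual verifier.

```java
// Hypothetical helper for illustration; not JDOM's or XOM's verifier.
public final class XmlText {

    private XmlText() {}

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the text unchanged if every code point is legal,
    // otherwise throws. Unpaired surrogates are rejected because
    // codePointAt() returns them as values in the D800-DFFF range.
    public static String verify(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXmlChar(cp)) {
                throw new IllegalArgumentException(String.format(
                    "0x%X is not a legal XML 1.0 character", cp));
            }
            i += Character.charCount(cp);
        }
        return text;
    }
}
```

A database dump containing a NUL fails this check, which is exactly
the moment the developer should be asking why.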
I have noticed that Java developers as a group tend to be relatively
uneducated about preconditions, postconditions, and class invariants
as opposed to, say, Eiffel or even C++ programmers. This probably has
to do with Java's lack of language level support for such features, a
lack of support that continues with the incredibly broken assertions
mechanism introduced in Java 1.4. However, the fact remains that
class invariants are at the core of data encapsulation, and are a
critical aspect of object oriented programming. Turning off class
invariants is at the same level as exposing fields as public,
directly accessible data. (Indeed the two practices go hand in hand.)
Now maybe you don't want to write object oriented programs, and
that's OK; but if we are going to design an object oriented API for
an object oriented language, then class invariants shouldn't be an
issue, any more than private fields.
--
Elliotte Rusty Harold
elharo at metalab.unc.edu
Processing XML with Java (Addison-Wesley, 2002)
http://www.cafeconleche.org/books/xmljava
http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA