[jdom-interest] Fast Factory

Elliotte Rusty Harold elharo at metalab.unc.edu
Sat May 24 04:16:52 PDT 2003


At 10:36 AM -0700 5/23/03, Dennis Sosnoski wrote:


>Ah, the "war on terrorism" as an API design criterion! Original, but 
>not something I'd consider persuasive. Your hypothetical "malicious 
>developers" have no need for any special API in order to create 
>malformed documents - simple println calls are more than sufficient.

Yes, but I'm damned if I'm going to enable this, especially when the 
reality is not so much maliciousness as cluelessness. On this list 
and others I constantly hear from developers who want to generate and 
accept malformed documents. These developers do not understand XML. 
They need to be educated. It is good that they experience cognitive 
dissonance so that they will ask questions about why they can't parse 
their database dump that contains nulls (for example) so we can 
explain to them why they shouldn't be doing what they think they need 
to do and what they need to do instead.

>It's very broad-minded of you to be willing to accept that a 
>sanctioned parser may justify bypassing this. Of course, this change 
>in attitude comes only after you've already implemented this in your 
>own XOM API, with a claimed substantial speedup.

Historically inaccurate, I'm afraid. I've been on record as being 
willing to accept this for some time now, well before I actually 
implemented it in XOM a couple of weeks ago.

>  It does seem a very lax approach, though. Why not make a list of 
>the parsers you consider sufficiently compliant with the XML 
>recommendation? That way you could check in the code at runtime to 
>see if one of those parsers is being used, and only turn off the 
>checks if the parser is on your approved list?

I originally thought it would be OK to allow essentially any SAX 
XMLReader. However, within 48 hours of making this change in XOM, one 
XOM/JDOM user piped up that he wanted the option to 
perform verification on build in order to allow him to use some of 
his own non-standard filters.

>  Better yet, just refuse to work with any parser not on the approved 
>list. Of course, then you open up the issue of malicious 
>substitution of sabotaged parser code. :-)

I'm pretty much willing to accept any of the common parsers. The 
issue that was raised with XOM is the case of SAX filters that act 
like an XMLReader but can do very weird things like generating 
endElement() calls without matching startElement() calls or including 
C0 controls in text. The same issue arises in custom readers that 
aren't really XML parsers but instead do things like feed a database 
query into a SAX ContentHandler to make it appear that the database 
is an XML document, even when it really isn't. Not the most common 
scenario, certainly, but common enough that we need to think about it.
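
To make the filter problem concrete, here is a minimal sketch of a 
check that catches unbalanced endElement() calls before they reach a 
builder. The class name BalanceCheckingFilter is hypothetical and is 
not part of XOM or JDOM; it only uses the standard SAX helper class 
XMLFilterImpl, and it assumes the reader reports qualified names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: rejects SAX event streams whose start and end
// tags do not nest properly. Not XOM's or JDOM's actual code.
class BalanceCheckingFilter extends XMLFilterImpl {

    // Qualified names of the elements that are currently open.
    private final Deque<String> open = new ArrayDeque<String>();

    @Override
    public void startDocument() throws SAXException {
        open.clear();
        super.startDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        open.push(qName);
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (open.isEmpty()) {
            throw new SAXException(
                "endElement(" + qName + ") without a matching startElement");
        }
        String expected = open.pop();
        if (!expected.equals(qName)) {
            throw new SAXException(
                "endElement(" + qName + ") but <" + expected + "> is open");
        }
        super.endElement(uri, localName, qName);
    }

    @Override
    public void endDocument() throws SAXException {
        if (!open.isEmpty()) {
            throw new SAXException("document ended with <" + open.peek()
                + "> still open");
        }
        super.endDocument();
    }
}
```

Such a filter would sit between the untrusted XMLReader or filter and 
the builder's ContentHandler. It covers only tag balance; illegal 
characters in text would need a separate check.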

I'm not sure how JDOM deals with it, but in XOM I rely heavily on 
assumptions guaranteed by well-formedness, such as every Document has 
exactly one root element and an Element's children can only have the 
types Element, Comment, Text, or ProcessingInstruction.  Thus I don't 
have to waste a lot of code checking for the unusual case where 
that's not true. I am able to rely on my class invariants. I don't 
know if this makes the code faster or smaller, but it absolutely 
makes the code cleaner and easier to read. Every public class and API 
takes responsibility for verifying preconditions and maintaining 
class invariants.
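
As an illustration of the "exactly one root element" invariant, here 
is a minimal sketch. SketchDocument and SketchElement are hypothetical 
names invented for this example; they are not the XOM or JDOM classes 
and leave out almost everything a real tree model needs.

```java
// Hypothetical classes illustrating one class invariant:
// a document always has exactly one, non-null root element.
final class SketchElement {
    private final String name;

    SketchElement(String name) {
        // Precondition checked at the public boundary.
        if (name == null || name.length() == 0) {
            throw new IllegalArgumentException("element name must be non-empty");
        }
        this.name = name;
    }

    String getName() {
        return name;
    }
}

final class SketchDocument {
    private SketchElement root; // invariant: never null

    SketchDocument(SketchElement root) {
        setRootElement(root);
    }

    // The only way to change the root; it cannot be removed, only replaced.
    void setRootElement(SketchElement newRoot) {
        if (newRoot == null) {
            throw new IllegalArgumentException("a document needs a root element");
        }
        this.root = newRoot;
    }

    // Callers may rely on a non-null result without checking.
    SketchElement getRootElement() {
        return root;
    }
}
```

Because every public entry point enforces the precondition, code that 
calls getRootElement() never needs a null check, which is the kind of 
simplification described above.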

>I have a hard time understanding the emotional weight this issue 
>apparently carries for you. I'm not suggesting removing verification 
>from JDOM, but only making it optional where practical. That would 
>give developers the freedom to turn it off if they want, at the risk 
>of shooting themselves in the foot. That's a tradeoff that most APIs 
>provide and that Java developers generally seem to like.

The whole point of an API is to keep developers from shooting 
themselves in the foot. A developer uses an API because the API 
designer knows more about the problem domain than the client 
programmer does. I'm glad the java.net APIs remove from me the 
responsibility of knowing every last detail of IP packet formats. I 
can rely on the API to do the work for me. Similarly in XML we must 
not assume the client programmers know as much as we do. They are 
looking to JDOM precisely so they don't have to know every last 
detail of XML like which C0 characters they can and cannot use in 
text nodes.
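
For reference, the rule about which characters may appear in text 
comes from the Char production of XML 1.0: tab, line feed, and 
carriage return are the only C0 controls allowed. A minimal sketch of 
that check follows. The class and method names are hypothetical and 
are not taken from JDOM's or XOM's verifier.

```java
// Hypothetical helper implementing the XML 1.0 Char production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
final class XmlChars {
    private XmlChars() {}

    // True if the Unicode code point is legal in XML 1.0 content.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the index of the first illegal character, or -1 if the
    // whole string is legal. Unpaired surrogates count as illegal.
    static int firstIllegalChar(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

A database dump containing a null character, as in the example 
earlier in this message, fails this check at the position of the 
null.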

I have noticed that Java developers as a group tend to be relatively 
uneducated about preconditions, postconditions, and class invariants 
as opposed to, say, Eiffel or even C++ programmers. This probably has 
to do with Java's lack of language level support for such features, a 
lack of support that continues with the incredibly broken assertions 
mechanism introduced in Java 1.4. However, the fact remains that 
class invariants are at the core of data encapsulation, and are a 
critical aspect of object oriented programming. Turning off class 
invariants is at the same level as exposing fields as public, 
directly accessible data. (Indeed the two practices go hand in hand.) 
Now maybe you don't want to write object oriented programs, and 
that's OK; but if we are going to design an object oriented API for 
an object oriented language, then class invariants shouldn't be an 
issue, any more than private fields.
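
One concrete property of Java 1.4 assertions that bears on this, 
though it may not be the only objection intended above: assert 
statements are skipped unless the JVM is started with -ea, so they 
cannot by themselves guard an invariant in a public API. The sketch 
below contrasts the two styles. CheckedName is a hypothetical class 
made up for this example.

```java
// Hypothetical class contrasting an always-on check with an
// assert-based one. Invariant: name is never null or empty.
final class CheckedName {
    private String name;

    CheckedName(String name) {
        setName(name);
    }

    // Explicit check: enforced on every call, whatever the JVM flags.
    void setName(String name) {
        if (name == null || name.length() == 0) {
            throw new IllegalArgumentException("name must be non-empty");
        }
        this.name = name;
    }

    // Assert-based check: only evaluated when assertions are enabled
    // (java -ea). Without that flag a bad value is stored silently.
    void setNameAssertOnly(String name) {
        assert name != null && name.length() > 0 : "name must be non-empty";
        this.name = name;
    }

    String getName() {
        return name;
    }
}
```

With setName() the invariant holds for every client; with 
setNameAssertOnly() it holds only for clients who remember to enable 
assertions.
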
-- 

   Elliotte Rusty Harold
   elharo at metalab.unc.edu
   Processing XML with Java (Addison-Wesley, 2002)
   http://www.cafeconleche.org/books/xmljava
   http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA


