[jdom-interest] RE: [xml-dev] SAX startDocument and endDocument when there's only a document fragment

Michael Kay michael.h.kay at ntlworld.com
Mon Nov 25 01:44:07 PST 2002


This is one of those edge cases where there's no easy answer, because
the different standards don't mesh well.

SAX is designed to handle well-formed documents; someone who writes a
ContentHandler is entitled to assume a particular sequence of calls,
essentially of the form

startDocument
  startElement
    {startElement, ..., endElement}*
  endElement
endDocument
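
A handler can check this assumption for itself. The following is only a
minimal sketch: RootCountingHandler and RootCountingDemo are hypothetical
names, not part of SAX or of any XSLT processor, and the demo fires events by
hand rather than through a transformer. It counts elements seen at depth zero,
so a result with more than one top-level element is detected.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of SAX: records whether the events received
// match the single-root shape shown above.
class RootCountingHandler extends DefaultHandler {
    private int depth = 0;
    private int topLevelElements = 0;
    private boolean started = false;
    private boolean ended = false;

    @Override public void startDocument() { started = true; }
    @Override public void endDocument() { ended = true; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth == 0) topLevelElements++;   // an element with no open parent
        depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) { depth--; }

    int getTopLevelElements() { return topLevelElements; }

    // True only for startDocument, exactly one balanced root, endDocument.
    boolean looksLikeDocument() {
        return started && ended && depth == 0 && topLevelElements == 1;
    }
}

class RootCountingDemo {
    public static void main(String[] args) {
        RootCountingHandler h = new RootCountingHandler();
        Attributes none = new AttributesImpl();
        // Simulate a well-balanced fragment with two top-level elements.
        h.startDocument();
        h.startElement("", "a", "a", none);
        h.endElement("", "a", "a");
        h.startElement("", "b", "b", none);
        h.endElement("", "b", "b");
        h.endDocument();
        System.out.println(h.getTopLevelElements() + " " + h.looksLikeDocument()); // prints "2 false"
    }
}
```

The sketch ignores top-level text, which a well-balanced fragment may also
contain; a fuller check would override characters() as well.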

XSLT is capable of outputting "well-balanced" documents (often called
fragments) that don't have this structure, and there is no obvious
solution to how such a document can be fed to a ContentHandler. Saxon
won't give an ill-formed document to a ContentHandler unless the
ContentHandler indicates that it is prepared to accept it. 

Saxon's protocol for getting this indication is extremely arcane,
because the only way a ContentHandler can communicate information to its
caller is through exceptions.

The solution to this mismatch between XSLT and SAX really belongs in
JAXP, which is probably where your message should have been directed. A
suitable solution might be a marker interface that a ContentHandler can
implement to indicate that it is prepared to handle well-balanced
document fragments.
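
To make the suggestion concrete, here is roughly what such a marker interface
might look like. Everything in this sketch is hypothetical: AcceptsFragments,
StrictHandler, LenientHandler and FragmentPolicy are names invented for
illustration, and nothing like them exists in SAX or JAXP today.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical marker interface (an assumption, not part of SAX or JAXP).
// It declares no methods; implementing it is the whole signal.
interface AcceptsFragments { }

// An ordinary handler that expects a single-root document.
class StrictHandler extends DefaultHandler { }

// A handler that declares it can cope with well-balanced fragments.
class LenientHandler extends DefaultHandler implements AcceptsFragments { }

// What a transformer could do before sending events to the handler.
final class FragmentPolicy {
    private FragmentPolicy() { }

    static boolean mayReceiveFragment(ContentHandler handler) {
        return handler instanceof AcceptsFragments;
    }
}
```

A processor could then test the handler once, before the first event, instead
of probing it with calls that may raise exceptions.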

Michael Kay





> -----Original Message-----
> From: Elliotte Rusty Harold [mailto:elharo at metalab.unc.edu] 
> Sent: 24 November 2002 14:57
> To: sax-devel at lists.sourceforge.net
> Cc: xsl-list at lists.mulberrytech.com; xml-dev at lists.xml.org; 
> jdom-interest at jdom.org
> Subject: [xml-dev] SAX startDocument and endDocument when 
> there's only a document fragment
> 
> 
> Apologies in advance for the cross-posting, but this is one of those 
> nasty problems that crosses the usual boundaries.
> 
> 1. Consider an XSLT transform that generates a document fragment 
> rather than a complete document; that is, there is no single root 
> element.
> 
> 2. Suppose TrAX is used to apply this transform and generate a result.
> 
> 3. Suppose the result is a JAXP SAXResult object; that is, it fires 
> the contents of the result into a user-specified ContentHandler.
> 
> Question: should the transformer fire startDocument() and 
> endDocument() events, even though this isn't a complete document, 
> only a document fragment?
> 
> 
> The SAX API doc is not absolutely clear on this point. However, my 
> interpretation is that yes, it should call startDocument() and 
> endDocument().
> 
> The JAXP spec does not appear to have anything relevant to 
> say about this.
> 
> In the course of adding XSLT support to JDOM, Laurent Bihanic has 
> discovered that different engines behave differently. In particular, 
> the Oracle XML Parser for Java does not call startDocument and 
> endDocument. This is a roadblock in adding full XSLT support to JDOM, 
> so some clarification would be appreciated.
> -- 
> 
> +-----------------------+------------------------+-------------------+
> | Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
> +-----------------------+------------------------+-------------------+
> |          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
> |              http://www.cafeconleche.org/books/xian2/              |
> |  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
> +----------------------------------+---------------------------------+
> |  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
> |  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
> +----------------------------------+---------------------------------+
> 




