[jdom-interest] ElementScanner - causing SAXHandler to mistake
nonroot element for root element
Laurent Bihanic
laurent.bihanic at atosorigin.com
Wed May 5 04:10:30 PDT 2004
Hi Richard,
Sorry for the long delay. I had a chance to look at your problem. Indeed, this
is a problem in ElementScanner and your analysis is correct.
> To fix, I removed the if (this.activeRules.size() != 0) test that contained
> the startElement() call to XMLScanner, so that it always propagates the
> event to the SAXHandler.
Your proposed fix, always propagating the startElement events to SAXHandler,
is quite dangerous: it forces SAXHandler to build a full JDOM document from the
parser output (which is precisely what ElementScanner aims to avoid).
Thus, I think we should keep the "if (this.activeRules.size() != 0)" test to
support extracting a few nodes from huge documents while using as little memory
as possible.
Attached is another patch proposal: instead of using SAXHandler directly, it
relies on a subclass (FragmentHandler, borrowed from JDOMResult) that inserts
a dummy root element in SAXHandler's document.
This guarantees that, whatever your matching rules, SAXHandler will always
see a single root element.
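
To make the failure and the workaround concrete, here is a minimal sketch that
uses only the JDK's SAX parser. It is not the attached patch and does not use
JDOM: TinyBuilder and TinyScanner are hypothetical stand-ins for SAXHandler and
ElementScanner, and the path matching is a simplification I assume for
illustration. The builder rejects a second root element; pre-pushing a dummy
root lets sibling matches such as /blah/huh and /blah/blam coexist.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class DummyRootSketch {

    /** Hypothetical stand-in for SAXHandler: accepts only one root element. */
    static class TinyBuilder {
        private final Deque<String> open = new ArrayDeque<>();
        private boolean rootSeen = false;

        TinyBuilder(boolean withDummyRoot) {
            if (withDummyRoot) {
                // Same idea as FragmentHandler: a root that is never closed.
                open.push("root");
                rootSeen = true;
            }
        }

        void startElement(String name) throws SAXException {
            if (open.isEmpty()) {
                if (rootSeen) {
                    throw new SAXException("multiple root elements detected");
                }
                rootSeen = true;
            }
            open.push(name);
        }

        void endElement() {
            open.pop();
        }
    }

    /** Hypothetical stand-in for ElementScanner: forwards events only inside a match. */
    static class TinyScanner extends DefaultHandler {
        private final Set<String> rules;
        private final TinyBuilder builder;
        private final Deque<String> path = new ArrayDeque<>();
        private int activeDepth = 0;
        final List<String> matched = new ArrayList<>();

        TinyScanner(Set<String> rules, TinyBuilder builder) {
            this.rules = rules;
            this.builder = builder;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            path.addLast(qName);
            String current = "/" + String.join("/", path);
            if (activeDepth > 0 || rules.contains(current)) {
                builder.startElement(qName); // may throw on a second root
                if (activeDepth == 0) {
                    matched.add(current);
                }
                activeDepth++;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (activeDepth > 0) {
                activeDepth--;
                builder.endElement();
            }
            path.removeLast();
        }
    }

    /** Parses the sample document from this thread and reports the outcome. */
    static String run(boolean withDummyRoot) {
        String xml = "<blah><huh>1234</huh><blam><yay>woohoo</yay></blam>"
                   + "<blam><yay>mwuhahaha</yay></blam><nah>5678</nah></blah>";
        TinyScanner scanner = new TinyScanner(
                Set.of("/blah/huh", "/blah/blam"), new TinyBuilder(withDummyRoot));
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), scanner);
            return "ok " + scanner.matched;
        } catch (SAXException e) {
            return "failed: " + e.getMessage();
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without dummy root: " + run(false));
        System.out.println("with dummy root:    " + run(true));
    }
}
```

Note that the scanner still forwards nothing outside the matched elements, so
the memory-saving behaviour of the activeRules test is preserved in the sketch.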
What do you think?
Laurent
Richard Allen wrote:
> Hi All,
>
> With the following XML:
> <blah>
> <huh>1234</huh>
> <blam>
> <yay>woohoo</yay>
> </blam>
> <blam>
> <yay>mwuhahaha</yay>
> </blam>
> <nah>5678</nah>
> </blah>
>
> And listeners on the following:
> /blah/huh
> /blah/blam
>
> The /blah/huh element is processed sweet as..
> But when the /blah/blam element is being processed, the
> SAXHandler.startElement() throws the following exception:
>
> org.xml.sax.SAXException: Ill-formed XML document (multiple root elements detected)
> at org.jdom.input.SAXHandler.getCurrentElement(SAXHandler.java:906)
> at org.jdom.input.SAXHandler.startElement(SAXHandler.java:553)
> at org.jdom.contrib.input.scanner.ElementScanner.startElement(ElementScanner.java:554)
>
> This is a bit weird, given that the //blam element isn't the root element
> ;-)
>
> The problem is that the XMLScanner is not being notified until after the
> first element that contains active rules has been found.
> This causes SAXHandler to think that the /blah/huh element is actually the
> root.
> When the ElementScanner notifies SAXHandler of the /blah/blam element it
> throws a hissy fit as it has already ended what it thinks is the root
> element ;-)
>
> To fix, I removed the if (this.activeRules.size() != 0) test that contained
> the startElement() call to XMLScanner, so that it always propagates the
> event to the SAXHandler.
>
> Comments appreciated as to whether this fix is the ideal fix, or if there
> is a better way to fix this problem.
> cheers,
> Rich
-------------- next part --------------
Index: ElementScanner.java
===================================================================
RCS file: /home/cvspublic/jdom-contrib/src/java/org/jdom/contrib/input/scanner/ElementScanner.java,v
retrieving revision 1.11
diff -u -r1.11 ElementScanner.java
--- ElementScanner.java 28 Feb 2004 03:47:08 -0000 1.11
+++ ElementScanner.java 5 May 2004 12:53:26 -0000
@@ -707,7 +707,7 @@
//----------------------------------------------------------------------
protected SAXHandler createContentHandler() {
- return (new SAXHandler(new EmptyDocumentFactory(getFactory())));
+ return (new FragmentHandler(new EmptyDocumentFactory(getFactory())));
}
//----------------------------------------------------------------------
@@ -768,6 +768,31 @@
}
//-------------------------------------------------------------------------
+ // FragmentHandler nested class
+ //-------------------------------------------------------------------------
+
+ /**
+ * FragmentHandler extends SAXHandler to support matching nodes
+ * without a common ancestor. This class inserts a dummy root
+ * element in the being-built document. This prevents the document
+ * to have, from SAXHandler's point of view, multiple root
+ * elements (which would cause the parse to fail).
+ */
+ private static class FragmentHandler extends SAXHandler {
+ /**
+ * Public constructor.
+ */
+ public FragmentHandler(JDOMFactory factory) {
+ super(factory);
+
+ // Add a dummy root element to the being-built document as XSL
+ // transformation can output node lists instead of well-formed
+ // documents.
+ this.pushElement(new Element("root", null, null));
+ }
+ }
+
+ //-------------------------------------------------------------------------
// EmptyDocumentFactory nested class
//-------------------------------------------------------------------------