<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:times new roman, new york, times, serif;font-size:12pt"><div style="font-family: times new roman,new york,times,serif; font-size: 12pt;">Laurent,<br><br>Commenting out those lines results in an exception:<br><br>Exception in thread "main" org.xml.sax.SAXException: Ill-formed XML document (multiple root elements detected)<br> at org.jdom.input.SAXHandler.getCurrentElement(SAXHandler.java:918)<br> at org.jdom.input.SAXHandler.startElement(SAXHandler.java:556)<br> at org.jdom.contrib.input.scanner.ElementScanner.startElement(ElementScanner.java:548)<br> at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:533)<br> at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:330)<br> at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1693)<br> at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)<br> at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)<br> at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)<br> at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)<br> at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)<br> at
org.xml.sax.helpers.XMLFilterImpl.parse(XMLFilterImpl.java:333)<br> at org.jdom.contrib.input.scanner.ElementScanner.parse(ElementScanner.java:442)<br><br>It looks like that change causes SAXHandler to add multiple root elements to the document. I tried tracking this down but had no luck finding the problem, and I don't have more time to dedicate to it. I think it has something to do with the isRoot flag in SAXHandler not being reset.<br><br>-Brian<br><br><div style="font-family: times new roman,new york,times,serif; font-size: 12pt;">----- Original Message ----<br>From: Laurent Bihanic <laurent.bihanic@atosorigin.com><br>To: Brian Nahas <briannahas@yahoo.com><br>Cc: jdom-interest@jdom.org<br>Sent: Tuesday, November 14, 2006 6:53:40 PM<br>Subject: Re: [jdom-interest] ElementScanner and Memory<br><br><div>Hi,<br><br>Yes, this looks like a bug.<br><br>ElementScanner relies on the EmptyDocument and EmptyDocumentFactory nested <br>classes to
prevent any document from being built.<br>So, something has gone wrong here!<br><br>Comparing the current code (from CVS) and the original code (2002), I think <br>the problem may come from FragmentHandler, which was imported from JDOMResult <br>as a replacement for the original ElementBuilder.<br><br>The following statement was imported:<br> // Add a dummy root element to the being-built document as XSL<br> // transformation can output node lists instead of well-formed<br> // documents.<br> this.pushElement(new Element("root", null, null));<br><br>It makes sense for JDOMResult (the comment explains why) but not here.<br><br>I suspect the root element it adds allows SAXHandler to attach the built <br>Elements, hence causing the memory
leak.<br><br>Could you remove these lines from FragmentHandler's constructor and verify that this <br>fixes the problem?<br><br>Laurent<br><br><br>Brian Nahas wrote:<br>> I have a 1.2 GB XML file I need to parse. Since it's nicely <br>> partitioned, I planned on using ElementScanner from the contrib package <br>> to only load one item at a time. Here's an equivalent schema:<br>> <br>> <data><br>> <item>...</item><br>> <item>...</item><br>> <item>...</item><br>> ...<br>> </data><br>> <br>> The path I'm using for my listener is "/data/item".<br>> <br>> I assumed any previous items would be released by the parser upon <br>> completion. ElementScanner was very simple to set up to handle this, <br>> but I ran into an OutOfMemory error on my first
try. I was a little <br>> confused as I thought ElementScanner was specifically designed to <br>> prevent this. Upon investigation, I found that the SAXHandler used by <br>> the ElementScanner was holding onto the previous items after I was done <br>> with them. It adds them to the default root element that <br>> FragmentHandler creates and nothing removes them after the listeners are <br>> called. This seems to be in direct conflict with this message I found <br>> which states that ElementScanner doesn't build a document (this message <br>> is fairly old though):<br>> <br>> <a target="_blank" href="http://www.servlets.com/archive/servlet/ReadMsg?msgId=350607&listName=jdom-interest">http://www.servlets.com/archive/servlet/ReadMsg?msgId=350607&listName=jdom-interest</a><br>> <br>> I worked around this by explicitly detaching the element in my listener <br>> when I was done with it, but since this seems like a common <br>> pattern and a subtle trap, I thought I'd ask whether I was missing <br>> some setting or using ElementScanner improperly. There's a namespace <br>> declared on the data element so I don't know if that has something to do <br>> with it.<br>> <br>> Thanks,<br>> -Brian<br></div></div><br></div></div></body></html>