org.jdom.contrib.input.scanner
Class ElementScanner

java.lang.Object
  extended by org.xml.sax.helpers.XMLFilterImpl
      extended by org.jdom.contrib.input.scanner.ElementScanner
All Implemented Interfaces:
org.xml.sax.ContentHandler, org.xml.sax.DTDHandler, org.xml.sax.EntityResolver, org.xml.sax.ErrorHandler, org.xml.sax.XMLFilter, org.xml.sax.XMLReader

public class ElementScanner
extends org.xml.sax.helpers.XMLFilterImpl

An XML filter that uses XPath-like expressions to select the element nodes to build and notifies listeners when these elements become available during the parse.

ElementScanner does not aim to parse XML documents faster. Its primary focus is to let the application control the parse and consume the XML data while it is being parsed. ElementScanner can be viewed as a high-level SAX parser that fires events conveying JDOM elements rather than XML tags and character data.

ElementScanner only reports the parsing of element nodes. It does not report DOCTYPE data, processing instructions or comments, except for those present within the selected elements. Applications needing such data should register a ContentHandler on this filter to receive them in the form of raw SAX events.
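The following is a minimal sketch of receiving raw SAX events through a ContentHandler registered on a filter. It uses a plain XMLFilterImpl as a stand-in for ElementScanner (which extends it), so it runs with the JDK alone; the class and method names (RawEventsSketch, piTargets) are illustrative and not part of the library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

final class RawEventsSketch {
    /** Collects the targets of all processing instructions found in xml. */
    static List<String> piTargets(String xml) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Assumption: XMLFilterImpl stands in for ElementScanner here.
        XMLFilterImpl filter = new XMLFilterImpl(parser);
        List<String> targets = new ArrayList<>();
        // The ContentHandler registered on the filter receives the raw SAX events.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void processingInstruction(String target, String data) {
                targets.add(target);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return targets;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(piTargets("<?style sheet?><root><?pi data?></root>"));
    }
}
```

With an actual ElementScanner, the same setContentHandler call would presumably be made on the scanner instance before invoking parse.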

To be notified of the parsing of JDOM Elements, an application shall register objects implementing the ElementListener interface. For each registration, an XPath-like expression defines the elements to be parsed and reported.

Unlike XPath, ElementScanner has no concept of current context or current node. The syntax of the "XPath-like expressions" is therefore less strict than in XPath and closer to what one uses in the match specification of XSLT templates:
In ElementScanner, the expression "x" matches any element named "x" at any level of the document and not only the root element (as expected in strict XPath if the document is considered the current context). Thus, in ElementScanner, "x" is equivalent to "//x".

Example:

  ElementScanner f = new ElementScanner();

  // All descendants of x named y
  f.addElementListener(new MyImpl(), "x//y");
  // All grandchildren of y named t
  f.addElementListener(new MyImpl(), "y/*/t");

  ElementListener l2 = new MyImpl2();
  f.addElementListener(l2, "/*");     // Root element
  f.addElementListener(l2, "z");      // Any node named z

  ElementListener l3 = new MyImpl3();
  // Any node having an attribute "name" whose value contains ".1"
  f.addElementListener(l3, "*[contains(@name,'.1')]");
  // Any node named y having at least one "y" descendant
  f.addElementListener(l3, "y[.//y]");

  f.parse(new InputSource("test.xml"));
  

The XPath interpreter can be changed (see XPathMatcher). The default implementation is a mix of the Jakarta RegExp package and the Jaxen XPath interpreter.

ElementScanner splits XPath expressions into two parts: a node selection pattern and an optional test expression (the part of the XPath between the square brackets that follow the node selection pattern).
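To make the two parts concrete, here is a hedged sketch of such a split. It is not the library's own code: the class name PatternSplitSketch and the bracket-counting approach are assumptions, and brackets inside quoted string literals are not handled.

```java
final class PatternSplitSketch {
    /**
     * Splits an expression such as "y[.//y]" into {"y", ".//y"}.
     * The second entry is null when there is no trailing [test].
     */
    static String[] split(String expression) {
        String expr = expression.trim();
        if (!expr.endsWith("]")) {
            return new String[] { expr, null };
        }
        // Walk backwards to find the '[' that opens the trailing test,
        // allowing for nested brackets inside the test itself.
        int depth = 0;
        for (int i = expr.length() - 1; i >= 0; i--) {
            char c = expr.charAt(i);
            if (c == ']') {
                depth++;
            } else if (c == '[') {
                depth--;
                if (depth == 0) {
                    return new String[] {
                        expr.substring(0, i),
                        expr.substring(i + 1, expr.length() - 1)
                    };
                }
            }
        }
        throw new IllegalArgumentException("Unbalanced brackets: " + expression);
    }

    public static void main(String[] args) {
        String[] parts = split("*[contains(@name,'.1')]");
        System.out.println("selection=" + parts[0] + " test=" + parts[1]);
    }
}
```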

Regular expressions are used to match nodes against the node selection pattern. This allows nodes to be matched without having to build them first (as Jaxen requires).
If a test expression appears in an XPath expression, Jaxen is used to match the built elements against it and filter out those not matching the test.
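As an illustration of matching by regular expression, the sketch below translates a node selection pattern into a java.util.regex pattern and applies it to an element path such as "/root/x/a/y". Several things here are assumptions: the path representation, the translation rules, and the use of java.util.regex in place of the Jakarta RegExp package. Namespace prefixes and test expressions are ignored.

```java
import java.util.regex.Pattern;

final class PathRegexSketch {
    /** Translates a node selection pattern into a regex over element paths. */
    static Pattern compile(String selection) {
        // There is no current node, so a relative pattern may match at any depth.
        String p = selection.startsWith("/") ? selection : "//" + selection;
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < p.length()) {
            if (p.startsWith("//", i)) {
                regex.append("/(?:[^/]+/)*");   // zero or more intermediate levels
                i += 2;
            } else if (p.charAt(i) == '/') {
                regex.append('/');               // exactly one level down
                i++;
            } else {
                int end = p.indexOf('/', i);
                if (end < 0) {
                    end = p.length();
                }
                String step = p.substring(i, end);
                // "*" matches any single element name; other steps match literally.
                regex.append(step.equals("*") ? "[^/]+" : Pattern.quote(step));
                i = end;
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns true if the whole element path matches the selection pattern. */
    static boolean matches(String selection, String elementPath) {
        return compile(selection).matcher(elementPath).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("x//y", "/root/x/a/y"));
        System.out.println(matches("/*", "/root/z"));
    }
}
```

Because only the path of element names is needed, a match can be decided as soon as a start tag is seen, before any element content has been built.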

As a consequence of using regular expressions, the "or" operator ("|" in XPath) is not supported in node selection patterns. The same effect can be achieved by registering the same listener several times with different node patterns.
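One possible way to apply this workaround is to split a union expression into its alternatives and register each one separately. The helper below is a hypothetical sketch (OrPatternSketch and alternatives are not part of the library); it splits only on "|" characters outside square brackets and does not account for quoted string literals.

```java
import java.util.ArrayList;
import java.util.List;

final class OrPatternSketch {
    /** Splits "a | b[...]" into its top-level alternatives. */
    static List<String> alternatives(String expression) {
        List<String> parts = new ArrayList<>();
        int depth = 0;   // nesting level of square brackets
        int start = 0;   // start index of the current alternative
        for (int i = 0; i < expression.length(); i++) {
            char c = expression.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
            } else if (c == '|' && depth == 0) {
                parts.add(expression.substring(start, i).trim());
                start = i + 1;
            }
        }
        parts.add(expression.substring(start).trim());
        return parts;
    }

    public static void main(String[] args) {
        for (String pattern : alternatives("x//y | z[@id|@name]")) {
            // With a real scanner, each alternative would be registered as:
            //   scanner.addElementListener(listener, pattern);
            System.out.println(pattern);
        }
    }
}
```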

Note: The methods marked with "[ContentHandler interface support]" below shall not be invoked by the application. Their usage is reserved to the XML parser.

Author:
Laurent Bihanic

Constructor Summary
ElementScanner()
          Constructs an ElementScanner with no parent.
ElementScanner(org.xml.sax.XMLReader parent)
          Constructs an ElementScanner with the specified parent.
 
Method Summary
 void addElementListener(ElementListener listener, java.lang.String pattern)
          Adds a new element listener to the list of listeners maintained by this filter.
 void characters(char[] ch, int start, int length)
          [ContentHandler interface support] Receives notification of character data.
 void endDocument()
          [ContentHandler interface support] Receives notification of the end of a document.
 void endElement(java.lang.String nsUri, java.lang.String localName, java.lang.String qName)
          [ContentHandler interface support] Receives notification of the end of an element.
 void endPrefixMapping(java.lang.String prefix)
          [ContentHandler interface support] Ends the scope of a prefix-URI Namespace mapping.
 void ignorableWhitespace(char[] ch, int start, int length)
          [ContentHandler interface support] Receives notification of ignorable whitespace in element content.
 void parse(org.xml.sax.InputSource source)
          Parses an XML document.
 void processingInstruction(java.lang.String target, java.lang.String data)
          [ContentHandler interface support] Receives notification of a processing instruction.
 void removeElementListener(ElementListener listener, java.lang.String pattern)
          Removes element listeners from the list of listeners maintained by this filter.
 void setExpandEntities(boolean expand)
          Sets whether or not to expand entities for the builder.
 void setFactory(org.jdom.JDOMFactory factory)
          Sets a custom JDOMFactory for the builder.
 void setFeature(java.lang.String name, boolean state)
          Sets the state of a feature.
 void setIgnoringElementContentWhitespace(boolean ignoringWhite)
          Specifies whether or not the parser should eliminate whitespace in element content (sometimes known as "ignorable whitespace") when building the document.
 void setProperty(java.lang.String name, java.lang.Object value)
          Sets the value of a property.
 void setValidation(boolean validate)
          Activates or deactivates validation for the builder.
 void skippedEntity(java.lang.String name)
          [ContentHandler interface support] Receives notification of a skipped entity.
 void startDocument()
          [ContentHandler interface support] Receives notification of the beginning of a document.
 void startElement(java.lang.String nsUri, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes attrs)
          [ContentHandler interface support] Receives notification of the beginning of an element.
 void startPrefixMapping(java.lang.String prefix, java.lang.String uri)
          [ContentHandler interface support] Begins the scope of a prefix-URI Namespace mapping.
 
Methods inherited from class org.xml.sax.helpers.XMLFilterImpl
error, fatalError, getContentHandler, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getParent, getProperty, notationDecl, parse, resolveEntity, setContentHandler, setDocumentLocator, setDTDHandler, setEntityResolver, setErrorHandler, setParent, unparsedEntityDecl, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ElementScanner

public ElementScanner()
Constructs an ElementScanner with no parent.

If no parent has been assigned when parse(org.xml.sax.InputSource) is invoked, ElementScanner will use JAXP to get an instance of the default SAX parser installed.
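For reference, obtaining the default SAX parser through JAXP typically looks like the sketch below. This shows the standard JAXP calls only; which factory settings ElementScanner applies internally is not stated here, so the factory is left at its defaults. The class name DefaultParserSketch is illustrative.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

final class DefaultParserSketch {
    /** Returns an XMLReader from the JAXP-configured default SAX parser. */
    static XMLReader defaultReader() throws ParserConfigurationException, SAXException {
        // Factory left at its defaults; an actual implementation may configure it.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(defaultReader().getClass().getName());
    }
}
```

A reader obtained this way can also be passed explicitly to the ElementScanner(org.xml.sax.XMLReader) constructor when the application wants to configure the parser itself.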


ElementScanner

public ElementScanner(org.xml.sax.XMLReader parent)
Constructs an ElementScanner with the specified parent.

Method Detail

addElementListener

public void addElementListener(ElementListener listener,
                               java.lang.String pattern)
                        throws org.jdom.JDOMException
Adds a new element listener to the list of listeners maintained by this filter.

The same listener can be registered several times using different patterns and several listeners can be registered using the same pattern.

Parameters:
listener - the element listener to add.
pattern - the XPath expression to select the elements the listener is interested in.
Throws:
org.jdom.JDOMException - if listener is null or the expression is invalid.

removeElementListener

public void removeElementListener(ElementListener listener,
                                  java.lang.String pattern)
Removes element listeners from the list of listeners maintained by this filter.

If pattern is null, this method removes all registrations of listener, regardless of the pattern(s) used to create the registrations.

If listener is null, this method removes all listeners registered for pattern.

If both listener and pattern are null, this method performs no action.

Parameters:
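listener - the element listener to remove, or null to remove all listeners registered for pattern.
pattern - the XPath expression the listener was registered with, or null to remove all registrations of listener.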
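The null-handling rules of removeElementListener can be modelled with a small registry, sketched below. This is an illustrative model rather than the class's actual data structure: ListenerRegistrySketch is a hypothetical name, and comparing listeners by identity is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class ListenerRegistrySketch<L> {
    /** One (listener, pattern) registration. */
    private static final class Registration<T> {
        final T listener;
        final String pattern;

        Registration(T listener, String pattern) {
            this.listener = listener;
            this.pattern = pattern;
        }
    }

    private final List<Registration<L>> registrations = new ArrayList<>();

    void add(L listener, String pattern) {
        registrations.add(new Registration<>(listener, pattern));
    }

    void remove(L listener, String pattern) {
        if (listener == null && pattern == null) {
            return;   // both null: no action
        }
        // A null argument acts as a wildcard for that half of the registration.
        registrations.removeIf(r ->
                (listener == null || r.listener == listener)
                && (pattern == null || r.pattern.equals(pattern)));
    }

    int size() {
        return registrations.size();
    }
}
```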

setFactory

public void setFactory(org.jdom.JDOMFactory factory)
Sets a custom JDOMFactory for the builder. Use this to build the tree with your own subclasses of the JDOM classes.

Parameters:
factory - JDOMFactory to use.

setValidation

public void setValidation(boolean validate)
Activates or deactivates validation for the builder.

Parameters:
validate - whether XML validation should occur.

setIgnoringElementContentWhitespace

public void setIgnoringElementContentWhitespace(boolean ignoringWhite)
Specifies whether or not the parser should eliminate whitespace in element content (sometimes known as "ignorable whitespace") when building the document. Only whitespace contained within element content that has an element-only content model will be eliminated (see XML Rec 3.2.1). This setting takes effect only when validation is turned on.

The default value is false.

Parameters:
ignoringWhite - whether to ignore ignorable whitespace.

setExpandEntities

public void setExpandEntities(boolean expand)
Sets whether or not to expand entities for the builder.

A value of true means to expand entities as normal content; false means to leave entities unexpanded as EntityRef objects.

The default value is true.

Parameters:
expand - whether entity expansion should occur.

setFeature

public void setFeature(java.lang.String name,
                       boolean state)
                throws org.xml.sax.SAXNotRecognizedException,
                       org.xml.sax.SAXNotSupportedException
Sets the state of a feature.

Specified by:
setFeature in interface org.xml.sax.XMLReader
Overrides:
setFeature in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
name - the feature name, which is a fully-qualified URI.
state - the requested state of the feature.
Throws:
org.xml.sax.SAXNotRecognizedException - when the XMLReader does not recognize the feature name.
org.xml.sax.SAXNotSupportedException - when the XMLReader recognizes the feature name but cannot set the requested value.
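A short sketch of how a caller might handle the two exceptions follows. It uses a plain XMLFilterImpl rather than ElementScanner, so the behaviour shown is that of the JDK's SAX classes; the helper name trySet and the example.org feature URI are made up for illustration.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

final class FeatureSketch {
    /** Attempts to set a feature and reports the outcome as a short string. */
    static String trySet(XMLReader reader, String name, boolean state) {
        try {
            reader.setFeature(name, state);
            return "set";
        } catch (SAXNotRecognizedException e) {
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Assumption: XMLFilterImpl stands in for ElementScanner.
        XMLReader filter = new XMLFilterImpl(parser);
        System.out.println(trySet(filter, "http://xml.org/sax/features/namespaces", true));
        System.out.println(trySet(filter, "http://example.org/no-such-feature", true));
    }
}
```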

setProperty

public void setProperty(java.lang.String name,
                        java.lang.Object value)
                 throws org.xml.sax.SAXNotRecognizedException,
                        org.xml.sax.SAXNotSupportedException
Sets the value of a property.

Specified by:
setProperty in interface org.xml.sax.XMLReader
Overrides:
setProperty in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
name - the property name, which is a fully-qualified URI.
value - the requested value for the property.
Throws:
org.xml.sax.SAXNotRecognizedException - when the XMLReader does not recognize the property name.
org.xml.sax.SAXNotSupportedException - when the XMLReader recognizes the property name but cannot set the requested value.

parse

public void parse(org.xml.sax.InputSource source)
           throws java.io.IOException,
                  org.xml.sax.SAXException
Parses an XML document.

The application can use this method to instruct ElementScanner to begin parsing an XML document from any valid input source (a character stream, a byte stream, or a URI).

Applications may not invoke this method while a parse is in progress. Once a parse is complete, an application may reuse the same ElementScanner object, possibly with a different input source.

This method is synchronous: it will not return until parsing has ended. If a client application wants to terminate parsing early, it should throw an exception.

Specified by:
parse in interface org.xml.sax.XMLReader
Overrides:
parse in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
source - the input source for the XML document.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.
java.io.IOException - an IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
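Terminating a parse early by throwing an exception can be sketched as follows. The example uses a plain XMLFilterImpl subclass so that it runs with the JDK alone; with ElementScanner the exception would instead be thrown from application code invoked during the parse. EarlyStopSketch, StopParsing and countUpTo are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

final class EarlyStopSketch {
    /** Marker exception so a deliberate stop can be told apart from real errors. */
    static final class StopParsing extends SAXException {
        StopParsing() {
            super("parse stopped by the application");
        }
    }

    /** Counts start tags and aborts the parse once limit elements have been seen. */
    static int countUpTo(String xml, int limit) throws Exception {
        int[] seen = {0};
        XMLFilterImpl filter = new XMLFilterImpl(
                SAXParserFactory.newInstance().newSAXParser().getXMLReader()) {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) throws SAXException {
                if (++seen[0] >= limit) {
                    throw new StopParsing();
                }
            }
        };
        try {
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Some parsers wrap handler exceptions, so check the cause as well.
            boolean deliberate = e instanceof StopParsing
                    || e.getException() instanceof StopParsing;
            if (!deliberate) {
                throw e;
            }
        }
        return seen[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countUpTo("<a><b/><c/><d/></a>", 2));
    }
}
```

Using a dedicated exception type keeps a deliberate stop distinguishable from genuine parse errors, which are rethrown unchanged.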

startDocument

public void startDocument()
                   throws org.xml.sax.SAXException
[ContentHandler interface support] Receives notification of the beginning of a document.

Specified by:
startDocument in interface org.xml.sax.ContentHandler
Overrides:
startDocument in class org.xml.sax.helpers.XMLFilterImpl
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

endDocument

public void endDocument()
                 throws org.xml.sax.SAXException
[ContentHandler interface support] Receives notification of the end of a document.

Specified by:
endDocument in interface org.xml.sax.ContentHandler
Overrides:
endDocument in class org.xml.sax.helpers.XMLFilterImpl
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

startPrefixMapping

public void startPrefixMapping(java.lang.String prefix,
                               java.lang.String uri)
                        throws org.xml.sax.SAXException
[ContentHandler interface support] Begins the scope of a prefix-URI Namespace mapping.

Specified by:
startPrefixMapping in interface org.xml.sax.ContentHandler
Overrides:
startPrefixMapping in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
prefix - the Namespace prefix being declared.
uri - the Namespace URI the prefix is mapped to.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

endPrefixMapping

public void endPrefixMapping(java.lang.String prefix)
                      throws org.xml.sax.SAXException
[ContentHandler interface support] Ends the scope of a prefix-URI Namespace mapping.

Specified by:
endPrefixMapping in interface org.xml.sax.ContentHandler
Overrides:
endPrefixMapping in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
prefix - the prefix that was being mapped.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

startElement

public void startElement(java.lang.String nsUri,
                         java.lang.String localName,
                         java.lang.String qName,
                         org.xml.sax.Attributes attrs)
                  throws org.xml.sax.SAXException
[ContentHandler interface support] Receives notification of the beginning of an element.

Specified by:
startElement in interface org.xml.sax.ContentHandler
Overrides:
startElement in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
nsUri - the Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
localName - the local name (without prefix), or the empty string if Namespace processing is not being performed.
qName - the qualified name (with prefix), or the empty string if qualified names are not available.
attrs - the attributes attached to the element. If there are no attributes, it shall be an empty Attributes object.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

endElement

public void endElement(java.lang.String nsUri,
                       java.lang.String localName,
                       java.lang.String qName)
                throws org.xml.sax.SAXException
[ContentHandler interface support] Receives notification of the end of an element.

Specified by:
endElement in interface org.xml.sax.ContentHandler
Overrides:
endElement in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
nsUri - the Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
localName - the local name (without prefix), or the empty string if Namespace processing is not being performed.
qName - the qualified name (with prefix), or the empty string if qualified names are not available.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
[ContentHandler interface support] Receives notification of character data.

Specified by:
characters in interface org.xml.sax.ContentHandler
Overrides:
characters in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
ch - the characters from the XML document.
start - the start position in the array.
length - the number of characters to read from the array.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
                         throws org.xml.sax.SAXException
[ContentHandler interface support] Receives notification of ignorable whitespace in element content.

Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Overrides:
ignorableWhitespace in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
ch - the characters from the XML document.
start - the start position in the array.
length - the number of characters to read from the array.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

processingInstruction

public void processingInstruction(java.lang.String target,
                                  java.lang.String data)
                           throws org.xml.sax.SAXException
[ContentHandler interface support] Receives notification of a processing instruction.

Specified by:
processingInstruction in interface org.xml.sax.ContentHandler
Overrides:
processingInstruction in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
target - the processing instruction target.
data - the processing instruction data, or null if none was supplied.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.

skippedEntity

public void skippedEntity(java.lang.String name)
                   throws org.xml.sax.SAXException
[ContentHandler interface support] Receives notification of a skipped entity.

Specified by:
skippedEntity in interface org.xml.sax.ContentHandler
Overrides:
skippedEntity in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
name - the name of the skipped entity.
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception.


Copyright © 2007 Jason Hunter, Brett McLaughlin. All Rights Reserved.