<!doctype html public "-//w3c//dtd html 4.0 transitional//en">

<html>

James Strachan wrote:

<blockquote TYPE=CITE>From: "Randall J. Parr" &lt;RParr@TemporalArts.COM>

<br>> James Strachan wrote:

<br>>

<br>[snip]

<p>> I, in general, agree this would be very useful. Saxon has a mechanism in its Java

<br>> API that is somewhat like this and I like it (I just can't get it to work very

<br>> well).

<br>>

<br>> I would like to point out though that, for my use, when doing event processing I

<br>> almost always need to handle the startElement and endElement in the order they are

<br>> encountered.

<br>>

<br>> For example when I encounter &lt;TABLE name="customer"> I have to open/verify a

<br>> connection to the database and initialize my metadata. When I encounter &lt;/TABLE> I

<br>> have to commit/rollback the transaction and close/release the database connection.

<br>>

<br>> Even more simply, when I encounter &lt;TABLE ... > I want to open a new file and

<br>> output &lt;TABLE ... >, then I encounter a lot of &lt;ROW> ... &lt;/ROW> elements (each of

<br>> which I massage, output, and then discard), finally when I encounter &lt;/TABLE> I want

<br>> to output the &lt;/TABLE>, close that file, and be done.

<br>>

<br>> Your interface forces me to treat each as a start OR an end element. Maybe

<br>> ElementHandler should be more like:

<br>>

<br>> package org.jdom;

<br>> public interface ElementHandler {

<br>>&nbsp;&nbsp;&nbsp;&nbsp; public void startElement( ... );

<br>>&nbsp;&nbsp;&nbsp;&nbsp; public void endElement( ... );

<br>> }

<p>Thanks, Randall.

<p>Yes, I agree that it's nice to know sometimes that the start or end has

occurred. However, I suppose these are 'sub element' events - the kind of

things that SAX has been designed for.

<p>I'm tempted to still keep the simple interface

<p>&nbsp;public interface ElementHandler {

<br>&nbsp;&nbsp;&nbsp;&nbsp; public void handle( Element element );

<br>&nbsp;}

for processing 'whole' elements and element trees.

An additional sub-element handler could be useful.

<p>&nbsp;public interface SubElementHandler {

<br>&nbsp;&nbsp;&nbsp;&nbsp; public void onStart( Element element );

<br>&nbsp;&nbsp;&nbsp;&nbsp; public void onEnd( Element element );

<br>&nbsp;}

<p>But this seems to be too close to the problem space SAX is trying to

<br>tackle - which makes me think that in those sub-element conditions

we should

be using SAX directly rather than introducing another new interface.

<p>Another way of looking at the problem could be that we just implement

these

start &amp; end element semantics using ElementHandler.

<p>To handle the start &lt;TABLE> example you gave, you could just use lazy

construction in your RowElementHandler, i.e. if the first row element is

being processed, open a connection / file / whatever.

<p>To handle the end &lt;TABLE> example, we could have some way of specifying that

a TableElementHandler does not require any child elements. After all, implicit

in the ElementHandler semantics is that the element has ended before the

handler is called. So we just want to filter out the &lt;ROW> elements from our

TableElementHandler.

<p>So we may use (say) XPath to find the root of the sub-document tree to

build, and we may use another XPath expression to determine how deep the tree

should be. In the 'end element' use case we will probably want an empty

tree. The default case is probably 'from the sub-document tree root downwards'.

<p>What are other people's thoughts on this? Should we go with sub-element events or

keep them in SAX?

</blockquote>

Whoa there. There is a big difference. In SAX&nbsp;processing, the ContentHandler

defines ONE&nbsp;startElement and ONE&nbsp;endElement that handle the start/end

events for ALL&nbsp;elements encountered, and you must keep track of which

element you're in, etc. Alternatively, you can use something like Robert

Hustead's SaxMapper package (described in "Mapping XML to Java, Part 1/2",

www.javaworld.com).

<p>What I see you proposing (&agrave; la Saxon) is a BIG&nbsp;improvement because

you define a handler for a given element (or path/element expression) somewhat

equivalent to an XSLT&nbsp;template. Even if that handler has a startElement()

and endElement() it is a big step away from SAX&nbsp;level programming

because you've more cleanly and clearly defined the start/stop methods

for JUST&nbsp;THAT&nbsp;ELEMENT. You've eliminated most of the heinous

SAX&nbsp;level coding required to keep track of the parse state. You no

longer have a single startElement(), endElement() with a big if/then/else

(or you no longer have whatever you've implemented to track parse state,

etc.)
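<p>A minimal sketch of the per-element idea, assuming a hypothetical StartEndHandler interface and SingleElementDispatcher class (neither is part of SAX, JDOM, or Saxon). The dispatcher is an ordinary SAX DefaultHandler that forwards events only for the one element name it was registered for, so the TABLE logic needs no parse-state tracking of its own. The demo replays the events by hand instead of running a parser:

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical per-element callback pair; an assumption for illustration only.
interface StartEndHandler {
    void startElement(String name, Attributes atts);
    void endElement(String name);
}

// Forwards SAX start/end events to one handler, and only for one element name.
class SingleElementDispatcher extends DefaultHandler {
    private final String target;
    private final StartEndHandler handler;

    SingleElementDispatcher(String target, StartEndHandler handler) {
        this.target = target;
        this.handler = handler;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals(target)) {
            handler.startElement(qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(target)) {
            handler.endElement(qName);
        }
    }
}

// Handler that only ever sees TABLE events; it records what it would do.
class TableEvents implements StartEndHandler {
    final StringBuilder log = new StringBuilder();

    public void startElement(String name, Attributes atts) {
        log.append("open ").append(atts.getValue("name")).append(';');
    }

    public void endElement(String name) {
        log.append("close;");
    }
}

class DispatchDemo {
    static String run() {
        TableEvents table = new TableEvents();
        SingleElementDispatcher sax = new SingleElementDispatcher("TABLE", table);
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "name", "name", "CDATA", "customer");
        // Replay the events a SAX parser would fire for a TABLE holding one ROW.
        sax.startElement("", "TABLE", "TABLE", atts);
        sax.startElement("", "ROW", "ROW", new AttributesImpl());
        sax.endElement("", "ROW", "ROW");
        sax.endElement("", "TABLE", "TABLE");
        return table.log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

<p>In real use the dispatcher would be handed to a SAX parser as its handler. Matching on the qName alone is a simplification that ignores namespaces and path expressions.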

<p>In the &lt;TABLE> example, I personally prefer the more direct and

intuitive approach where what should happen at the &lt;TABLE> event is

handled by a &lt;TABLE> handler's startElement(). Having the RowElementHandler

open the database if it is not already open, etc., ties the behaviour of the &lt;TABLE>

event to a particular child element event.
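<p>To make the coupling concrete, here is a rough sketch contrasting the two approaches. All names (Resource, LazyRowHandler, TableStartEndHandler) are hypothetical stand-ins, not proposed JDOM classes, and Resource merely imitates a database connection or file. One consequence the sketch shows: with lazy construction, a TABLE that contains no ROW elements never opens anything.

```java
// Stand-in for a database connection or output file; purely illustrative.
class Resource {
    boolean isOpen;
    int timesOpened;

    void open() {
        isOpen = true;
        timesOpened++;
    }

    void close() {
        isOpen = false;
    }
}

// Lazy construction: TABLE start behaviour is hidden inside ROW handling.
class LazyRowHandler {
    private final Resource resource;

    LazyRowHandler(Resource resource) {
        this.resource = resource;
    }

    void handleRow(String row) {
        if (!resource.isOpen) {
            resource.open(); // only happens if at least one ROW arrives
        }
        // ... massage and output the row here ...
    }
}

// Direct approach: the TABLE handler owns both open and close.
class TableStartEndHandler {
    private final Resource resource;

    TableStartEndHandler(Resource resource) {
        this.resource = resource;
    }

    void startElement() {
        resource.open();
    }

    void endElement() {
        resource.close();
    }
}

class LazyVsDirectDemo {
    // How many times the resource is opened when rows drive the opening.
    static int opensWithLazy(String[] rows) {
        Resource resource = new Resource();
        LazyRowHandler handler = new LazyRowHandler(resource);
        for (String row : rows) {
            handler.handleRow(row);
        }
        return resource.timesOpened;
    }

    // How many times the resource is opened when TABLE start/end drive it.
    static int opensWithDirect() {
        Resource resource = new Resource();
        TableStartEndHandler table = new TableStartEndHandler(resource);
        table.startElement();
        // rows would be processed here, with the resource already open
        table.endElement();
        return resource.timesOpened;
    }

    public static void main(String[] args) {
        System.out.println("lazy, empty table: " + opensWithLazy(new String[0]));
        System.out.println("direct, empty table: " + opensWithDirect());
    }
}
```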

<p>I would also like to point out that attaching processing to a high-level

tag like &lt;TABLE> does NOT generally require storing the entire tree

in memory. That would only be the case if the attached processing in some

way dictated traversing the tree or searching for some child element. In

my database conversion examples that is never the case. Even in an example

such as a traditional "report" with a total, &lt;TABLE> would initialize

a total and output the header, &lt;/ROW> would increment the total and

output the row, and &lt;/TABLE> would output a footer using the total

value. The ROWS do NOT have to stay in memory after &lt;/ROW>.
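<p>The report case might look something like the sketch below. ReportHandler and its method names are assumptions made up for illustration; the events are fed in by hand rather than by a parser. The only state held between events is the running total, and each row is written out and forgotten as soon as its end event arrives.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical streaming report handler: keeps a running total, never the rows.
class ReportHandler {
    private final PrintWriter out;
    private long total;

    ReportHandler(PrintWriter out) {
        this.out = out;
    }

    // TABLE start: reset the total and write the header.
    void startTable(String name) {
        total = 0;
        out.print("REPORT " + name + "\n");
    }

    // ROW end: add to the total, write the row, then let it go.
    void endRow(long amount) {
        total += amount;
        out.print("row " + amount + "\n");
    }

    // TABLE end: write the footer using the accumulated total.
    void endTable() {
        out.print("TOTAL " + total + "\n");
        out.flush();
    }
}

class ReportDemo {
    static String run(long[] amounts) {
        StringWriter buffer = new StringWriter();
        ReportHandler handler = new ReportHandler(new PrintWriter(buffer));
        handler.startTable("customer");
        for (long amount : amounts) {
            handler.endRow(amount);
        }
        handler.endTable();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(run(new long[] {10, 20, 12}));
    }
}
```

<p>The StringWriter is only there so the demo can show the output; writing to a file or socket instead would keep memory use constant no matter how many rows pass through.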

<p>R.Parr

<br>TemporalArts

</html>