[jdom-interest] Limitations wrt XPath

Patrick Dowler Patrick.Dowler at nrc.ca
Mon Oct 23 09:08:20 PDT 2000


On Sun, 22 Oct 2000, Jason Hunter wrote:
> <philosophy_question>
> For all these things we're looking at adding extra cost both in terms of
> memory use and CPU cycles to manage the interconnections.  Is that a
> price we're willing to pay?  If we do pay the price wouldn't that give
> someone a reason to modify their version to not pay that price?  Things
> like XPath need the parentage, but how often is it needed by a normal
> API user?  Is it enough for XPath to use a simple JDOM structure and add
> full parentage itself?  Or on the String vs. Text class issue, how much
> memory will be consumed by wrapping all strings with a Text class, and
> having that Text class remember its parent?  That's a lot of extra
> object creation overhead that's not needed by most people and could be
> added "above JDOM" by an XPath implementation.
> </philosophy_question>

Damn that final...

IMO if you want to do something, you have to "do it right". If we want to
support XPath, go all the way. That said, many uses of XML are very
simple. For example, I am using JDOM for two very simple cases.

The first is your standard garden variety data format, typically a large
table with 100s to 1000s of rows and 5-10 columns.  No need to describe it
further, and it's pretty boring anyway.

The second is to serialize and store Jini entry objects to a file or stream.
Entry objects have public members, so one can use reflection to find them and
make a text representation. You end up with one Document for an Entry, with an
Element for each public member, maybe children if the member is a Collection,
and I also use Attributes to store the class name, a la

<?xml version="1.0" encoding="UTF-8"?>
<entry class="ca.nrc.cadc.arch.jini.entry.ScriptTask">
   <id class="java.lang.Long">1</id>
   <runnable class="java.lang.Boolean">true</runnable>
   <root class="java.lang.Boolean">false</root>
   <shell class="java.lang.String">/bin/sh</shell>
   <script class="java.lang.String">doit</script>
   <args class="java.util.ArrayList">
      <arg0 class="java.lang.String">-v</arg0>
      <arg1 class="java.lang.String">-o</arg1>
      <arg2 class="java.lang.String">output.txt</arg2>
   </args>
</entry>

This is really vanilla XML with only Document, Element, and Attribute in
the code. It suffices for this task (no pun intended :-). This is "input" for a
distributed computing framework based on Jini and Javaspaces. 
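
A minimal sketch of the reflection approach described above, under some
loud assumptions: it is not the actual code behind the sample, it writes
the XML with a plain StringBuilder instead of JDOM so that it stays
self-contained, and the class names ScriptTaskSketch and EntryXmlSketch
and the "strip the trailing s and append an index" rule for naming
Collection children are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for a Jini entry: state lives in public fields.
class ScriptTaskSketch {
    public Long id = 1L;
    public Boolean runnable = Boolean.TRUE;
    public String shell = "/bin/sh";
    public List<String> args = new ArrayList<>(Arrays.asList("-v", "-o", "output.txt"));
}

class EntryXmlSketch {
    // One <entry> per object, one child element per public non-static field.
    // Field order follows Class.getFields(), which is not guaranteed.
    static String toXml(Object entry) {
        StringBuilder sb = new StringBuilder();
        sb.append("<entry class=\"").append(entry.getClass().getName()).append("\">\n");
        for (Field f : entry.getClass().getFields()) {
            if (Modifier.isStatic(f.getModifiers())) {
                continue;
            }
            Object value;
            try {
                value = f.get(entry);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read " + f.getName(), e);
            }
            if (value != null) {
                writeElement(sb, "   ", f.getName(), value);
            }
        }
        return sb.append("</entry>\n").toString();
    }

    // Writes one element; the runtime class name goes in the "class" attribute.
    private static void writeElement(StringBuilder sb, String indent, String name, Object value) {
        sb.append(indent).append('<').append(name)
          .append(" class=\"").append(value.getClass().getName()).append("\">");
        if (value instanceof Collection) {
            // Assumed naming rule: "args" -> arg0, arg1, ...
            String itemName = (name.length() > 1 && name.endsWith("s"))
                    ? name.substring(0, name.length() - 1) : name;
            int i = 0;
            sb.append('\n');
            for (Object item : (Collection<?>) value) {
                if (item != null) {
                    writeElement(sb, indent + "   ", itemName + i, item);
                }
                i++;
            }
            sb.append(indent);
        } else {
            sb.append(escape(String.valueOf(value)));
        }
        sb.append("</").append(name).append(">\n");
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.print(toXml(new ScriptTaskSketch()));
    }
}
```

Reading the document back would go the other way: load the class named
in each "class" attribute and set the matching public field.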

My point is that there are many, many projects that do not require a lot
of the more esoteric stuff.... 

***********

So, instead of always having to decide if such and such a feature
is worth the cost, IMO the way to go is to have base-level functionality
(the simple stuff that is easy to do) in the org.jdom package and then
add all the fancy stuff in a set of subclasses in org.jdom.ext (extended).

Base:
	org.jdom.Document
	org.jdom.Element
	org.jdom.Attribute
	org.jdom.DocType
Extended:
	org.jdom.ext.Document
	org.jdom.ext.Element
	org.jdom.ext.Attribute - probably adds nothing
	org.jdom.ext.DocType
	org.jdom.ext.Namespace
	org.jdom.ext.CDATA
	org.jdom.ext.Comment
	org.jdom.ext.ProcessingInstruction
	org.jdom.ext.Entity

Maybe Comment and/or ProcessingInstruction would go in base if they
don't add anything to the other classes. The point is that all the
members and methods needed in something like Element that are there
to support Namespace would be added to the org.jdom.ext.Element subclass.

If every class in org.jdom has a subclass in org.jdom.ext, then programmers
just have to change an import to start using the fancy stuff. 

** The builders need to know if they are to supply extended functionality or
base functionality, so you need to be able to set that property. One could
have a JDOMBuilder that converted base <-> extended as well.
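
To make the proposal concrete, here is a rough sketch of the base/extended
split, a builder-side switch, and a base -> extended conversion. Everything
in it is hypothetical: BaseElement stands in for org.jdom.Element and
ExtElement for the proposed org.jdom.ext.Element (one file cannot hold two
packages), ElementFactorySketch stands in for whatever the builders would
use, and none of these names or methods exist in JDOM.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the lean org.jdom.Element: name, text, and children only.
class BaseElement {
    private final String name;
    private String text = "";
    private final List<BaseElement> children = new ArrayList<>();

    BaseElement(String name) { this.name = name; }

    String getName() { return name; }
    String getText() { return text; }
    List<BaseElement> getChildren() { return children; }

    BaseElement setText(String text) { this.text = text; return this; }
    BaseElement addContent(BaseElement child) { children.add(child); return this; }
}

// Stand-in for org.jdom.ext.Element: pays for parentage and a namespace.
class ExtElement extends BaseElement {
    private ExtElement parent;
    private String namespaceUri = "";

    ExtElement(String name) { super(name); }

    ExtElement getParent() { return parent; }
    String getNamespaceUri() { return namespaceUri; }
    ExtElement setNamespaceUri(String uri) { this.namespaceUri = uri; return this; }

    @Override
    ExtElement addContent(BaseElement child) {
        // Only extended children carry a parent link.
        if (child instanceof ExtElement) {
            ((ExtElement) child).parent = this;
        }
        super.addContent(child);
        return this;
    }
}

// Hypothetical builder-side switch between base and extended nodes.
class ElementFactorySketch {
    private boolean extended;

    ElementFactorySketch setExtended(boolean extended) {
        this.extended = extended;
        return this;
    }

    BaseElement element(String name) {
        return extended ? new ExtElement(name) : new BaseElement(name);
    }

    // Base -> extended conversion: deep copy that fills in parent links.
    static ExtElement toExtended(BaseElement source) {
        ExtElement copy = new ExtElement(source.getName());
        copy.setText(source.getText());
        for (BaseElement child : source.getChildren()) {
            copy.addContent(toExtended(child));
        }
        return copy;
    }

    public static void main(String[] args) {
        BaseElement table = new BaseElement("table")
                .addContent(new BaseElement("row").setText("1"));
        ExtElement ext = toExtended(table);
        ExtElement row = (ExtElement) ext.getChildren().get(0);
        System.out.println(row.getParent().getName());
    }
}
```

Code written against BaseElement keeps working when handed an ExtElement,
which is what would let a large-table user stay on the cheap classes while
an XPath user opts in to the extra bookkeeping.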

This refactoring could be done with little or no effect on existing code
and on coding practices. Some imports would have to be changed for
people using Namespaces, XPath, etc. Those who want speed and
simplicity can get it too. It is important to note that some of the
simple uses (storing a large table as XML, for example) require lots
of Elements, and hence speed is important. This type of architecture
would make it easier for a sensible developer (like me :-) to use
XML instead of putting the whole table in a CDATA section and having to
have another parser for the CDATA section :-(

************

Please keep in mind that many people don't need Namespaces, XPath,
etc. but they do need speed and simplicity. 

Thoughts? 

--

Patrick Dowler
Canadian Astronomy Data Centre



