[jdom-interest] Best strategy for caching a JDOM Document instance and providing concurrent read access to it?

Guillaume Berche guillaume.berche at eloquant.com
Thu Jan 8 07:00:07 PST 2004


Hello,

I'm very pleased with the JDOM API because it's simple and intuitive. Thanks,
Jason, for this great library! I looked through the FAQ and this list's
archive but could not find a definitive answer to my question. Please point
me to it if I missed it.


I'm trying to implement the use case described in the FAQ:
"Single thread reads an XML stream into JDOM and makes it available to a run
time system for read only access"

More precisely, I am trying to build a cache of JDOM trees from which the
same JDOM Document instance may be accessed in read-only mode by concurrent
threads.

In this list Jason wrote the following in
http://www.servlets.com/archive/servlet/ReadMsg?msgId=157461&listName=jdom-interest:

"> JDOM is generally not thread safe, as I understand it.

True.  We follow the same model as ArrayList, which is not by default
thread safe."


However, ArrayList is actually safe for concurrent read access (Iterators
and Enumerations keep their own state), as its javadoc specifies:

http://java.sun.com/j2se/1.3/docs/api/java/util/ArrayList.html

"Note that this implementation is not synchronized. If multiple threads
access an ArrayList instance concurrently, and at least one of the threads
modifies the list structurally, it must be synchronized externally. (A
structural modification is any operation that adds or deletes one or more
elements, or explicitly resizes the backing array; merely setting the value
of an element is not a structural modification.) "
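
To make the ArrayList comparison concrete, here is a minimal sketch of
read-only sharing: one thread fully populates a list, then several threads
iterate over it at the same time without any synchronization. The names
ConcurrentReadDemo and sumConcurrently are made up for illustration, and the
code uses current Java syntax rather than J2SE 1.3. It assumes the list is
never modified once the readers start, and it says nothing about JDOM itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentReadDemo {

    // Sums the same list from several threads at once; no thread modifies it.
    static long[] sumConcurrently(List<Integer> shared, int readers) {
        ExecutorService pool = Executors.newFixedThreadPool(readers);
        try {
            Callable<Long> task = () -> {
                long sum = 0;
                // Each for-each loop creates its own Iterator with its own cursor.
                for (int value : shared) {
                    sum += value;
                }
                return sum;
            };
            // Submitting the task publishes the already-built list to the workers.
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < readers; i++) {
                results.add(pool.submit(task));
            }
            long[] sums = new long[readers];
            for (int i = 0; i < readers; i++) {
                sums[i] = results.get(i).get();
            }
            return sums;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("reader failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Populate the list completely before sharing it.
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            list.add(i);
        }
        for (long sum : sumConcurrently(list, 4)) {
            System.out.println(sum);
        }
    }
}
```

Every reader should see the same total because nothing changes the list
structurally while they run; adding or removing elements during the reads
would break that guarantee.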


So I wonder whether JDOM beta 8 or beta 9 would have problems with
concurrent read access. I haven't yet read the code in detail, but I
think I read somewhere that JDOM internally uses lazy initialization
when traversing the tree and that concurrent access to it might cause
problems. Is this [still] true?


If it turns out that it is unsafe to read/traverse the same JDOM document
concurrently, then I would like the group's opinion on the best way to
implement this cache without creating a contention point at the JDOM
document read access: my system is supposed to scale as more computing
resources are added (i.e. more CPUs in parallel on the same multiprocessor
machine).

I'm thinking of maintaining a pool of JDOM Document instances. Each
concurrent thread would take an instance from the pool before traversing it.
The multiple instances of the same JDOM tree could be created by:
1- reparsing the same source
2- deep cloning the JDOM document
3- serializing/deserializing the JDOM Document
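
The pool idea above could be sketched roughly as follows. This is only an
illustration under assumptions: CopyPool, withCopy and available are
hypothetical names, the pool is generic so the sketch runs without JDOM, and
the "copier" is whatever produces an independent instance (with JDOM,
presumably reparsing the source or something like (Document) master.clone()).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical fixed-size pool of independent copies of one cached object.
class CopyPool<T> {
    private final BlockingQueue<T> idle;

    // copier must return a fresh, independent copy on every call.
    CopyPool(int size, Supplier<? extends T> copier) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(copier.get());
        }
    }

    // Borrows a copy, runs the reader on it, and always returns the copy.
    <R> R withCopy(Function<? super T, R> reader) {
        T copy;
        try {
            copy = idle.take(); // blocks only when every copy is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a copy", e);
        }
        try {
            return reader.apply(copy);
        } finally {
            idle.add(copy); // cannot overflow: capacity equals the number of copies
        }
    }

    // Number of copies currently not borrowed.
    int available() {
        return idle.size();
    }

    public static void main(String[] args) {
        // Stand-in payload: a StringBuilder instead of a parsed Document.
        CopyPool<StringBuilder> pool = new CopyPool<>(4, () -> new StringBuilder("<root/>"));
        int length = pool.withCopy(doc -> doc.length());
        System.out.println("length read from a borrowed copy: " + length);
    }
}
```

In this sketch a thread touches the shared queue only while taking or
returning a copy, not during the traversal, so the read itself is not a
contention point. The price is one full copy in memory per pool slot.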


Side question: my document cache needs to be bounded in terms of memory
usage. I read some threads about this on the list, but does anyone have
figures on the number of bytes JDOM uses to store the parsed
representation of an XML stream of N bytes? The experiment I plan to run is
to instantiate M Document instances and look in a profiler at the consumed
space once the GC has been triggered. Has anybody run this test before? I did
read some data at http://www.sosnoski.com/opensrc/xmlbench/index.html but it
does not quite answer this question.
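
A rough in-process version of that experiment might look like the sketch
below. It is an approximation, not a substitute for a profiler: it reads the
JVM heap counters from Runtime instead, System.gc() is only a hint, and the
numbers vary between JVMs and runs. FootprintEstimate and bytesPerInstance
are hypothetical names, and the payload is a plain byte array standing in
for a parsed Document.

```java
import java.util.function.Supplier;

class FootprintEstimate {

    // Heap currently in use, read after asking the JVM to collect garbage.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 3; i++) {
            rt.gc(); // a request only; the JVM may ignore it
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    // Creates `count` instances, keeps them reachable, and averages the heap growth.
    static long bytesPerInstance(Supplier<?> factory, int count) {
        Object[] keep = new Object[count]; // allocated before the baseline reading
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keep[i] = factory.get();
        }
        long after = usedHeap();
        // Use the array after measuring so the instances stay reachable until here.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("factory returned null");
        }
        return (after - before) / count;
    }

    public static void main(String[] args) {
        // Stand-in payload of 64 KiB per instance; replace with a Document factory.
        long estimate = bytesPerInstance(() -> new byte[64 * 1024], 200);
        System.out.println("approx. bytes per instance: " + estimate);
    }
}
```

Dividing the per-instance estimate by the size in bytes of the source XML
would give the expansion ratio asked about above, at least for documents of
a similar shape.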


Thanks in advance for your help,

Guillaume.




