[jdom-interest] JDOM XML Outputter vs. JDOM serialization
Dennis Sosnoski
dms at sosnoski.com
Thu Nov 14 17:45:13 PST 2002
Hi Martin,
I've had other projects going on and haven't taken XMLS further since
the release last year. The JDOM interface has changed since then so it's
no surprise that you'd need to fix the code to work with the current
beta or CVS JDOM.
The intent of XMLS was always to provide a faster way than serialization
(either Java serialization or text serialization) for communicating XML
document structures between applications - speed rather than size, in
other words. It does a good job at that, but this doesn't necessarily
help a lot if the document model is costly to build. If you're really
interested in compression you'll be able to do better with XML text and
standard compression programs such as bzip2.
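To make the "XML text plus a standard compressor" route concrete, here is a minimal sketch using only the JDK. It uses gzip (java.util.zip) as a stand-in because the JDK has no bzip2 codec. The class name GzipXmlSketch and the sample document are made up for illustration and are not part of XMLS or JDOM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: compress and restore XML text with gzip.
final class GzipXmlSketch {

    /** Encode the XML text as UTF-8 and gzip it. */
    static byte[] gzip(String xml) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Reverse of gzip(): decompress and decode as UTF-8. */
    static String gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up, repetitive sample document; real documents will compress differently.
        StringBuilder sb = new StringBuilder("<orders>");
        for (int i = 0; i < 1000; i++) {
            sb.append("<order id=\"").append(i).append("\"><status>open</status></order>");
        }
        sb.append("</orders>");
        String xml = sb.toString();

        byte[] packed = gzip(xml);
        System.out.println(xml.length() + " chars -> " + packed.length + " bytes gzipped");
    }
}
```

The compressed bytes still have to be parsed into a document model on the receiving side, so this addresses size only, not build cost.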
FWIW your timings look very off - in my experience for normal documents
the time to serialize them out as text is going to be perhaps half as
long as the time to parse and build the document model.
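One way to check the parse-versus-write ratio on your own documents is a small timing loop. The sketch below is an assumption-laden stand-in: it uses the JDK's built-in JAXP DOM parser and an identity Transformer rather than JDOM's SAXBuilder and XMLOutputter, so the absolute numbers will differ from JDOM's. The class name ParseVsWriteTiming, the sample document, and the round count are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical harness: time "parse and build" against "write as text".
final class ParseVsWriteTiming {
    private final DocumentBuilder builder;
    private final Transformer identity;

    ParseVsWriteTiming() {
        try {
            // Created once so factory lookup is not counted in the timed loops.
            builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            identity = TransformerFactory.newInstance().newTransformer();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Parse XML bytes and build the in-memory document. */
    Document parse(byte[] xml) {
        try {
            return builder.parse(new ByteArrayInputStream(xml));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Serialize the document back out as text. */
    String write(Document doc) {
        try {
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document; substitute a real one for meaningful numbers.
        StringBuilder sb = new StringBuilder("<orders>");
        for (int i = 0; i < 1000; i++) {
            sb.append("<order id=\"").append(i).append("\"><status>open</status></order>");
        }
        sb.append("</orders>");
        byte[] xml = sb.toString().getBytes(StandardCharsets.UTF_8);

        ParseVsWriteTiming t = new ParseVsWriteTiming();
        Document doc = t.parse(xml);
        int rounds = 200;

        // Warm-up pass so JIT compilation does not dominate the measurement.
        for (int i = 0; i < rounds; i++) {
            t.parse(xml);
            t.write(doc);
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            t.parse(xml);
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            t.write(doc);
        }
        long t2 = System.nanoTime();

        System.out.printf("parse+build: %.3f ms/doc%n", (t1 - t0) / 1e6 / rounds);
        System.out.printf("write text:  %.3f ms/doc%n", (t2 - t1) / 1e6 / rounds);
    }
}
```

Without a warm-up pass, the first iterations include class loading and JIT compilation, which can distort a per-document figure considerably.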
Many (if not most) parsers will automatically use the same String
objects for element and attribute names. The reason document models are
so bulky is because they use objects to represent every component of the
document, which adds up to a lot of components. There's some variation
between models, but generally you're talking about memory usage at least
5X the document size in bytes. The current JDOM beta code has known
severe problems building larger documents, so if you're working
with that you might want to try the CVS version. Alternatively, data
binding approaches can give you a much more compact representation than
a document model. I've got an article coming out on IBM dW soon which
will give actual figures on this comparison.
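The point about shared name Strings can be illustrated with a small pool that hands back one canonical String per distinct name. This is only a sketch of the general idea, not the mechanism any particular parser or JDOM uses. The class NamePoolSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name pool: one shared String instance per distinct name.
final class NamePoolSketch {
    private final Map<String, String> pool = new HashMap<>();

    /** Return the canonical instance for a name, storing it on first sight. */
    String canonical(String name) {
        String existing = pool.putIfAbsent(name, name);
        return existing != null ? existing : name;
    }

    /** Number of distinct names seen so far. */
    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        NamePoolSketch names = new NamePoolSketch();

        // new String(...) forces two distinct objects with equal content,
        // as if each name had been read separately from the input.
        String a = names.canonical(new String("item"));
        String b = names.canonical(new String("item"));

        System.out.println(a == b);        // prints "true": both refer to one object
        System.out.println(names.size());  // prints "1"
    }
}
```

Sharing names this way removes only the duplicate name Strings. The per-element and per-attribute objects remain, which is why the document model stays bulky even when names are shared.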
- Dennis
Martin Schulz wrote:
>Regarding XMLS (www.sosnoski.com), it appears to not be actively
>maintained. I had to apply a simple fix before I
>got it to work. The benchmarks are good, but not that good. And
>compression is best implemented on top of the XML document.
>But the main concern is exactly what I wanted to avoid by producing the
>document: Not tying myself to a specific platform or implementation.
>
>Subsequently, I collected some timing information and it appears that on
>a 1.6G Athlon:
>- producing the XML is cheap ( 2ms per document of 49k in a loop)
>- producing the gzip compressed XML is cheaper ( 0.8 ms )
>- gunzip cheap, similar to gzip
>- parsing and creating the JDOM Document: expensive (120ms)
>
>Using the Piccolo SAXParser, I could shave off 20% of the parsing cost,
>but that is still a lot!
>I am also interested in a solution that saves memory by
>reusing Element and Attribute name Strings.
>
>Also, I was having weird problems setting Piccolo as the default parser,
>SAXBuilder was completely failing, unless I gave it the class name as a
>String.
>
>Does anybody know whether that is being looked at?
>
> Martin
>
>
>-----Original Message-----
>From: jdom-interest-admin at jdom.org [mailto:jdom-interest-admin at jdom.org]
>On Behalf Of Bradley S. Huffman
>Sent: November 10, 2002 9:23 PM
>To: Martin Schulz
>Cc: jdom-interest at jdom.org
>Subject: Re: [jdom-interest] JDOM XML Outputter vs. JDOM serialization
>
>
>Martin Schulz writes:
>
>>Hello,
>>
>>I am looking at comparing JDOM to/from XML byte stream versus JDOM
>>(de)serialization.
>>
>>For general architectural reasons I'd be inclined to pick the XML byte
>>stream solution, but I also wouldn't be surprised if the XMLOutputter
>>beat the serialized version, not to speak of gzipped XML byte streams.
>>
>>Anybody care to share their experience?
>
>Using Dennis Sosnoski's XMLBench code (www.sosnoski.com) for timing, it
>seems faster to use XMLOutputter/SAXBuilder for serialization than
>Java's serialization mechanism, which does a lot of housekeeping to
>prevent cycles (something an XML document never has).
>
>Brad
>_______________________________________________
>To control your jdom-interest membership:
>http://lists.denveronline.net/mailman/options/jdom-interest/youraddr@yourhost.com