[jdom-interest] Is beta-5 much slower than beta-4? Tweak the tests please.

Rajendra Paul rajendra at loudeye.com
Fri Oct 20 10:12:27 PDT 2000


I sent the original email including the test code. I have an important
update about the test itself, which does not change the fact that beta-5 is
slower, but whoever does performance testing should know:

1. Isolate the parsing and building tests: The test code I sent has two
tests: 1) parsing xml documents, and 2) building xml documents. The speed
test for building XML documents is affected by the parse test. So if you use
the code, run only the build test or the parse test at a time (by commenting
out the other test, or by tweaking the code to take a command-line parameter
and run only one test at a time).

After this isolation of tests, the build-document test shows that
JDOM beta-4 takes 5.1 seconds to build the XML document, whereas
JDOM beta-5 takes 9.7 seconds to build the same XML document (~90%
slower).
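
One way to get the isolation described in point 1 is to pick the test from a command-line argument, so each JVM launch runs exactly one of them. The following is only a sketch in modern Java: the class name IsolatedBenchmark and its helper methods are made up for illustration, and the two workloads are empty placeholders where the real JDOM parse and build tests would go.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Runs exactly one benchmark per JVM launch so that one test cannot
 * influence the timing of the other.
 */
class IsolatedBenchmark {

    /** Placeholder workloads (assumption): swap in the real parse and build tests. */
    static Map<String, Runnable> tests() {
        Map<String, Runnable> tests = new LinkedHashMap<>();
        tests.put("parse", () -> { /* parse the XML documents here */ });
        tests.put("build", () -> { /* build the XML document here */ });
        return tests;
    }

    /** Looks up the single test named on the command line. */
    static Runnable selectTest(String name) {
        Runnable test = tests().get(name);
        if (test == null) {
            throw new IllegalArgumentException(
                "expected one of " + tests().keySet() + ", got: " + name);
        }
        return test;
    }

    /** Times one run of the given test and returns the elapsed milliseconds. */
    static long timeMillis(Runnable test) {
        long start = System.nanoTime();
        test.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java IsolatedBenchmark parse|build");
            System.exit(1);
        }
        long elapsed = timeMillis(selectTest(args[0]));
        System.out.println(args[0] + " test: " + elapsed + " ms");
    }
}
```

Running "java IsolatedBenchmark build" and "java IsolatedBenchmark parse" as two separate launches keeps whatever the parse test leaves behind (loaded classes, heap state) out of the build timing.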

2. Method selection: In JDOM, there is sometimes more than one method to
create or edit elements and attributes. I am positive that using different
methods would give different performance numbers. I might send an email when
I have more specifics on this.

3. A standard test please:  I have a good feel for what JDOM performance
tests should be so they work for us, but would some of you guys (I mean,
persons) who have spent a lot of time on JDOM mind sharing what good
performance tests should be?  I think I MIGHT be able to implement them and
give the code to you guys. Without a standard test (which could change with
time), it will be difficult to handle performance issues. 


Rajendra Paul
Grand Canyon Explorer
Loudeye Technologies
Seattle
www.loudeye.com



-----Original Message-----
From: jdom-interest-request at jdom.org
[mailto:jdom-interest-request at jdom.org]
Sent: Thursday, October 19, 2000 11:03 PM
To: jdom-interest at jdom.org
Subject: jdom-interest digest, Vol 1 #302 - 16 msgs



Today's Topics:

   1. is JDOM 100% pure Java ? (ivanooijdom at mymailbag.com)
   2. Re: Is beta-5 much slower than beta-4? (Roslan Amir)
   3. Looking for jdom beta4 binary (ivanooijdom at mymailbag.com)
   4. Re: Is beta-5 much slower than beta-4? (Jason Hunter)
   5. Re: Is beta-5 much slower than beta-4? (Roslan Amir)
   6. Re: Is beta-5 much slower than beta-4? (Jason Hunter)
   7. Re: is JDOM 100% pure Java ? (Jools Enticknap)
   8. How many Beta's will there be / how soon b4 v1.0? (adam flinton)
   9. Re: Is beta-5 much slower than beta-4? (Elliotte Rusty Harold)
  10. Using JDOM....to get Parser / SAX / Namespace neutrality /
       independence? (adam flinton)
  11. RE: PROPOSAL: Remove most constructors from XMLOutputter (Rosen, Alex)
  12. Hello (Amin)
  13. (no subject) (Amin)
  14. Re: RE: PROPOSAL: Remove most constructors from
       XMLOutputter (Jason Hunter)

--__--__--

Message: 1
From: ivanooijdom at mymailbag.com
Date: 19 Oct 2000 00:12:48 -0700
To: jdom-interest at jdom.org
Subject: [jdom-interest] is JDOM 100% pure Java ?

hi,

  Is JDOM 100% written in pure Java ?

thanks


   
-------------------------------------------------
Get personalized e-mail and a web address or your
own free e-mail at http://www.networksolutions.com.





--__--__--

Message: 2
Date: Thu, 19 Oct 2000 15:16:50 +0800
From: Roslan Amir <roslan at xybase.com>
Organization: XYBASE MSC Sdn Bhd
To: jdom-interest at jdom.org
Subject: Re: [jdom-interest] Is beta-5 much slower than beta-4?

Jason Hunter wrote:

>So what I was thinking perhaps we do this:
>
>1) Have a ValidChecker which checks the document is "valid" against a
>DTD.  This has always been planned, and I think it's a good idea.  How
>it's going to work is we validate a document in one fell swoop instead
>of as the doc is created.
>
>2) Have a WellFormedChecker that checks the document is "well formed"
>per the specification, following the same approach.  We check
>well-formedness in one fell swoop.  It's not as nice because you have to
>ask for error checking, but it's simply unacceptable to require all
>content is checked.  It's nice in theory but in the real world will
>involve people hacking up the API to remove that feature.
>
>I'll leave it open for discussion if we should do name checking on
>construction.
>
>What do people think?

Exactly what I have been looking for. Two methods that return a boolean
to check that the document is well-formed and to check that the document
is valid against a DTD. But instead of two methods, just one overloaded
method:

public boolean isValid()            // check that the document is "well formed"

public boolean isValid(String dtd)  // check that the document is "valid"
                                    // against the given DTD


Roslan Amir.

--__--__--

Message: 3
From: ivanooijdom at mymailbag.com
Date: 19 Oct 2000 00:20:23 -0700
To: jdom-interest at jdom.org
Subject: [jdom-interest] Looking for jdom beta4 binary

hi,

  I am looking for a JDOM beta4 binary. Any ideas?
I can only find JDOM beta5, but I know that
JDOM beta5 is slower than JDOM beta4.

thanks


   





--__--__--

Message: 4
Date: Thu, 19 Oct 2000 00:49:29 -0700
From: Jason Hunter <jhunter at collab.net>
To: Roslan Amir <roslan at xybase.com>
CC: jdom-interest at jdom.org
Subject: Re: [jdom-interest] Is beta-5 much slower than beta-4?

> public boolean isValid() // check that the document is "well formed"

Nope, the method to check if it's well formed shouldn't be called
isValid().  You could have isWellFormed() and isValid() on a class but
it's really two dramatically different things and imho should be in
different classes.  Maybe call the Validator something with "DTD" in the
title so you could write another class that checked against schemas.

-jh-
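
The split suggested above (one class for well-formedness, a separate one with "DTD" in its name for validity) can be sketched as follows. This is only an illustration under stated assumptions: it checks serialized XML text with the JDK's JAXP SAX parser rather than an in-memory JDOM document, and the names WellFormedChecker, DTDValidator and XmlParse are hypothetical, not part of JDOM.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Checks well-formedness only; malformed input causes a fatal parse error. */
class WellFormedChecker {
    static boolean isWellFormed(String xml) {
        return XmlParse.succeeds(xml, false);
    }
}

/** Checks validity against the DTD declared in the document's DOCTYPE. */
class DTDValidator {
    static boolean isValid(String xml) {
        return XmlParse.succeeds(xml, true);
    }
}

/** Shared helper (hypothetical): parse once, report whether any error occurred. */
class XmlParse {
    static boolean succeeds(String xml, boolean validating) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                // Validity violations arrive here; DefaultHandler ignores them
                // unless we rethrow.
                throw e;
            }
        };
        try {
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A document can pass WellFormedChecker and still fail DTDValidator, which is the reason for keeping the two questions in separate classes; a schema-based checker could be added later as a third class without touching either.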

--__--__--

Message: 5
Date: Thu, 19 Oct 2000 16:03:28 +0800
From: Roslan Amir <roslan at xybase.com>
Organization: XYBASE MSC Sdn Bhd
To: Jason Hunter <jhunter at collab.net>
CC: jdom-interest at jdom.org
Subject: Re: [jdom-interest] Is beta-5 much slower than beta-4?

Jason Hunter wrote:
> 
> > public boolean isValid() // check that the document is "well formed"
> 
> Nope, the method to check if it's well formed shouldn't be called
> isValid().  You could have isWellFormed() and isValid() on a class but
> it's really two dramatically different things and imho should be in
> different classes.  Maybe call the Validator something with "DTD" in the
> title so you could write another class that checked against schemas.
> 
> -jh-

Fair enough. Any idea when we can have the two methods? Or three if you
include one against schemas?

Amazing. Looking at the mail times, we're 15 hours apart.

Roslan.

--__--__--

Message: 6
Date: Thu, 19 Oct 2000 02:12:46 -0700
From: Jason Hunter <jhunter at collab.net>
To: Roslan Amir <roslan at xybase.com>
CC: jdom-interest at jdom.org
Subject: Re: [jdom-interest] Is beta-5 much slower than beta-4?

> Fair enough. Any idea when can we have the two methods? Or 
> three if you include one against schemas?

WellFormed-ness would be an easy factoring out.  Checking against a DTD
or schema, that's significant work.  I'd hope we could leverage some
outside work for that.

-jh-

--__--__--

Message: 7
Date: Thu, 19 Oct 2000 11:48:23 +0100
From: Jools Enticknap <jools at jools.org>
Organization: Mostly dis....
To: ivanooijdom at mymailbag.com
CC: jdom-interest at jdom.org
Subject: Re: [jdom-interest] is JDOM 100% pure Java ?

ivanooijdom at mymailbag.com wrote:
> 
> hi,
> 
>   Is JDOM 100% written in pure Java ?

Yes 100% pure.

--Jools

> 
> thanks

--__--__--

Message: 8
From: adam flinton <aflinton at armature.com>
To: "JDOM Mailing List (E-mail)" <jdom-interest at jdom.org>
Date: Thu, 19 Oct 2000 12:07:13 +0100
Subject: [jdom-interest] How many Beta's will there be / how soon b4 v1.0?

Dear All,

I have been following JDOM with a great deal of interest & am wondering (a)
how many more betas are planned, (b) when V1.0 is planned & (c) what is
needed before V1.0?

TIA

Adam Flinton

--__--__--

Message: 9
Date: Thu, 19 Oct 2000 09:07:18 -0400
To: "'jdom-interest at jdom.org'" <jdom-interest at jdom.org>
From: Elliotte Rusty Harold <elharo at metalab.unc.edu>
Subject: Re: [jdom-interest] Is beta-5 much slower than beta-4?

At 4:32 PM -0700 10/18/00, Jason Hunter wrote:

>This reminds me of something I've been thinking about wrt performance.
>Elliotte wrote code that checks all text content to ensure it contains
>only legal XML characters.  We haven't included it yet because it causes
>a significant performance hit (maybe 10%).  What I'd really like is a
>way to configure if you want this checking on or off, but unfortunately
>that's very difficult without adding factories for all object creation,
>and I don't want to go there.
>

I've thought about this a lot. As you know I'm a fanatic about not 
letting anything malformed into the document. However, I totally 
agree that there's no need to do the check when an object is read by 
a parser that is also doing these checks. It's just pointless 
duplication of effort.

The natural way I see around the problem is to add some hidden 
functions to the various Element, Attribute, ProcessingInstruction, 
and other classes that do not make any checks but that could only be 
accessed by builders. In C++ you'd do this with friend functions, but 
Java doesn't have friend. We could do it with package private methods 
but the builder classes are in a different package.

Possible solution 1:

Move the builder classes into org.jdom. In fact ditch the package 
structure completely.  Put everything in org.jdom. When I was working 
on namespaces the output problem was significantly harder for the 
same sorts of reasons; i.e. I couldn't create friendly functions in 
Element that XMLOutputter could access.

Possible solution 2:

Create special subclasses of Element, ProcessingInstruction etc. in 
the org.jdom.input package. Give these package private methods that 
don't do the checks (as well as the normal public ones that do). Then 
use these subclasses for input rather than the standard versions. 
Does anyone see any obvious flaws with this approach?

The public interface to and behavior of these subclasses would be 
identical to the interfaces we have now. No client code would need to 
be changed. Thus we don't even have to do this optimization now. It 
can be rolled in at any convenient time, and I suggest waiting until 
the public API is stable and real tests prove it's necessary.
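
The idea behind both solutions (a checked public method plus an unchecked one that only same-package builders can reach) can be sketched like this. It is a minimal illustration, not JDOM code: CheckedText is a made-up stand-in for a text-bearing node, and the character test simply follows the Char production of the XML 1.0 specification.

```java
/** Minimal stand-in for a text-bearing node; not the real JDOM class. */
class CheckedText {
    private String text;

    /** Public path: verifies every character before storing the text. */
    public void setText(String text) {
        int bad = firstIllegalChar(text);
        if (bad >= 0) {
            throw new IllegalArgumentException(
                "illegal XML character at index " + bad);
        }
        this.text = text;
    }

    /** Package-private path for builders whose parser already checked the text. */
    void setTextUnchecked(String text) {
        this.text = text;
    }

    public String getText() {
        return text;
    }

    /** Returns the index of the first character that is not an XML 1.0 Char, or -1. */
    static int firstIllegalChar(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    /** XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}
```

Code outside the package sees only setText and always pays for the check; a builder in the same package calls setTextUnchecked and skips it, which is as close as Java gets to the C++ friend function mentioned above.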
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|                  The XML Bible (IDG Books, 1999)                   |
|              http://metalab.unc.edu/xml/books/bible/               |
|   http://www.amazon.com/exec/obidos/ISBN=0764532367/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://metalab.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://metalab.unc.edu/xml/     |
+----------------------------------+---------------------------------+

--__--__--

Message: 10
From: adam flinton <aflinton at armature.com>
To: "JDOM Mailing List (E-mail)" <jdom-interest at jdom.org>
Date: Thu, 19 Oct 2000 14:39:11 +0100
Subject: [jdom-interest] Using JDOM....to get Parser / SAX / Namespace
 neutrality / independence?

Dear All,

I am helping to build a free app which transfers between XML & DBMS'es (both
2 Db & 2 XML) called XML-DBMS http://www.rpbourret.com/xmldbms/index.htm


We are currently working on Version 2, which amongst other things should use
schemas instead of DTDs, a GUI & properties files for choosing the JDBC
connection & the XML parser (@ the mo' the latter 2 have to be edited in the
source code & then recompiled).

I have been reading how JDOM is meant to be faster than std DOM & even
(gasp) easier to use... The easy-to-use part would be useful; however,
obviously the speed (esp when dealing with DBs holding potentially a lot
of data / producing large documents) is crucial.

Firstly anyone with an interest in both XML<>DB interaction & JDOM then if
you've got the time.....<G>.

However one of the things which we need to do (& which I am going to try &
do) is the abstraction of the XML parser. I have looked @ JAXP but it does
not seem to support DOM2 plus I have the feeling that it won't keep up with
the speed with which XML is moving. 

How does JDOM achieve parser independence?

@ the mo' the create document code needs to have one class for each parser
e.g. for Xerces:

<CODE>
import org.w3c.dom.Document;

public class DF_Xerces implements DocumentFactory
{

   public DF_Xerces()
   {
   }   


   public Document createDocument() throws DocumentFactoryException
   {
	  try
	  {
		 return new org.apache.xerces.dom.DocumentImpl();
	  }
	  catch (Exception e)
	  {
		 throw new DocumentFactoryException(e.getMessage());
	  }
   }   
}

</CODE>

vs Say for the IBM XML4J you have:

<CODE>
import org.w3c.dom.Document;

public class DF_IBM implements DocumentFactory
{

   public DF_IBM()
   {
   }


   public Document createDocument() throws DocumentFactoryException
   {
      try
      {
         // Only this line differs from DF_Xerces.
         return new com.ibm.xml.parser.TXDocument();
      }
      catch (Exception e)
      {
         throw new DocumentFactoryException(e.getMessage());
      }
   }
}

</CODE>

So how would you deal with a similar thing in JDOM?
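
Independently of JDOM, the one-class-per-parser pattern above can be collapsed into a single factory that reads the implementation class name from a properties file, which fits the properties-file plan mentioned earlier. The sketch below is an assumption-laden illustration: ConfiguredFactory and the property key are invented names, and it only works for implementation classes that have a public no-argument constructor.

```java
import java.util.Properties;

/** Creates an implementation object from a class name supplied at run time. */
class ConfiguredFactory {

    /**
     * Instantiates the class named by the given property and checks that it
     * implements the expected type. The class needs a public no-arg constructor.
     */
    static <T> T create(Properties props, String key, Class<T> type) {
        String className = props.getProperty(key);
        if (className == null) {
            throw new IllegalArgumentException("missing property: " + key);
        }
        try {
            Object instance =
                Class.forName(className).getDeclaredConstructor().newInstance();
            return type.cast(instance);
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException(
                "cannot create " + className + " as " + type.getName(), e);
        }
    }
}
```

With a (hypothetical) entry such as document.class=org.apache.xerces.dom.DocumentImpl in the properties file, a call like ConfiguredFactory.create(props, "document.class", org.w3c.dom.Document.class) would replace both DF_Xerces and DF_IBM, and switching parsers would no longer need a recompile.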

Equally there are similar probs with namespace e.g. DOM2:

<CODE>
package de.tudarmstadt.ito.domutils;

import org.w3c.dom.Node;

public class NQ_DOM2 extends NameQualifierImpl
{
   public String getLocalName(Node node)
   {
      String localName = null;

      int nodeType = node.getNodeType();

      if ((nodeType == Node.ELEMENT_NODE) || (nodeType == Node.ATTRIBUTE_NODE))
      {
         localName = node.getLocalName();
      }
      // Fall back to the full node name when there is no local name.
      if (localName == null) localName = node.getNodeName();
      return localName;
   }


   public String getNamespaceURI(Node node)
   {
      return node.getNamespaceURI();
   }
}

</CODE>

vs say IBM

<CODE>

package de.tudarmstadt.ito.domutils;

import org.w3c.dom.Node;
import com.ibm.xml.parser.Namespace;

public class NQ_IBM extends NameQualifierImpl
{
   public String getLocalName(Node node)
   {
	  int nodeType = node.getNodeType();

	  if ((nodeType == Node.ELEMENT_NODE) || (nodeType ==
Node.ATTRIBUTE_NODE))
	  {
		 return ((Namespace)node).getNSLocalName();
	  }
	  return node.getNodeName();
   }   

   public String getNamespaceURI(Node node)
   {
	  int nodeType = node.getNodeType();

	  if ((nodeType == Node.ELEMENT_NODE) || (nodeType ==
Node.ATTRIBUTE_NODE))
	  {
		 return ((Namespace)node).getNSName();
	  }
	  return null;
   }   
}

</CODE>

     

So 
(a) Can JDOM Help vs say DOM & Sax to get away from differences in
implementations of Sax, DOM & Namespaces?
(b) Will JDOM speed stuff up compared to DOM
(c) Anyone knowledgeable in JDOM who wants to help?

Adam

--__--__--

Message: 11
From: "Rosen, Alex" <arosen at silverstream.com>
To: jdom-interest at jdom.org
Date: Thu, 19 Oct 2000 11:07:23 -0400
Subject: [jdom-interest] RE: PROPOSAL: Remove most constructors from
XMLOutputter

> > It might be completely out-of-line, but would a Configuration
> > object be useful in this case?  Ever-expanding arguments
> > to a Constructor definitely isn't a Good Thing in my mind.

In addition to Alex's comments about config objects, I just want to put in a
vote for a very simple way to emit a human-readable (i.e. "pretty") version
of a document. I don't have a big problem with having one convenience
constructor for this common case. But maybe an even better solution would be
static methods in XMLOutputter:

// Return a pretty-printed String version of the document.
public static String toString(Document d);

// Write a pretty-printed String version of the document to System.out.
public static void dump(Document d);

It's nice to make these common things really easy.
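
To show the shape of such convenience methods without relying on JDOM internals, here is a rough sketch that uses the JDK's own DOM and transformer classes as a stand-in. PrettyPrinter and sample() are made-up names, org.w3c.dom.Document takes the place of the JDOM Document, and the exact indentation produced depends on the transformer implementation.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Static one-liners for the common "just show me the document" case. */
class PrettyPrinter {

    /** Returns an indented String version of the document. */
    static String toString(Document d) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(d), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes the indented version of the document to System.out. */
    static void dump(Document d) {
        System.out.println(toString(d));
    }

    /** Builds a tiny document (root element with one child) for the demo. */
    static Document sample() {
        try {
            Document d = DocumentBuilderFactory.newInstance()
                                               .newDocumentBuilder()
                                               .newDocument();
            Element root = d.createElement("root");
            root.appendChild(d.createElement("child"));
            d.appendChild(root);
            return d;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        dump(sample());
    }
}
```

The caller never sees a constructor or a configuration object; anything fancier than the default pretty output would still go through the regular outputter class.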

--Alex

P.S. Jason, I like the WellFormedChecker and ValidChecker ideas. Often if
you're constructing a document programmatically, you can know that the data
you're adding is well-formed and/or valid, without the extra checking.

--__--__--

Message: 12
From: "Amin" <amin at imkenberg.net>
To: <jdom-interest at jdom.org>
Date: Thu, 19 Oct 2000 07:14:52 +0200
Subject: [jdom-interest] Hello



--__--__--

Message: 13
From: "Amin" <amin at imkenberg.net>
To: <jdom-interest at jdom.org>
Date: Thu, 19 Oct 2000 07:20:04 +0200
Subject: [jdom-interest] (no subject)

Is it possible to create a Document object within an <xsp:logic> tag?
I tried to do it, but Cocoon raised many exceptions when
using JDOM classes within the <xsp:logic> tag.
How can I solve this problem?

M. Amin




--__--__--

Message: 14
Date: Thu, 19 Oct 2000 10:57:46 -0700
From: Jason Hunter <jhunter at collab.net>
To: "Rosen, Alex" <arosen at silverstream.com>
CC: jdom-interest at jdom.org
Subject: Re: [jdom-interest] RE: PROPOSAL: Remove most constructors from 
 XMLOutputter

> P.S. Jason, I like the WellFormedChecker and ValidChecker ideas. 
> Often if you're constructing a document programmatically, you 
> can know that the data you're adding is
> well-formed and/or valid, without the extra checking.

Yep.  The problem I see with the notion of just builders being able to
skip the checking is that, as you say, sometimes it's data from other
sources.  

-jh-


--__--__--


End of jdom-interest Digest


