[jdom-interest] First pass at Namespace revision[eg]

Thu Mar 29 10:53:01 PST 2001

> In my current opinion, the observed behavior is correct.  An element
> created without a namespace has no namespace.  It does not inherit a
> namespace, even if it has the chance to inherit a no-prefix "default"
> namespace.  The "correct output" example proposed above would mean that
> child1.getNamespaceURI() can return different values depending on its
> location in the document.  To have a child element's namespace be
> affected by its location in the document violates the namespace model
> we use.

Yes and no :-) Yes, I think that JDOM is doing what it intended to do,
though I didn't think so at the time I wrote this. No, it's not doing what
I think it possibly should do.

> 
> Per the namespaces spec, "If the URI reference in a default namespace
> declaration is empty, then unprefixed elements in the scope of the
> declaration are not considered to be in any namespace."  So the output
> above is consistent in indicating that child1 has no namespace.  It
> didn't have one on creation, and it doesn't get one just because it's
> added to another element.

I think the problem lies in the difference between the object model
and the textual representation.  What developers who have struggled with
this seem to expect, and I tend to agree with now, is that
child1.getNamespaceURI() would return "" or null even if the element was in
the default namespace of an ancestor (or more likely they don't care what it
would return).  In the textual representation, the default namespace is
declared in an ancestor element.  In the object model, we could *presume* a
child to be in the ancestor's default namespace by virtue of being a child,
instead of having the child explicitly return the namespace of the parent.
If the child is in a qualified namespace, it would specify that explicitly
just as it does now.

Positives - Default namespaces are easy to implement and elements can be
moved around much more easily when you are dealing with only default
namespaces.  Changing the default namespace of an entire tree is very simple
because you only have to change it in one place.  No change to the current
API where fully qualified namespaces are used.  An explicit empty namespace
is treated exactly like a qualified namespace, which it should be (through
an EMPTY_NAMESPACE or similar construct).  NO_NAMESPACE means *only* "no
explicit namespace specified".

Negatives - You cannot assume that getNamespace() will tell you exactly the
namespace the node is in.  You must know that if the namespace returned is
NO_NAMESPACE, the default namespace of an ancestor may apply.  You would
have to do something like the NamespaceStack approach used in the outputters
or just walk up the parent tree until you found a default namespace.  This
would not be the case if you knew your XML format, however, and would only
be an issue for documents you process generically (i.e. I know darn well
where my namespace declarations are).  You also need a way to specify an
explicit empty namespace.
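
The "walk up the parent tree" lookup could look roughly like the sketch
below.  SketchElement, its two constants, and getEffectiveNamespaceUri() are
hypothetical names made up for illustration; none of this is the JDOM API,
and the sketch models default namespaces only (no prefixes).

```java
import java.util.Objects;

/** Hypothetical stand-in for an element; NOT the JDOM Element class. */
final class SketchElement {
    /** Assumption: null plays the role of NO_NAMESPACE ("nothing specified"). */
    static final String NO_NAMESPACE = null;
    /** Assumption: "" plays the role of an explicit EMPTY_NAMESPACE (xmlns=""). */
    static final String EMPTY_NAMESPACE = "";

    private final String name;
    private final String namespaceUri;
    private SketchElement parent;

    SketchElement(String name, String namespaceUri) {
        this.name = Objects.requireNonNull(name);
        this.namespaceUri = namespaceUri;
    }

    /** Attaches the child to this element and returns this for chaining. */
    SketchElement addChild(SketchElement child) {
        child.parent = this;
        return this;
    }

    String getName() {
        return name;
    }

    /** Returns only what was specified on this element itself. */
    String getNamespaceUri() {
        return namespaceUri;
    }

    /** Walks up the parents until some element specifies a namespace. */
    String getEffectiveNamespaceUri() {
        for (SketchElement e = this; e != null; e = e.parent) {
            if (e.namespaceUri != null) {
                return e.namespaceUri;
            }
        }
        // Nothing specified anywhere up the tree: not in any namespace.
        return EMPTY_NAMESPACE;
    }
}
```

With a root created in "urn:foo" and a child created with NO_NAMESPACE,
child.getNamespaceUri() stays null while child.getEffectiveNamespaceUri()
resolves to "urn:foo"; a child created with EMPTY_NAMESPACE stops the walk
at itself.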

I *think* that retrieving and comparing namespaces is a sufficiently small
use case of JDOM that this approach, while more code if you are comparing
namespace names, is much more intuitive and convenient for other uses of
JDOM.  I have created a working version of XMLOutputter and DOMOutputter
that uses this approach and it's a very minor change.  SAXBuilder and
DOMBuilder would have to be adjusted as well, but I think it would be
primarily removal of code.  I can only speak for myself and the users who
have voiced complaints, however.  Those who like the current approach have
had no reason to speak up at all ;^)
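
To give a feel for why the outputter side stays small, here is a rough
sketch of an outputter under the proposed model.  It is not the modified
XMLOutputter mentioned above; SketchOutputter and its Node class are
hypothetical, it assumes null means "nothing specified" and "" means an
explicit empty namespace, and it handles default namespaces only (no
prefixes, attributes, text, or escaping).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mini outputter; NOT the JDOM XMLOutputter. */
final class SketchOutputter {

    /** Hypothetical element: a name, an optional declared namespace, children. */
    static final class Node {
        final String name;
        final String declaredNs; // null = nothing specified, "" = explicit empty
        final List<Node> children = new ArrayList<>();

        Node(String name, String declaredNs) {
            this.name = name;
            this.declaredNs = declaredNs;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Serializes the tree, starting with no default namespace in scope. */
    static String write(Node root) {
        StringBuilder out = new StringBuilder();
        write(root, "", out);
        return out.toString();
    }

    private static void write(Node node, String inScope, StringBuilder out) {
        // An element that specifies nothing simply takes the in-scope default.
        String effective = node.declaredNs == null ? inScope : node.declaredNs;
        out.append('<').append(node.name);
        // Emit xmlns only where the default namespace actually changes.
        if (!effective.equals(inScope)) {
            out.append(" xmlns=\"").append(effective).append('"');
        }
        if (node.children.isEmpty()) {
            out.append(" />");
            return;
        }
        out.append('>');
        for (Node child : node.children) {
            write(child, effective, out);
        }
        out.append("</").append(node.name).append('>');
    }
}
```

Passing the in-scope default down the recursion is a simplified stand-in
for a NamespaceStack; a real outputter would also need to track prefixed
declarations.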

> snip....
> I think you're right we need to (a) make sure this is the right approach
> and (b) document it better.
> 
great! 

> One alternate approach is to replace the NO_NAMESPACE special namespace
> with a simple "null" value to indicate an element not in any namespace.
> Then getNamespaceURI() would return null for elements not in any
> namespace.  getNamespace() would return null.  And so on.  The outputter
> would have to be changed to be smart enough to print the *same* output
> as today to write xmlns="" to remove the default namespace where
> appropriate.
> 
> Hmm... I kind of like NO_NAMESPACE still because it makes the
> XMLOutputter code cleaner.  It makes spec sense also because of the
> namespace spec quote above and a related quote, "The default namespace
> can be set to the empty string. This has the same effect, within the
> scope of the declaration, of there being no default namespace."

as illustrated by 
<element xmlns="urn:foo">
  <child />
</element>

vs 

<element xmlns="urn:foo">
  <child xmlns="" />
</element>
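
One way to see the difference between these two documents is to parse both
with the namespace-aware DOM parser that ships with the JDK and ask the
child for its namespace URI.  This is only a sketch: DefaultNamespaceDemo
and childNamespace() are made-up names, and whether "no namespace" is
reported as null or "" is left to the parser in use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical demo class using the JDK's built-in DOM parser. */
final class DefaultNamespaceDemo {

    /** Returns the namespace URI of the first child element of the root. */
    static String childNamespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            DocumentBuilder builder = factory.newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml)))
                                  .getDocumentElement();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    return n.getNamespaceURI();
                }
            }
            throw new IllegalArgumentException("root has no child element");
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            // Wrap the checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Child inherits the ancestor's default namespace.
        System.out.println(childNamespace(
            "<element xmlns=\"urn:foo\"><child /></element>"));
        // Child resets the default namespace with xmlns="".
        System.out.println(childNamespace(
            "<element xmlns=\"urn:foo\"><child xmlns=\"\" /></element>"));
    }
}
```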

I like NO_NAMESPACE as opposed to null also.  For similar reasons I like
an EMPTY_NAMESPACE.  DOM (Xerces) returns "" for the URI and null for the
prefix for both an empty namespace and no namespace (yuck).  Somehow it
knows the difference internally, though, because the TreeWalker can "do the
right thing".

If we accomplish nothing more than clarifying what to do based on the needs
of developers, I'll be happy.  As I noted in my very first post, I had found
it was possible to do what was needed with the current API, but at the time
I didn't even suspect that this was what the API considered the best thing
to do.