[jdom-interest] re: Namespace patch

Thu Aug 31 16:34:56 PDT 2000

Elliotte Rusty Harold wrote:

>I changed the semantics of the equals() method in Namespace to 
>compare against both prefix and URI. This was necessary for some of 
>the data structures I used. It could probably be changed back with a 
>little effort, but I'm not sure it should be.

No, it should not be changed back. The current implementation (in CVS)
violates the contract of hashCode. If two instances are compared with
equals and are found to be equal, their hashCodes _must_ be equal;
otherwise you break Hashtables and anything else that relies on the
contract of hashCode as described in the API docs. Here's the blurb
(java.lang.Object#hashCode):

<quote>
The general contract of hashCode is: 
Whenever it is invoked on the same object more than once during an
execution of a Java application, the hashCode method must consistently
return the same integer, provided no information used in equals
comparisons on the object is modified. This integer need not remain
consistent from one execution of an application to another execution of
the same application. 
If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce the
same integer result. 
</quote>

It's a lot more important that a.equals(b) => a.hashCode() ==
b.hashCode() holds than that hashCode() be 'unique', as the Namespace
API docs say. Violating the former breaks the functionality of
Hashtables/Maps/Sets; failing the latter just degrades performance. I'd
send the trivial patch, but my inclination is to make the fix in the
other direction (leave equality as it is and take the prefix out of the
hashCode). Both approaches seem somewhat inadequate to me, though, and
the improved XMLOutputter is a much more useful thing to bring into the
repository.
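
To make the contract concrete, here is a minimal sketch of a class whose
equality and hash are both derived from the URI alone. UriNamespace and
HashContractDemo are hypothetical names made up for illustration; this
is not the JDOM Namespace source, just the shape of the rule.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a namespace; NOT the actual JDOM class.
final class UriNamespace {
    private final String prefix;
    private final String uri;

    UriNamespace(String prefix, String uri) {
        this.prefix = Objects.requireNonNull(prefix);
        this.uri = Objects.requireNonNull(uri);
    }

    String getPrefix() { return prefix; }
    String getUri() { return uri; }

    // Equality looks at the URI only...
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof UriNamespace)) return false;
        return uri.equals(((UriNamespace) other).uri);
    }

    // ...so the hash must be derived from the URI only as well.
    // Mixing the prefix in here would let equal objects hash differently.
    @Override
    public int hashCode() {
        return uri.hashCode();
    }
}

class HashContractDemo {
    public static void main(String[] args) {
        UriNamespace a = new UriNamespace("xsl", "http://www.w3.org/1999/XSL/Transform");
        UriNamespace b = new UriNamespace("x", "http://www.w3.org/1999/XSL/Transform");

        Set<UriNamespace> seen = new HashSet<>();
        seen.add(a);

        System.out.println(a.equals(b));                  // same URI, different prefix
        System.out.println(a.hashCode() == b.hashCode()); // required by the contract
        System.out.println(seen.contains(b));             // lookup relies on both
    }
}
```

All three lines print true. If hashCode() also mixed in the prefix, the
last lookup would most likely fail even though a.equals(b) holds.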

I'm not very familiar with the gory details of the plethora of
XML-related specs so bear with me if the following questions and
comments seem too naive or outright wrong when all XML-related
complexities are taken into account. My comments are those of a Java API
user as opposed to a Mystic Master of XML.

Isn't the Namespace class trying to represent two distinct concepts
which are perhaps not reconcilable in a sensible way? From my point of
view (again, a Java API user interested in a simple XML API), there is
the concept of Namespace, in general, outside its use in a particular
document, which is uniquely defined by its URI. When used in a document,
a namespace can be bound to a prefix - why not have a NamespaceBinding
class which represents that? NamespaceBinding is uniquely identified by
a Namespace and a prefix, it only makes sense in the context of a
document (or portion thereof). But I do want to be able to refer to a
Namespace generally without having to worry what prefix a document I'm
reading might be using. In fact, would it be unreasonable to have a
'verification' mode for a document which simply disallows adding
NamespaceBindings with non-unique prefixes to any member Elements?  I
realize that Namespace prefixes can be unset, reassigned, etc in the
same document but aren't these rather degenerate cases? Are there common
uses for such machinations in the 80-90% case that JDOM targets? A
further extension of disallowing non-unique prefixes could be an
additional utility mode in which unique prefixes are picked for the user
as the document is being built. 
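
A rough sketch of that split, to show what I mean. Everything here is
hypothetical: this Namespace is a URI-only class written for the
example (not JDOM's), and NamespaceBinding and StrictBindings are
made-up names for the binding and for the 'verification' mode.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** A namespace in general: identified by its URI alone (hypothetical). */
final class Namespace {
    private final String uri;

    Namespace(String uri) { this.uri = Objects.requireNonNull(uri); }

    String getUri() { return uri; }

    @Override public boolean equals(Object o) {
        return o instanceof Namespace && uri.equals(((Namespace) o).uri);
    }

    @Override public int hashCode() { return uri.hashCode(); }
}

/** A namespace bound to a prefix; only meaningful inside a document. */
final class NamespaceBinding {
    private final String prefix;
    private final Namespace namespace;

    NamespaceBinding(String prefix, Namespace namespace) {
        this.prefix = Objects.requireNonNull(prefix);
        this.namespace = Objects.requireNonNull(namespace);
    }

    String getPrefix() { return prefix; }
    Namespace getNamespace() { return namespace; }

    // Identity is the (prefix, namespace) pair; hash uses the same fields.
    @Override public boolean equals(Object o) {
        if (!(o instanceof NamespaceBinding)) return false;
        NamespaceBinding other = (NamespaceBinding) o;
        return prefix.equals(other.prefix) && namespace.equals(other.namespace);
    }

    @Override public int hashCode() {
        return 31 * prefix.hashCode() + namespace.hashCode();
    }
}

/** Hypothetical 'verification' mode: a prefix maps to one namespace per document. */
final class StrictBindings {
    private final Map<String, NamespaceBinding> byPrefix = new HashMap<>();

    NamespaceBinding bind(String prefix, Namespace ns) {
        NamespaceBinding existing = byPrefix.get(prefix);
        if (existing == null) {
            NamespaceBinding created = new NamespaceBinding(prefix, ns);
            byPrefix.put(prefix, created);
            return created;
        }
        if (!existing.getNamespace().equals(ns)) {
            throw new IllegalArgumentException("prefix '" + prefix
                    + "' is already bound to " + existing.getNamespace().getUri());
        }
        return existing; // re-binding the same pair is harmless
    }
}
```

Binding "h" to one URI and then "h" to a different URI throws, while
code that only cares about the Namespace never has to look at a prefix.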

I'm having a similar 'addressability' problem with Element. Element
represents a particular instance of an Element in a given document. When
manipulating documents programmatically, though, it is useful to be able
to express the notion of an Element type (a unique, fully qualified
Element name), say, a combination of a Namespace (not NamespaceBinding!)
and an Element name. Even supposing that such a concept (ElementId, or
ElementType) does not belong in the JDOM API, the 'duality' of the
Namespace class makes it difficult to represent externally. I have an
ElementId class which is a composition of a Namespace and an Element
name but uses just the URI and the Element name for comparison and
hashing. I'm not terribly comfortable with it since it essentially
relies on knowledge of today's semantics of Namespace in order to work.
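
Roughly, such an ElementId looks like the sketch below. PrefixedNamespace
is a hypothetical stand-in for a namespace object that carries both a
prefix and a URI (it is not the JDOM class), and the handler names in
the demo are invented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a namespace that carries a prefix and a URI.
final class PrefixedNamespace {
    private final String prefix;
    private final String uri;

    PrefixedNamespace(String prefix, String uri) {
        this.prefix = Objects.requireNonNull(prefix);
        this.uri = Objects.requireNonNull(uri);
    }

    String getPrefix() { return prefix; }
    String getUri() { return uri; }
}

/** Identifies a kind of element by namespace URI and name; the prefix is ignored. */
final class ElementId {
    private final PrefixedNamespace namespace;
    private final String name;

    ElementId(PrefixedNamespace namespace, String name) {
        this.namespace = Objects.requireNonNull(namespace);
        this.name = Objects.requireNonNull(name);
    }

    // Compare on URI and element name only, never on the prefix.
    @Override public boolean equals(Object o) {
        if (!(o instanceof ElementId)) return false;
        ElementId other = (ElementId) o;
        return name.equals(other.name)
                && namespace.getUri().equals(other.namespace.getUri());
    }

    // Hash over exactly the fields that equals() compares.
    @Override public int hashCode() {
        return 31 * namespace.getUri().hashCode() + name.hashCode();
    }
}

class ElementIdDemo {
    public static void main(String[] args) {
        // Mapping configured externally, with whatever prefix the config used.
        Map<ElementId, String> handlers = new HashMap<>();
        handlers.put(new ElementId(
                new PrefixedNamespace("inv", "urn:example:invoice"), "total"),
                "TotalHandler");

        // The document being read happens to use a different prefix.
        ElementId fromDocument = new ElementId(
                new PrefixedNamespace("i", "urn:example:invoice"), "total");

        System.out.println(handlers.get(fromDocument)); // prints TotalHandler
    }
}
```

The lookup succeeds despite the differing prefixes, but only because
ElementId reaches into the namespace for its URI, which is exactly the
dependence on Namespace's current semantics that bothers me.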

At the risk of belabouring the point - why does one need an API
representation of these concepts? In my case, I'm doing some fairly
convoluted serialization and deserialization of XML fragments into
various Java objects - the XML typically contains more than one
namespace. When mapping Elements (or groups thereof) to objects,
especially when the mapping is not hard-coded (i.e. not a straight
XML<->generated bean mapping or even an XML Element <->
Class/primitive type mapping) but externally configurable, it's very
useful to be able to express the notion of a particular, unique kind of
Element or a particular, unique Namespace without worrying about the
details of their representation in a given Document. It's equally
useful to be able to refer to/filter/retrieve Elements by their
externally definable, unique, non-transient properties and let the JDOM
implementation take care of mapping these to the specifics of a
Document instance.

Comments, opinions, corrections? 

Thanks,

-pvg