[jdom-interest] Parsing a MODS-document with validation fails

Jason Hunter jhunter at servlets.com
Fri Jul 22 13:12:59 PDT 2011


Thanks, Thomas.  I'll integrate it.

Anyone else sitting on a bug that could get fixed in 1.1.2?

-jh-

On Jul 22, 2011, at 12:12 AM, Thomas Scheffler wrote:

> On 21.07.2011 10:14, Thomas Scheffler wrote:
>> On 21.07.2011 04:18, Bradley S. Huffman wrote:
>>> Which version of JDOM?  My first guess is it is something in XMLOutputter.
>> 
>> This is the latest release, 1.1.1. I do not suspect XMLOutputter here, as it usually has no problems with namespaces. This looks like a parsing issue.
> 
> It is a bug in the SAXHandler class: an attribute in a different namespace is detected only by its QName and not by its namespace URI. I have attached a patch that fixes this bug.
> It would be great if this could be integrated and released soon as version 1.1.2.
> 
> regards
> 
> Thomas Scheffler
> 
>>> 
>>> On Wed, Jul 20, 2011 at 8:23 AM, Thomas Scheffler
>>> <thomas.scheffler at uni-jena.de>  wrote:
>>>> Hi,
>>>> 
>>>> If I parse a valid MODS document with XML Schema validation, JDOM changes
>>>> attribute values because it handles schema default values incorrectly (it
>>>> ignores the namespace).
>>>> 
>>>> Here is a short code sample that demonstrates this:
>>>> 
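>>>> import java.net.URL;
>>>> import org.jdom.Document;
>>>> import org.jdom.input.SAXBuilder;
>>>> import org.jdom.output.Format;
>>>> import org.jdom.output.XMLOutputter;
>>>> 
>>>> // Validating builder with namespace support and XML Schema validation enabled
>>>> SAXBuilder builder = new SAXBuilder(true);
>>>> builder.setFeature("http://xml.org/sax/features/namespaces", true);
>>>> builder.setFeature("http://xml.org/sax/features/namespace-prefixes", true);
>>>> builder.setFeature("http://apache.org/xml/features/validation/schema", true);
>>>> 
>>>> // Parse the MODS document and pretty-print the resulting JDOM tree
>>>> Document document = builder.build(new URL("http://academiccommons.columbia.edu/download/fedora_content/show_pretty/ac:111060/CONTENT/ac111060_description.xml"));
>>>> XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
>>>> xout.output(document, System.out);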
>>>> 
>>>> Here is a result fragment:
>>>> 
>>>> <name type="simple">
>>>> <namePart type="family">Edwards</namePart>
>>>> <namePart type="given">Stephen A.</namePart>
>>>> <role>
>>>> <roleTerm type="text">author</roleTerm>
>>>> </role>
>>>> <affiliation>Columbia University. Computer Science</affiliation>
>>>> </name>
>>>> 
>>>> If you look at the original document, you can see that @type of name is
>>>> "personal". The "simple" comes from the xlink XML Schema that is included
>>>> by the MODS schema. Therefore the result fragment should look like this:
>>>> 
>>>> <name type="personal" xlink:type="simple">
>>>> <namePart type="family">Edwards</namePart>
>>>> <namePart type="given">Stephen A.</namePart>
>>>> <role>
>>>> <roleTerm type="text">author</roleTerm>
>>>> </role>
>>>> <affiliation>Columbia University. Computer Science</affiliation>
>>>> </name>
>>>> 
>>>> If I use DOM from Java, this is done correctly (although the output is a
>>>> bit ugly, as it does not reuse the namespace prefix already defined).
>>>> 
>>>> Could someone just fix this, please?
>> 
>> 
> 
> <jdom-namespace.patch>
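
The failure mode described in the thread above (an attribute identified by its name alone rather than by namespace URI plus local name) can be illustrated with plain SAX from the JDK. This is only a sketch under stated assumptions: it is not the attached patch and not the actual SAXHandler code, it does not use JDOM or schema validation, and the class name AttributeKeyDemo, the collect method, and the inline sample element are hypothetical names invented for the illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: shows why attributes must be keyed by namespace URI
// plus local name, not by name alone.
class AttributeKeyDemo {

    // Inline stand-in for the MODS <name> element from the thread (assumption:
    // both attributes are written out explicitly instead of one being a schema default).
    static final String SAMPLE =
        "<name xmlns:xlink=\"http://www.w3.org/1999/xlink\""
        + " type=\"personal\" xlink:type=\"simple\"/>";

    // Collects the attributes reported by SAX into a map. When useNamespaceUri
    // is false, the key is the local name only; otherwise it is "{uri}localName".
    static Map<String, String> collect(String xml, boolean useNamespaceUri) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        Map<String, String> seen = new LinkedHashMap<>();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    String key = useNamespaceUri
                        ? "{" + atts.getURI(i) + "}" + atts.getLocalName(i)
                        : atts.getLocalName(i);
                    // A later attribute with the same key silently replaces an earlier one.
                    seen.put(key, atts.getValue(i));
                }
            }
        });
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Keyed by local name: the two "type" attributes collapse into one entry.
        System.out.println(collect(SAMPLE, false));
        // Keyed by namespace URI and local name: both attributes survive.
        System.out.println(collect(SAMPLE, true));
    }
}
```

With name-only keys, one of the two values is lost, which matches the symptom reported in the thread, where only one type attribute remained on the name element. Keying by namespace URI keeps type="personal" and xlink:type="simple" apart.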



