[jdom-interest] Parsing a MODS-document with validation fails
Thomas Scheffler
thomas.scheffler at uni-jena.de
Wed Jul 20 06:23:49 PDT 2011
Hi,
when I parse a valid MODS document with XML Schema validation enabled,
JDOM changes attributes, because it handles the schema's default attribute
values incorrectly (it ignores their namespace).
Here is a short piece of code that demonstrates this:
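import java.net.URL;
import org.jdom.Document;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

// Validating builder; the last feature switches on XML Schema validation
SAXBuilder builder = new SAXBuilder(true);
builder.setFeature("http://xml.org/sax/features/namespaces", true);
builder.setFeature("http://xml.org/sax/features/namespace-prefixes", true);
builder.setFeature("http://apache.org/xml/features/validation/schema", true);
Document document = builder.build(new URL(
    "http://academiccommons.columbia.edu/download/fedora_content/show_pretty/ac:111060/CONTENT/ac111060_description.xml"));
// Print the parsed document so the attributes can be inspected
XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
xout.output(document, System.out);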
Here is a result fragment:
<name type="simple">
  <namePart type="family">Edwards</namePart>
  <namePart type="given">Stephen A.</namePart>
  <role>
    <roleTerm type="text">author</roleTerm>
  </role>
  <affiliation>Columbia University. Computer Science</affiliation>
</name>
If you look at the original document, you can see that @type of name is
"personal". The "simple" comes from the xlink XML Schema that is
included by the MODS schema. Therefore the result fragment should look
like this:
<name type="personal" xlink:type="simple">
  <namePart type="family">Edwards</namePart>
  <namePart type="given">Stephen A.</namePart>
  <role>
    <roleTerm type="text">author</roleTerm>
  </role>
  <affiliation>Columbia University. Computer Science</affiliation>
</name>
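To illustrate the kind of collision described above, here is a minimal
sketch that uses only the JDK's SAX helper classes. It is an assumption
about the mechanism, not JDOM's actual code: the class AttributeKeyDemo
and its helpers byLocalName, byExpandedName and sampleAttributes are
hypothetical names made up for this example. It shows that two
attributes named "type" overwrite each other when they are indexed by
local name alone, and stay distinct when the namespace URI is part of
the key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class, not part of JDOM.
public class AttributeKeyDemo {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // Builds the attribute list a namespace-aware parser could report
    // for <name type="personal" xlink:type="simple">.
    static Attributes sampleAttributes() {
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "type", "type", "CDATA", "personal");
        atts.addAttribute(XLINK_NS, "type", "xlink:type", "CDATA", "simple");
        return atts;
    }

    // Indexes attributes by local name only, i.e. ignores the namespace.
    // A later attribute with the same local name replaces the earlier one.
    static Map<String, String> byLocalName(Attributes atts) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < atts.getLength(); i++) {
            result.put(atts.getLocalName(i), atts.getValue(i));
        }
        return result;
    }

    // Indexes attributes by "{namespace-uri}localName", so attributes
    // from different namespaces never collide.
    static Map<String, String> byExpandedName(Attributes atts) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < atts.getLength(); i++) {
            String key = "{" + atts.getURI(i) + "}" + atts.getLocalName(i);
            result.put(key, atts.getValue(i));
        }
        return result;
    }

    public static void main(String[] args) {
        Attributes atts = sampleAttributes();
        System.out.println("by local name:    " + byLocalName(atts));
        System.out.println("by expanded name: " + byExpandedName(atts));
    }
}
```

With the sample attributes, the map keyed by local name keeps a single
"type" entry whose value is "simple", which matches the wrong fragment
shown earlier; the map keyed by expanded name keeps both values.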
If I use the DOM parser from Java, this is done correctly (though the
output is a bit ugly, as it does not use the namespace prefix already defined).
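For comparison, here is a minimal sketch of a namespace-aware lookup with
the JDK's DOM API. To stay self-contained it makes some simplifying
assumptions: it parses an inline string instead of the MODS document,
does no schema validation, leaves out the MODS namespace, and writes the
xlink:type attribute explicitly rather than letting a schema default
supply it. The class DomNamespaceLookup and its members are hypothetical
names for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; no schema validation is performed here.
public class DomNamespaceLookup {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // Simplified stand-in for the <name> element of the MODS document.
    static final String SAMPLE =
        "<name xmlns:xlink=\"http://www.w3.org/1999/xlink\""
        + " type=\"personal\" xlink:type=\"simple\"/>";

    // Parses the given XML with a namespace-aware DOM parser and
    // returns the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Element name = parseRoot(SAMPLE);
        // Attribute in no namespace
        System.out.println("type       = " + name.getAttributeNS(null, "type"));
        // Attribute in the xlink namespace
        System.out.println("xlink:type = " + name.getAttributeNS(XLINK_NS, "type"));
    }
}
```

Because getAttributeNS takes the namespace URI as well as the local name,
both "type" attributes can be read independently of each other.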
Could someone just fix this, please?
Regards,
Thomas Scheffler