[jdom-interest] validation and ANY

Travers Waker traversw at innoforge.com
Fri Oct 26 01:42:20 PDT 2001


> Part of my DTD specifies the following element:
>
> <!ELEMENT OLifEExtension ANY>
>
> It was my impression that ANY allows for any element that I want.  However,
> when I try to validate my document, I receive the following exception:
>
> Validating exception: org.jdom.JDOMException: Error on line 83: Element
> type "BenefitPercentage" must be declared.
>
> Was my impression wrong?  Is there any way to get around this?

Yes, your impression was wrong.  ANY allows character data and any element that is declared in your DTD, but not arbitrary undeclared XML elements.
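
To see this in isolation, here is a minimal sketch that validates two small documents against an internal DTD subset. It uses the JAXP parser bundled with the JDK rather than JDOM, so that it runs without extra jars; the class name AnyContentDemo and the element name "Declared" are made up for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class AnyContentDemo {
    // Internal DTD subset: the root has ANY content, and "Declared" is the
    // only other element that is declared. (Hypothetical names.)
    static final String DOCTYPE =
        "<!DOCTYPE OLifEExtension ["
        + "<!ELEMENT OLifEExtension ANY>"
        + "<!ELEMENT Declared (#PCDATA)>"
        + "]>";

    /** Parses xml with DTD validation on and returns the validity error messages. */
    static List<String> validationErrors(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                // Validity errors are recoverable, so collect them and carry on.
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                // Well-formedness errors are fatal.
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("document could not be parsed", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // A declared child inside the ANY element: no validity errors expected.
        System.out.println(validationErrors(
            DOCTYPE + "<OLifEExtension><Declared>x</Declared></OLifEExtension>"));
        // An undeclared child inside the ANY element: at least one validity error expected.
        System.out.println(validationErrors(
            DOCTYPE + "<OLifEExtension><BenefitPercentage>50</BenefitPercentage></OLifEExtension>"));
    }
}
```

The second document is well-formed, so the parse itself succeeds; the complaint about the undeclared element arrives as a validity error through the error handler.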

To allow arbitrary XML (or any other data) to occur in a document that is validated against a DTD, you must instruct the
parser not to parse that section of data.  You do this with a CDATA section in the XML instance document that the DTD is
validating.  For example, suppose you want an element called "anyData" to contain either non-XML data that might include
XML reserved characters (like '<' or '&'), or XML elements that are not declared in your DTD.  You would declare it in the
DTD like this:

<!ELEMENT anyData (#PCDATA)>

Then, in the actual XML instance document, you would wrap the contents of anyData in a CDATA section, like this:

<anyData>
    <![CDATA[<elementNotInDtd/>]]>
</anyData>

The problem with this is that the contents of the CDATA section are not parsed at all, so if the contents are XML, you
don't even know whether that XML is well-formed.  One solution is to reparse the text of the CDATA section with a
non-validating parser (or later with another DTD, if you know at some later stage in your application what the XML should
look like).
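
A rough sketch of that reparsing step, again using the JDK's built-in JAXP parser instead of JDOM so that it is self-contained (the class and method names are hypothetical). With JDOM the same idea would be to take the text of the anyData element with getText() and feed it to a SAXBuilder constructed with validation turned off.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CdataReparse {
    /** Parses xml without validation; only well-formedness is checked. */
    static Document parseNonValidating(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    /** Returns the trimmed text of the root element, which includes CDATA text. */
    static String rootText(String xml) {
        return parseNonValidating(xml).getDocumentElement().getTextContent().trim();
    }

    public static void main(String[] args) {
        String outer = "<anyData>\n    <![CDATA[<elementNotInDtd/>]]>\n</anyData>";
        // First pass: the CDATA contents come back as plain text.
        String embedded = rootText(outer);
        // Second pass: reparse that text on its own to check well-formedness.
        Document inner = parseNonValidating(embedded);
        System.out.println(inner.getDocumentElement().getTagName());
    }
}
```

The trim() matters here because the whitespace around the CDATA section is part of the element's text, and the reparsed text has to be a complete document with a single root element.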

Another solution is to use a schema instead of a DTD.  In a schema, you can define the anyData element from my previous
example as follows:

<xsd:element name='anyData'>
    <xsd:complexType><xsd:sequence>
        <xsd:any minOccurs='0' maxOccurs='unbounded' processContents='skip'/>
    </xsd:sequence></xsd:complexType>
</xsd:element>

This will allow anyData to contain any well-formed XML elements, which is probably what you were trying to achieve with the
DTD.  The advantage over the CDATA section solution is that the child elements and data of anyData are now available via
JDOM, whereas the CDATA section is just represented as textual content of the anyData element, unless you reparse the CDATA
section separately without validation.
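
As a rough check of the schema approach, the sketch below validates a few documents against a schema with a skipped wildcard inside anyData. It uses the javax.xml.validation API from the JDK, which is my choice for the illustration and not part of JDOM; the class name AnySchemaDemo is made up, and the wildcard is wrapped in a complexType and sequence, as XML Schema requires.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class AnySchemaDemo {
    // anyData may hold any number of child elements, none of which are validated.
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:element name='anyData'>"
        + "<xsd:complexType><xsd:sequence>"
        + "<xsd:any minOccurs='0' maxOccurs='unbounded' processContents='skip'/>"
        + "</xsd:sequence></xsd:complexType>"
        + "</xsd:element>"
        + "</xsd:schema>";

    /** Returns true if xml is well-formed and valid against XSD. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            // With no error handler set, validate() throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<anyData><elementNotInSchema><child x='1'/></elementNotInSchema></anyData>"));
        System.out.println(isValid("<somethingElse/>"));
    }
}
```

The skipped children still have to be well-formed, because the document as a whole is parsed before it is validated.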

Cheers

Travers




