[jdom-interest] Substituting a different <!DOCTYPE ...> when parsing an XML file

Jeff Turner jeff at socialchange.net.au
Fri May 31 04:03:48 PDT 2002


Hi Geoff,

On Thu, May 30, 2002 at 04:36:05PM +0000, Geoff Rimmer wrote:
> When parsing XML files with validation switched on, I think that in
> 95% of cases, it should be the *application* rather than the XML file
> that specifies which DTD file to validate against.

Absolutely. The point (for me) of validating is to ensure the XML
conforms to *my* understanding, not the sender's. So I almost always
replace the sender's DOCTYPE declaration with my own, or validate as a
separate step with RelaxNG/MSV.

> My question is: what is the best way of doing this?  Is it just to
> provide a utility filter class which replaces a file's DOCTYPE with
> another:
> 
>     class DocTypeReplacerInputStream extends FilterInputStream

I've got just the thing :) First implemented by Simon St.Laurent, then
generalized by me:

http://opensource.socialchange.net.au/doctypechanger/

The package.html of the Javadocs contains a full description, examples,
and a general discussion of related techniques (setting EntityResolvers,
SGML Open Catalogs):

http://www.opensource.socialchange.net.au/doctypechanger/latest/apidocs/
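
For comparison, here is a rough sketch of the EntityResolver technique
mentioned above, using only the JDK's JAXP classes rather than JDOM or
DoctypeChanger. The class name, the inline DTD and the sample documents
are made up for illustration. Note that a resolver can only redirect a
DOCTYPE that is already present: it cannot add one to a document that
lacks it, and the root element name in the declaration is still chosen
by the sender.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class: validates against the application's DTD,
// whatever system ID the incoming document declares.
class LocalDtdResolverDemo {
    // Assumed DTD for illustration; a real application would load its own file.
    static final String APP_DTD =
        "<!ELEMENT countries (country*)>\n<!ELEMENT country (#PCDATA)>";

    static Document parseWithAppDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation on
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Ignore the sender's public/system IDs and always serve our DTD.
        // (This sketch answers every external entity request the same way.)
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader(APP_DTD)));

        // By default validation errors are only reported; make them fatal.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) throws SAXParseException { throw e; }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE countries SYSTEM \"http://example.invalid/sender.dtd\">"
                   + "<countries><country>France</country></countries>";
        Document doc = parseWithAppDtd(xml);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

With JDOM the same resolver could be handed to
SAXBuilder.setEntityResolver() before calling build().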

> which you would use as follows:
> 
>     DocType docType = new DocType(
>         "countries", "http://www.sillyfish.com/countries.dtd" );
> 
>     Document doc = new SAXBuilder( true ).build(
>         new DocTypeReplacerInputStream(
>             new FileInputStream( "countries.xml" ), docType ) );

Hmm... JDOM integration might be a good idea. I wonder if the problem
(replacing DOCTYPE declarations) is common enough to warrant pursuing
this.
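
To make the idea concrete, here is a minimal sketch of DOCTYPE
replacement on a document already held as a String. This is not the
DoctypeChanger API and not a proposed JDOM method; the class and method
names are hypothetical. It assumes a simple declaration with no internal
subset and no '>' inside the quoted identifiers, and it does not guard
against the text "<!DOCTYPE" turning up inside a comment or CDATA
section. A stream-based filter would need real buffering and more care.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: swaps in the application's DOCTYPE before parsing.
class DoctypeReplacerSketch {
    // A DOCTYPE declaration without an internal subset ("[ ... ]").
    private static final Pattern DOCTYPE =
        Pattern.compile("<!DOCTYPE\\s[^\\[>]*>");
    // An XML declaration at the very start of the document.
    private static final Pattern XML_DECL =
        Pattern.compile("^<\\?xml\\s[^?]*\\?>");

    static String replaceDoctype(String xml, String rootName, String systemId) {
        String replacement =
            "<!DOCTYPE " + rootName + " SYSTEM \"" + systemId + "\">";

        Matcher existing = DOCTYPE.matcher(xml);
        if (existing.find()) {
            // Overwrite the sender's declaration with ours.
            return xml.substring(0, existing.start())
                 + replacement
                 + xml.substring(existing.end());
        }
        if (xml.contains("<!DOCTYPE")) {
            // Most likely an internal subset, which this sketch does not handle.
            throw new IllegalArgumentException("unsupported DOCTYPE declaration");
        }
        // No DOCTYPE at all: insert ours after the XML declaration, if any.
        Matcher decl = XML_DECL.matcher(xml);
        int insertAt = decl.find() ? decl.end() : 0;
        return xml.substring(0, insertAt) + replacement + xml.substring(insertAt);
    }

    public static void main(String[] args) {
        String in = "<?xml version=\"1.0\"?><!DOCTYPE foo SYSTEM \"foo.dtd\"><countries/>";
        System.out.println(
            replaceDoctype(in, "countries", "http://www.sillyfish.com/countries.dtd"));
    }
}
```

The result can be wrapped in a StringReader and passed to a validating
parser, so the application's DTD is used whatever the file declared.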

--Jeff

> or is it to do as suggested above, namely to have extra build()
> methods in SAXBuilder so that a replacement DocType can be specified
> (and then make the DocTypeReplacerInputStream functionality part of
> SAXBuilder)?
> 
> Personally I think it would be such a useful thing to have that it
> *should* be part of SAXBuilder (as I have described above), but I
> would be interested to hear what other people think.
> 
> -- 
> Geoff Rimmer <> geoff.rimmer at sillyfish.com <> www.sillyfish.com
> www.sillyfish.com/phone - Make savings on your BT and Telewest phone calls
> UPDATED 09/05/2002: 508 destinations, 12 schemes (with contact details)


