[jdom-interest] Entity Resolver Cache/Catalog
Rolf
jdom at tuis.net
Wed Aug 31 22:54:52 PDT 2011
Hi Paul.
Interesting read. Thanks.
I have been reading up on your alternatives, and I have been trying to
think of how they can be applied to JDOM. The problem I see is that
the properties are Xerces-specific.
For the moment I have done a few things...
1. converted the XML test-cases to use local files for resources, not
the web.
2. I have investigated your code, and code options. I have read up on
OASIS, xcatalog, etc.
3. I have continued working on a 'CachingEntityResolver' that I started
playing with on the weekend.
I was thinking that if I come up with a reliable CachingEntityResolver
it could be put in the contrib section, and just added to a SAX/DOM Builder.
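
For illustration only, here is a minimal sketch of what such a resolver might look like. It is not the code referred to above: the class body, the in-memory map keyed by system ID, and the fetch() helper are all assumptions. It relies only on the standard SAX EntityResolver interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch: caches external entities (e.g. DTDs) in memory by system ID. */
class CachingEntityResolver implements EntityResolver {

    // system ID -> raw bytes of the entity, kept for the lifetime of this resolver
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        if (systemId == null) {
            return null; // nothing to key on; let the parser use its default lookup
        }
        byte[] data = cache.get(systemId);
        if (data == null) {
            data = fetch(systemId); // first request: load the entity once
            cache.put(systemId, data);
        }
        // Return a fresh stream on every call, because a stream can only be read once.
        InputSource source = new InputSource(new ByteArrayInputStream(data));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    /** Reads the whole entity from its URL; an assumed hook where a disk cache could go. */
    protected byte[] fetch(String systemId) throws IOException {
        try (InputStream in = new URL(systemId).openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

Under these assumptions, an instance would be handed to the builder through SAXBuilder.setEntityResolver(...) and reused across builds, so that each DTD is downloaded at most once per resolver.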
Can you think of a good way to make the process more generic, but still
easy?
Rolf
On 29/08/2011 11:15 AM, Paul Libbrecht wrote:
>
> Le 29 août 2011 à 16:58, Rolf Lear a écrit :
>
>> This is further compounded by there being some restrictions on some
>> documents too, like the w3.org 'ban' on default Java user-agents:
>> http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/
>>
>> My experimentation indicates that w3.org has put a blanket 'tarpit' of 30
>> seconds on any connection, regardless of what User-agent you use. This is
>> 'significant'.
>
> Definitely, W3C wants you to stop referencing DTDs by their URL-URIs.
> Well... it wants the parsers to stop parsing them over and over.
>
>> Typical solutions to this problem are things like OASIS catalogs, etc. but
>> that feels heavy-weight... or, is it?
>
> I believe it was not very hard to configure the Java-shipped Xerces with catalogs.
> And I would like the JDOM code to encourage this by showing good practice.
>
> Here's what I used before SAX parsing:
>
>> SAXParserFactory factory = SAXParserFactory.newInstance();
>> System.setProperty("com.sun.org.apache.xerces.xni.parser.XMLParserConfiguration",
>> "com.sun.org.apache.xerces.parsers.XMLGrammarCachingConfiguration");
>> SAXParser parser = factory.newSAXParser();
>>
>> XMLCatalogResolver resolver = new XMLCatalogResolver();
>> resolver.setPreferPublic(true);
>> resolver.setCatalogList(new String[]{this.getClass().getResource("xmlCatalog.xml").toExternalForm()});
>> handler = new EventDeserializerSAXHandler(resolver);
>> if(LOG.isDebugEnabled()) LOG.debug("Starting parser.");
>> parser.parse(inputStream, handler);
>
> Caching, however, comes for free with a single system property (within the VM lifecycle), if I remember correctly.
>
> It would be cool to have SAXBuilder.setCatalog to make JDOM a good citizen!
> (or even better: SAXBuilder.addCatalogEntry(public, URL) with a javadoc example where the URL is using class.getResource()).
>
> paul
> also often developing in train ;-)
>