[jdom-interest] Entity Resolver Cache/Catalog
Paul Libbrecht
paul at activemath.org
Mon Aug 29 08:15:25 PDT 2011
On 29 August 2011, at 16:58, Rolf Lear wrote:
> This is further compounded by there being some restrictions on some
> documents too, like the w3.org 'ban' on default Java user-agents:
> http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/
>
> My experimentation indicates that w3.org has put a blanket 'tarpit' of 30
> seconds on any connection, regardless of what User-agent you use. This is
> 'significant'.
Definitely, W3C wants you to stop referencing DTDs by their URLs.
Well... it wants parsers to stop fetching and parsing them over and over.
> Typical solutions to this problem are things like OASIS catalogs, etc. but
> that feels heavy-weight... or, is it?
I believe it was not very hard to configure the Xerces that ships with Java to use catalogs.
And I would like JDOM to encourage this by showing good practice.
Here's what I used before SAX parsing:
> SAXParserFactory factory = SAXParserFactory.newInstance();
> // Ask Xerces for its grammar-caching parser configuration.
> System.setProperty("com.sun.org.apache.xerces.xni.parser.XMLParserConfiguration",
>     "com.sun.org.apache.xerces.parsers.XMLGrammarCachingConfiguration");
> SAXParser parser = factory.newSAXParser();
>
> // Resolve DTDs through a catalog file shipped on the classpath.
> XMLCatalogResolver resolver = new XMLCatalogResolver();
> resolver.setPreferPublic(true);
> resolver.setCatalogList(new String[] {
>     this.getClass().getResource("xmlCatalog.xml").toExternalForm()});
> // EventDeserializerSAXHandler is my own handler class; it is given the resolver.
> handler = new EventDeserializerSAXHandler(resolver);
> if (LOG.isDebugEnabled()) LOG.debug("Starting parser.");
> parser.parse(inputStream, handler);
Caching, however, comes for free with a single system property (for the lifetime of the VM), if I remember correctly.
It would be cool to have SAXBuilder.setCatalog to make JDOM a good citizen!
(Or even better: SAXBuilder.addCatalogEntry(public, URL), with a javadoc example where the URL comes from class.getResource().)
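
To make the addCatalogEntry idea concrete, here is a rough sketch using only the plain SAX EntityResolver interface from the JDK. The names LocalCatalogResolver, CatalogDemo and addCatalogEntry are made up for illustration and are not existing JDOM or Xerces API, and the public ID and the example.invalid URL are placeholders. A resolver like this could presumably be handed to JDOM through SAXBuilder.setEntityResolver.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: maps DTD public IDs to local copies. */
class LocalCatalogResolver implements EntityResolver {
    private final Map<String, String> entries = new HashMap<>();

    /** Register a local URL (for example from class.getResource()) for a public ID. */
    void addCatalogEntry(String publicId, String localUrl) {
        entries.put(publicId, localUrl);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = (publicId == null) ? null : entries.get(publicId);
        if (local == null) {
            return null; // unknown entity: let the parser use its default behaviour
        }
        InputSource source = new InputSource(local);
        source.setPublicId(publicId);
        return source;
    }
}

class CatalogDemo {
    /** Parses the XML with the given resolver and returns its character content. */
    static String parseText(String xml, EntityResolver resolver) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // A temporary file stands in for a DTD bundled with the application.
        Path dtd = Files.createTempFile("demo", ".dtd");
        Files.write(dtd, "<!ELEMENT doc (#PCDATA)>\n<!ENTITY greeting \"hello\">"
                .getBytes(StandardCharsets.UTF_8));

        LocalCatalogResolver resolver = new LocalCatalogResolver();
        resolver.addCatalogEntry("-//EXAMPLE//DTD Demo//EN", dtd.toUri().toString());

        // The system ID points at a host that does not exist; the resolver
        // redirects the parser to the local copy before any network access.
        String xml = "<!DOCTYPE doc PUBLIC \"-//EXAMPLE//DTD Demo//EN\" "
                + "\"http://example.invalid/demo.dtd\"><doc>&greeting;</doc>";
        System.out.println(parseText(xml, resolver));
    }
}
```

The entity defined in the local DTD only expands if the parser really read the local copy, so this doubles as a quick check that no remote fetch was needed. A real catalog would also want to match on system IDs, which this sketch leaves out.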
paul
also often developing in train ;-)