[jdom-interest] Yet another TODO (entity escapes)

philip.nelson at omniresources.com
Tue Jun 19 10:35:32 PDT 2001


> However, the SAX parser expands &#xxx; entities into their unicode
> string versions, *even when you call setExpandEntities(false)*.
> Sounds like either a bug or a design flaw in SAX parsers, or in the
> SAXBuilder (which I haven't looked closely at).  Shouldn't they return
> EntityRef objects?

SAX is doing this.  I haven't actually seen a parser that will allow you to
set http://xml.org/sax/features/external-parameter-entities or the
corresponding feature for external general entities.  So for now we are
coding around it for external general entities that are not character
entities.  I had an idea for coding around the same limitation for external
parameter entities, which perhaps Harry Evans will code into the doctype
internal subset.  The only plan I could come up with for character entities
is to take any high-order Unicode string we get in the doctype's entity
declaration and turn it into a character entity, which *I think* would be
correct more often than not.  Somebody with multi-byte experience may be
better placed to say which use cases are the best target, though.  People
have been completely quiet on this so far.
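
For anyone who wants to check their own parser, here is a rough sketch of
the two things discussed above.  It is not JDOM code: the class name
EntityFeatureProbe and the helpers tryDisable, textOf and toCharRefs are
made up for illustration.  It tries to switch the two SAX features off,
shows that a &#xxx; reference still arrives as plain characters, and shows
one possible way to turn high-order characters back into character
references (the "more often than not" guess from above, so treat it as an
assumption rather than a settled design).

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class EntityFeatureProbe {
    static final String GENERAL =
        "http://xml.org/sax/features/external-general-entities";
    static final String PARAMETER =
        "http://xml.org/sax/features/external-parameter-entities";

    /** Tries to switch a feature off; returns false if the parser refuses. */
    static boolean tryDisable(XMLReader reader, String feature) {
        try {
            reader.setFeature(feature, false);
            return true;
        } catch (SAXException e) {
            // SAXNotRecognizedException or SAXNotSupportedException
            return false;
        }
    }

    /** Parses the document and returns all character data it reported. */
    static String textOf(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        tryDisable(reader, GENERAL);
        tryDisable(reader, PARAMETER);
        final StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    /** Hypothetical helper: rewrites every non-ASCII code point as &#xHHHH;. */
    static String toCharRefs(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println("general entities off:   " + tryDisable(reader, GENERAL));
        System.out.println("parameter entities off: " + tryDisable(reader, PARAMETER));

        // The character reference is already expanded when characters() fires.
        String text = textOf("<a>&#233;</a>");
        System.out.println("code point seen: " + (int) text.charAt(0));
        System.out.println("re-escaped:      " + toCharRefs(text));
    }
}
```

Whether the two setFeature calls succeed depends on the parser on your
classpath, so the first two lines of output will vary.  The re-escaping step
cannot tell a character that was typed literally from one that was written
as a reference, which is exactly why the approach is only a guess.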


