[jdom-interest] Yet another TODO (entity escapes)
philip.nelson at omniresources.com
Tue Jun 19 10:35:32 PDT 2001
> However, the SAX parser expands &#xxx; entities into their unicode
> string versions, *even when you call setExpandEntities(false)*.
> Sounds like either a bug or a design flaw in SAX parsers, or in the
> SAXBuilder (which I haven't looked closely at). Shouldn't they return
> EntityRef objects?
SAX is doing this. I haven't actually seen a parser that will let you set
http://xml.org/sax/features/external-parameter-entities, or the equivalent
feature for external general entities. So for now we are coding around it
for external general entities that are not character entities. I had an idea
for working around the same limitation for external parameter entities, which
perhaps Harry Evans will code into the doctype internal subset. The only plan
I could come up with for character entities is to take any high-order Unicode
string we get in the doctype's entity declaration and turn it into a character
entity, which *I think* would be correct more often than not. Somebody with
multi-byte experience may be better placed to say which use cases are the best
target, though. People have been completely quiet on this so far.
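
To make the character-entity idea concrete, here is a rough sketch of
turning high-order characters back into numeric character references.
It is only an illustration under stated assumptions, not JDOM code: the
class and method names (CharRefEscaper, escapeNonAscii) are hypothetical,
"high order" is assumed to mean any code point above U+007F, and it uses
StringBuilder and code point methods from later Java releases.

```java
import java.util.Locale;

// Hypothetical helper, not part of JDOM or SAX.
final class CharRefEscaper {
    private CharRefEscaper() {}

    // Assumption: "high order" means any code point above U+007F.
    // Each such code point becomes a hexadecimal character reference.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // Read a whole code point so surrogate pairs stay together.
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase(Locale.ROOT))
                   .append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: caf&#xE9; &#x20AC;
        System.out.println(escapeNonAscii("caf\u00E9 \u20AC"));
    }
}
```

This guesses that every non-ASCII character in an entity declaration was
originally written as a character reference, which is exactly the "correct
more often than not" trade-off: text that really was typed as literal
non-ASCII characters would be rewritten too. Unpaired surrogates are not
checked for and would produce a reference that is not legal XML.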