[jdom-interest] Questions regarding implementation of DocType.internalSubset[eg]
Jason Hunter
jhunter at collab.net
Wed Jun 13 11:44:32 PDT 2001
philip.nelson at omniresources.com wrote:
>
> * what to do (if anything) about character entities in the source doc like
> <!ENTITY Ouml 'Ö'>
>
> The parser turns this into a String from the parsed entity and that is what
> gets output.
Try to create a string as close to the original as possible.
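One way to get close to the original is to write every non-ASCII character of the parsed value back out as a numeric character reference. The following is only a minimal sketch under that assumption; the class and method names (`EntityValueEscaper`, `declaration`) are hypothetical and not part of JDOM. It cannot recover whether the source used single or double quotes, or decimal versus hexadecimal references, so the result is equivalent to the original declaration rather than identical.

```java
final class EntityValueEscaper {
    private EntityValueEscaper() {}

    /**
     * Rebuilds an internal general entity declaration from the parsed value.
     * Hypothetical helper: always uses double quotes and decimal references.
     */
    static String declaration(String name, String value) {
        StringBuilder sb = new StringBuilder("<!ENTITY ").append(name).append(" \"");
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            i += Character.charCount(cp);
            // Characters that are special inside an entity value, control
            // characters, and anything outside printable ASCII are written
            // as numeric character references.
            boolean special = cp == '"' || cp == '&' || cp == '%' || cp == '<';
            if (special || cp < 0x20 || cp > 0x7E) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
        }
        return sb.append("\">").toString();
    }
}
```

With this sketch, `EntityValueEscaper.declaration("Ouml", "\u00D6")` returns `<!ENTITY Ouml "&#214;">`.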
> * NOTATIONs and ndata. I think we have to implement DTDHandler and include
> notations in the internal subset.
OK.
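A rough sketch of what such a handler could look like, using the standard SAX `org.xml.sax.DTDHandler` interface. The class name `NotationCollector` and the `internalSubset()` accessor are hypothetical, and the 2 space indent simply follows the convention described in this thread. Note that SAX parsers are expected to resolve a URL system identifier fully before reporting it, so the output may not match a relative identifier in the source.

```java
import org.xml.sax.DTDHandler;

final class NotationCollector implements DTDHandler {
    private final StringBuilder subset = new StringBuilder();

    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        subset.append("  <!NOTATION ").append(name);
        appendExternalId(publicId, systemId);
        subset.append(">\n");
    }

    @Override
    public void unparsedEntityDecl(String name, String publicId,
                                   String systemId, String notationName) {
        subset.append("  <!ENTITY ").append(name);
        appendExternalId(publicId, systemId);
        subset.append(" NDATA ").append(notationName).append(">\n");
    }

    // Writes PUBLIC "pub" "sys", PUBLIC "pub" (notations only), or SYSTEM "sys".
    private void appendExternalId(String publicId, String systemId) {
        if (publicId != null) {
            subset.append(" PUBLIC \"").append(publicId).append('"');
            if (systemId != null) {
                subset.append(" \"").append(systemId).append('"');
            }
        } else if (systemId != null) {
            subset.append(" SYSTEM \"").append(systemId).append('"');
        }
    }

    /** Returns the declarations collected so far, one per line. */
    String internalSubset() {
        return subset.toString();
    }
}
```

The handler would be registered with `XMLReader.setDTDHandler(...)` before parsing. It does not try to tell internal subset declarations apart from ones that come from an external subset or entity.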
> * Whitespace handling is pretty arbitrary. I don't think it is preserved at
> all and I have done a simple 2 space indent with trailing "\n"
>
> * is <!ENTITY % e "foo"> equivalent to <!ENTITY %e "foo">
OK.
> * what to do about external parameter entities because the feature
> http://xml.org/sax/features/external-parameter-entities is also not
> supported, at least by xerces 1.3.1 (Andy?)
>
> <!DOCTYPE doc [
> <!ELEMENT doc (#PCDATA)>
> <!ENTITY % e SYSTEM "097.ent">
> <!ATTLIST doc a1 CDATA "v1">
> %e;
> <!ATTLIST doc a2 CDATA "v2">
> ]>
> <doc></doc>
>
> gets turned into
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE doc [
> <!ELEMENT doc (#PCDATA)>
> <!ENTITY %e SYSTEM "097.ent" >
> <!ATTLIST doc a1 CDATA "v1">
> <!ATTLIST doc a2 CDATA #IMPLIED>
> <!ATTLIST doc a2 CDATA "v2">
> ]>
> <doc a1="v1" />
>
> I can handle this manually I think with start and endEntity calls and
> assuming that while in the dtd and in an entity, the next element or
> attribute or comment decl will be from the entity. So should skipping
> parameter entity expansion be the normal behaviour in all cases or
> configurable?
I'd be OK with always staying true to the original.
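The start and endEntity approach could be sketched roughly as follows, assuming the parser reports parameter entity boundaries through the SAX2 `LexicalHandler` and declarations through `DeclHandler`. The class name `SubsetRecorder` is hypothetical. While inside a parameter entity (reported with a name starting with `%`) or the external subset (reported as `[dtd]`), declarations are dropped and only the `%e;` reference is written. Entity values are not re-escaped here, and because the XML 1.0 grammar requires whitespace between `%` and the name in a declaration, the sketch writes `% e` rather than `%e`.

```java
import org.xml.sax.ext.DeclHandler;
import org.xml.sax.ext.LexicalHandler;

final class SubsetRecorder implements DeclHandler, LexicalHandler {
    private final StringBuilder subset = new StringBuilder();
    private boolean inDtd;
    private int depth; // > 0 while inside a parameter entity or the external subset

    public void startDTD(String name, String publicId, String systemId) { inDtd = true; }
    public void endDTD() { inDtd = false; }

    public void startEntity(String name) {
        if (!inDtd) return;
        boolean pe = name.startsWith("%");
        if (!pe && !name.equals("[dtd]")) return;
        // Keep the reference itself instead of the expanded declarations.
        if (depth == 0 && pe) subset.append("  ").append(name).append(";\n");
        depth++;
    }

    public void endEntity(String name) {
        if (inDtd && (name.startsWith("%") || name.equals("[dtd]"))) depth--;
    }

    public void elementDecl(String name, String model) {
        if (depth > 0) return;
        subset.append("  <!ELEMENT ").append(name).append(' ').append(model).append(">\n");
    }

    public void attributeDecl(String eName, String aName, String type,
                              String mode, String value) {
        if (depth > 0) return;
        subset.append("  <!ATTLIST ").append(eName).append(' ').append(aName)
              .append(' ').append(type);
        if (mode != null) subset.append(' ').append(mode);
        if (value != null) subset.append(" \"").append(value).append('"');
        subset.append(">\n");
    }

    public void internalEntityDecl(String name, String value) {
        if (depth > 0) return;
        subset.append("  <!ENTITY ").append(declName(name))
              .append(" \"").append(value).append("\">\n");
    }

    public void externalEntityDecl(String name, String publicId, String systemId) {
        if (depth > 0) return;
        subset.append("  <!ENTITY ").append(declName(name));
        if (publicId != null) {
            subset.append(" PUBLIC \"").append(publicId).append("\" \"");
        } else {
            subset.append(" SYSTEM \"");
        }
        subset.append(systemId).append("\">\n");
    }

    // SAX reports parameter entities as "%e"; a declaration needs "% e".
    private static String declName(String name) {
        return name.startsWith("%") ? "% " + name.substring(1) : name;
    }

    // Events that do not affect the recorded subset in this sketch.
    public void startCDATA() {}
    public void endCDATA() {}
    public void comment(char[] ch, int start, int length) {}

    String internalSubset() { return subset.toString(); }
}
```

This only helps when the parser actually reads the external parameter entity and reports it. If the parser skips the entity, no startEntity call arrives and there is nothing to suppress.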
Hope you and Harry can check each other's work on this. Comparing
approaches will probably be insightful.
-jh-