[jdom-interest] Feature Request
Dennis Sosnoski
dms at sosnoski.com
Sat Feb 21 11:15:34 PST 2004
John Cowan wrote:
>Dennis Sosnoski scripsit:
>
>
>> schema.elementType("span", Schema.M_ANY, Schema.M_ANY, 0);
>> schema.elementType("div", Schema.M_ANY, Schema.M_ANY, 0);
>> schema.elementType("table", Schema.M_ANY, Schema.M_ANY, 0);
>> schema.elementType("br", Schema.M_EMPTY, Schema.M_ANY, 0);
>>
>>
>
>I'd be interested in knowing why these particular ones were important.
>I understand the issue with script and style.
>
I ran into some cases where these elements were being misused in the
HTML pages I was looking at, so I patched them in this manner to allow
arbitrary nesting. I'm not even sure all of these are necessary for my
purposes; I just hacked as I went to muddle through the pages. AFAIK
the only element definitions that are actually incorrect in your
current content model are the script and style elements.
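
For anyone wondering why re-registering an element with M_ANY allows
arbitrary nesting, here is a rough sketch. It assumes the containment test
is a bitwise AND of the parent's model against the child's memberOf. The
class name, the M_INLINE and M_BLOCK groups, the constant values, and the
"strict" starting table are all made up for illustration and are not taken
from TagSoup; the flags argument from the patch is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of bitmask containment; NOT TagSoup's real classes.
final class ContainmentSketch {
    // Assumed values: "any group" sets every bit, "empty" sets none.
    static final int M_ANY = 0xFFFFFFFF;
    static final int M_EMPTY = 0;
    // Hypothetical content groups, invented for this example.
    static final int M_INLINE = 1 << 0;
    static final int M_BLOCK = 1 << 1;

    // name -> {model, memberOf}
    private final Map<String, int[]> types = new HashMap<>();

    // Same argument order as the patch, minus the trailing flags argument.
    void elementType(String name, int model, int memberOf) {
        types.put(name, new int[] {model, memberOf});
    }

    // Assumed rule: a parent accepts a child if the groups it allows
    // overlap the groups the child belongs to.
    boolean canContain(String parent, String child) {
        return (types.get(parent)[0] & types.get(child)[1]) != 0;
    }

    // A stricter, made-up starting point: span holds only inline content.
    static ContainmentSketch strict() {
        ContainmentSketch s = new ContainmentSketch();
        s.elementType("span", M_INLINE, M_INLINE);
        s.elementType("div", M_INLINE | M_BLOCK, M_BLOCK);
        s.elementType("br", M_EMPTY, M_INLINE);
        return s;
    }

    // The same table after re-registering span, div and br as in the patch.
    static ContainmentSketch patched() {
        ContainmentSketch s = strict();
        s.elementType("span", M_ANY, M_ANY);
        s.elementType("div", M_ANY, M_ANY);
        s.elementType("br", M_EMPTY, M_ANY);
        return s;
    }

    public static void main(String[] args) {
        System.out.println(strict().canContain("span", "div"));   // false
        System.out.println(patched().canContain("span", "div"));  // true
        System.out.println(patched().canContain("br", "span"));   // false
    }
}
```

Under that assumption an M_ANY model overlaps every memberOf value, so
the patched span accepts a div, while br with an M_EMPTY model still
accepts nothing.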
>...
>
>>>>The only downside I've noticed is that the handling it uses to
>>>>turn HTML into XHTML can go berserk in some cases of real-world HTML,
>>>>such as <script> and <style> elements within the <body> (it properly
>>>>tries to force them into a <head> element, so you end up with multiple
>>>><head>s and <body>s).
>>>>
>>>>
>
>TagSoup's content models are implicitly of the form (A|B|C|...)*, so
>it thinks the content model of the html element is (head|body)*.
>I may do some special-casery to fix this, but probably not for 0.9.2
>unless I see a very easy way to do it.
>
>
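
To make the (head|body)* point concrete, here is a minimal sketch of what
a star-of-alternatives model checks compared with a strict sequence. The
class and method names are hypothetical and this is not how TagSoup
represents its models; it only shows why repeated <head>s and <body>s
pass the check.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration of two content-model styles.
final class StarModelSketch {
    // (A|B|C|...)*: every child must be one of the alternatives;
    // order and repetition are not constrained at all.
    static boolean matchesStarOfAlternatives(Set<String> alternatives,
                                             List<String> children) {
        return alternatives.containsAll(children);
    }

    // A stricter sequence model: exactly these children, in this order.
    static boolean matchesSequence(List<String> expected,
                                   List<String> children) {
        return expected.equals(children);
    }

    public static void main(String[] args) {
        Set<String> htmlModel = Set.of("head", "body");
        List<String> doubled = List.of("head", "body", "head", "body");

        // The star model has no objection to a second head and body.
        System.out.println(matchesStarOfAlternatives(htmlModel, doubled)); // true
        // A (head, body) sequence model would reject the same children.
        System.out.println(matchesSequence(List.of("head", "body"), doubled)); // false
    }
}
```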
With the <script> and <style> element containment fixed, I don't think
this'll be a big deal.
- Dennis