SV: SV: [jdom-interest] One less TODO item

Per Norrman pernorrman at telia.com
Tue Oct 7 10:15:46 PDT 2003


What, do you append '/' before loading the resource?
In principle, http://www.cafeconleche.org is not the same
resource as http://www.cafeconleche.org/. It probably is in
the overwhelming majority of cases, but I vaguely remember a
case where it wasn't (can't come up with the details).
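
For concreteness, the workaround described in the quoted message below
(add a '/' when the URL has no path component) might look roughly like
this. It is only a sketch, not XOM's actual code; the class and method
names are made up for illustration.

```java
import java.net.URI;

class TrailingSlashSketch {

    // Hypothetical helper: gives an absolute, hierarchical URL an
    // explicit root path "/" when its path component is empty.
    // Anything that already has a path is returned unchanged.
    public static String addRootPath(String url) {
        URI u = URI.create(url);
        String path = u.getRawPath();
        if (!u.isAbsolute() || u.isOpaque() || u.getRawAuthority() == null) {
            return url; // nothing sensible to do for these forms
        }
        if (path != null && !path.isEmpty()) {
            return url; // a path component is already present
        }
        // Insert "/" directly after the authority, keeping any
        // query or fragment that follows it.
        String prefix = u.getScheme() + "://" + u.getRawAuthority();
        return prefix + "/" + url.substring(prefix.length());
    }

    public static void main(String[] args) {
        System.out.println(addRootPath("http://www.cafeconleche.org"));
        System.out.println(addRootPath("http://www.ibiblio.org/xml"));
    }
}
```

Note that http://www.ibiblio.org/xml comes back untouched, so a check
like this only helps with URLs that lack a path entirely.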

I looked again at RFC 2396, and I'm now leaning toward the view that the
correct resolution of DTD/xyz against the base URI http://www.cafeconleche.org
is indeed http://www.cafeconleche.orgDTD/xyz. Obviously, this is not
the desired result, but there is a concise algorithm in the spec
that yields exactly this result! Perhaps this quote from section 5.1.4 of
said document encapsulates the problem:

   It is the responsibility of the distributor(s) of a document
   containing relative URI to ensure that the base URI for that document
   can be established.  It must be emphasized that relative URI cannot
   be used reliably in situations where the document's base URI is not
   well-defined.

This *is* a murky area!
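
To show how a literal reading of the spec gets there, here is a rough
sketch of the path-merging steps (6a and 6b) and the recomposition step
(7) from section 5.2 of RFC 2396. The class and method names are
invented for illustration, and the dot-segment removal in steps 6c-6f is
left out because the example path has no "." or ".." segments.

```java
class Rfc2396MergeSketch {

    // Steps 6a-6b: keep everything in the base path up to and including
    // the last '/', then append the relative path. If the base path has
    // no '/' at all (for instance, it is empty), nothing is kept.
    public static String mergePaths(String basePath, String relativePath) {
        int lastSlash = basePath.lastIndexOf('/');
        String kept = basePath.substring(0, lastSlash + 1);
        return kept + relativePath;
    }

    // Step 7: recombine scheme, authority and merged path into a string.
    public static String resolve(String scheme, String authority,
                                 String basePath, String relativePath) {
        StringBuilder result = new StringBuilder();
        result.append(scheme).append(':');
        if (authority != null) {
            result.append("//").append(authority);
        }
        result.append(mergePaths(basePath, relativePath));
        return result.toString();
    }

    public static void main(String[] args) {
        // Base URI without a path component: authority and path run together.
        System.out.println(resolve("http", "www.cafeconleche.org", "", "DTD/xyz"));
        // Base URI with an explicit root path: the expected result.
        System.out.println(resolve("http", "www.cafeconleche.org", "/", "DTD/xyz"));
    }
}
```

With the empty base path this prints http://www.cafeconleche.orgDTD/xyz,
and with "/" it prints http://www.cafeconleche.org/DTD/xyz.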

/pmn




> -----Original Message-----
> From: Elliotte Rusty Harold [mailto:elharo at metalab.unc.edu]
> Sent: 7 October 2003 18:35
> To: Per Norrman
> Cc: jdom-interest at jdom.org
> Subject: Re: SV: [jdom-interest] One less TODO item
> 
> 
> >Hmm,
> >
> >I thought it was the SAXParsers (or XMLReader) that resolved the 
> >relative URI. If I supply an EntityResolver to either crimson or 
> >xerces, the system id is already resolved when the callback is made. 
> >How do you work around that in XOM? Or does it have its own parser?
> >
> 
> I look at the URLs that are fed in and if they don't have a path 
> component, I add a / at the end. Really simple.
> 
> It's a hack, I admit, and it only works for URLs that don't have path 
> components, but it does help XOM work with a lot of URLs it would 
> otherwise fail on.
> 
> I reported this bug in Xerces some time ago (or at least I thought I 
> did. Can't seem to find it in Bugzilla at the moment). However, it's 
> still present in 2.5. It's one of the few bugs left in Xerces that 
> affects XOM. This compares very well to other parsers, most of which 
> have dozens of bugs the XOM unit tests expose.
> 
> OK, I found the bug. It's 
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18345 (God, I hate 
> Bugzilla.) Hmm, looks like they claim it's fixed in 2.5 but I could 
> swear I'm still seeing it. Possibly I'm using an older parser? I'll 
> look into this further. If Xerces has indeed fixed this, then all 
> JDOM has to do is ship the latest Xerces.
> 
> Oh, I bet I know what's going on. I think I'm loading the older 
> Xerces bundled with Java 1.4.2 rather than the bug-fixed version. 
> Hmm, no, that's not it. OK, I've got it. They've instituted something 
> equivalent to the same workaround I used. In other words, they can 
> handle http://www.cafeconleche.org but not http://www.ibiblio.org/xml 
> and this can be verified with Xerces's own sax.Counter program. I'll 
> reopen the bug.
> 
> -- 
> 
>    Elliotte Rusty Harold
>    elharo at metalab.unc.edu
>    Processing XML with Java (Addison-Wesley, 2002)
>    http://www.cafeconleche.org/books/xmljava            
>    http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA
> 



