[jdom-interest] Content missing after conversion from W3C Element to JDOM2 Element

Rolf Lear jdom at tuis.net
Wed Nov 7 16:07:29 PST 2012


Hi Lars.

I am back home, and I have my JDOM code in front of me.

I have just gone through the code, and 'it works for me'. What this 
means is:

If:
- you load a Document using a DOM DocumentBuilderFactory supplied by Xerces
- and you pass that document to JDOM to build a JDOM document
- and that document contains Text nodes

then:
- JDOM will correctly translate those DOM Text nodes into JDOM Text nodes.


Now, I am not saying that getTextContent() is the 'right' method to 
call. It is possible that JDOM would be better off using getNodeValue(). 
In fact, JDOM versions before 2.x used getNodeValue(). I can't think of 
why I switched to getTextContent() other than that this part of the 
code was refactored significantly: I followed the documentation 
carefully, and perhaps something there used getTextContent() and I 
chose to do it the same way.

I have just run the entire test suite with the code changed to use 
getNodeValue() and it still works fine for me.

On checking the DOM specification, the getTextContent() method was added 
in DOM level 3.

The Java API documentation is a mess in this area. The JDK 1.5 package 
information indicates that the org.w3c.dom API supports DOM Level 2:
http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/package-summary.html

Yet the Node interface indicates that it implements Level 3, and it 
exposes all the Level 3 changes.

In fact, the Java 5 new features documentation indicates that JAXP 
implements the Level 3 specification:
http://docs.oracle.com/javase/1.5.0/docs/guide/xml/jaxp/index.html


So, what this means is that:
- JDOM is doing the right thing
- It uses functionality supported since Java 5
- It is probably your particular DOM library that has a broken 
implementation of the new-to-DOM3 method getTextContent()
- JDOM does not need to use getTextContent(), because the older method 
getNodeValue() works just fine.
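A quick way to test the conclusion above against whatever DOM 
implementation is on your classpath is to compare the two methods 
directly on a Text node. This is only a minimal sketch that uses plain 
JAXP and no JDOM; the class name TextContentCheck and its helper methods 
are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JDOM.
public class TextContentCheck {

    // Parse an XML string with whichever DOM implementation JAXP locates.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Return the first Text child of the root element, or null if none.
    static Node firstTextChild(Document doc) {
        Node child = doc.getDocumentElement().getFirstChild();
        while (child != null && child.getNodeType() != Node.TEXT_NODE) {
            child = child.getNextSibling();
        }
        return child;
    }

    // True when getNodeValue() and getTextContent() return the same
    // non-null string for the first Text child of the root element.
    static boolean textMethodsAgree(String xml) throws Exception {
        Node text = firstTextChild(parse(xml));
        if (text == null) {
            throw new IllegalArgumentException("root element has no Text child");
        }
        String viaValue = text.getNodeValue();
        String viaContent = text.getTextContent();
        return viaValue != null && viaValue.equals(viaContent);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("methods agree: "
                + textMethodsAgree("<root>some text</root>"));
    }
}
```

If this reports that the methods disagree for your setup, that would 
point at the DOM implementation rather than at JDOM.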



What would be useful is if you could determine the library that you are 
using. Since you have already 'hacked' the code, why don't you 
temporarily add the line: System.out.println(text.getClass()); to the 
method. This will tell you the concrete implementation of DOM that's broken.
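If patching JDOM itself is inconvenient, roughly the same information 
can be collected outside JDOM. The following is a sketch only, assuming 
your application obtains its DOM through the standard JAXP factory 
lookup; the class name DomImplProbe is invented for this example, and 
the output depends entirely on what is on your classpath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical probe, not part of JDOM.
public class DomImplProbe {

    // Returns the concrete class names of the factory, the parsed
    // Document, and the first child node of the root element.
    static String[] implementationClasses(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node first = doc.getDocumentElement().getFirstChild();
        return new String[] {
            factory.getClass().getName(),
            doc.getClass().getName(),
            first == null ? "(no child node)" : first.getClass().getName()
        };
    }

    public static void main(String[] args) throws Exception {
        String[] names = implementationClasses("<root>some text</root>");
        System.out.println("factory:  " + names[0]);
        System.out.println("document: " + names[1]);
        System.out.println("text:     " + names[2]);
    }
}
```

Run it with the same classpath as the failing application, otherwise a 
different DOM implementation may be picked up.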

I will also change the text() method to use getNodeValue() instead... it 
makes sense to do it if there's a broken library, and it's no big deal 
for JDOM.... Also, I am planning a release imminently, so the timing is 
good.

If you could get back to me on what library you are using, I will dig 
into it and we can see if there's a fix for your library (I imagine that 
there is....).

I did google for "getTextContent bug Text Node" but found no seemingly 
relevant hits.

Created issue #100 for this: https://github.com/hunterhacker/jdom/issues/100

Rolf


On 07/11/2012 5:48 PM, Larsen wrote:
> I will try to test this tomorrow in my company.
>
>
> Lars
>
>
> On Wed, 07 Nov 2012 22:48:48 +0100, Rolf Lear <jdom at tuis.net> wrote:
>
>> Hi
>>
>> If you pull the JDOM code from github, set it up as an eclipse project
>> (if you use eclipse...), then right-click the build.xml file and run
>> the eclipse target. If you use eclipse you can then right-click the
>> project and run all tests, or you can run the ant junit target.
>>
>> As for which DOM you use, run your project with the java option
>> -Djaxp.debug=1 to see which DOM is found.
>>
>>
>> Rolf
>>
>> Larsen <larsen007 at web.de> wrote:
>>
>> Hi Rolf,
>>
>> I haven't used unit tests so far and would need some instructions on
>> how to run them in case this becomes necessary.
>>
>> How can I check for a buggy DOM implementation?
>>
>>
>> Lars
>>
>>
>> On Wed, 07 Nov 2012 19:31:09 +0100, Rolf Lear <jdom at tuis.net> wrote:
>>
>>> Hi (again).
>>>
>>> Based on some double-checking, I suspect that you have a buggy
>>> DOM implementation?
>>>
>>> getTextContent() returns nodeValue for Text nodes...
>>> Node.getTextContent() says it should anyway.
>>>
>>> I will check it out some more later.
>>>
>>>
>>>
>>> Rolf
>>>
>>> Larsen <larsen007 at web.de> wrote:
>>>> I am quite a JDOM2 newbie and noticed strange/incorrect behaviour when
>>>> converting a W3C-Element to a JDOM-Element. (snip)
>>>
>>>
>>> PS: Using latest JDOM 2.0.3 and Java 7 ("1.7.0_09")
>>> _______________________________________________
>>> To control your jdom-interest membership:
>>> http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com


