SV: [jdom-interest] JDOM and text outside tags

Jacques-Albert De Blasio jdeblasi at
Thu Oct 23 00:37:43 PDT 2003

Thanks, it works perfectly!

Per Norrman wrote:

>A sample:
>
>import java.io.StringReader;
>import java.util.Iterator;
>
>import org.jdom.Document;
>import org.jdom.Element;
>import org.jdom.Text;
>import org.jdom.input.SAXBuilder;
>
>public class MixedContent {
>	private static String xml =
>		"<TD> "
>			+ "<SMALL>"
>			+ "<IMG src = \"...\" /> some_text <BR />"
>			+ "<IMG src =\" ...\" /> some_other_text <BR /> "
>			+ "</SMALL>"
>			+ "</TD> ";
>
>	public static void main(String[] args) throws Exception {
>		// Parse the XML string into a JDOM document.
>		Document doc = new SAXBuilder().build(new StringReader(xml));
>		Element small = doc.getRootElement().getChild("SMALL");
>		// getContent() returns every child node (Text and Element alike).
>		for (Iterator i = small.getContent().iterator(); i.hasNext();) {
>			Object node = i.next();
>			if (node instanceof Text) {
>				System.out.println("[" + ((Text) node).getText() + "]");
>			} else if (node instanceof Element) {
>				System.out.println("<" + ((Element) node).getName() + ">");
>			}
>		}
>	}
>}
>>-----Original Message-----
>>From: jdom-interest-admin at 
>>[mailto:jdom-interest-admin at] On Behalf Of Jacques-Albert De Blasio
>>Sent: 23 October 2003 08:53
>>To: jdom-interest at
>>Subject: Re: SV: [jdom-interest] JDOM and text outside tags
>>Thank you very much for your answers (including Stein Erik Berget).
>>As you said, I get a concatenated string that looks like
>>"some_textsome_other_text" when I call getText() on the SMALL
>>element. I may not have fully understood what you mean by "iterate
>>through the content of the SMALL element", because I do not get any
>>Text node when I do so (I first get a list of all children of SMALL
>>and then search it for a Text node).
>>Do you have any idea?
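
A likely cause, judging from the description above: in JDOM, Element.getChildren() returns only child Elements, so a Text node never shows up in that list, while Element.getContent() returns every child node. The W3C DOM that NekoHTML hands over behaves like getContent() here, since getChildNodes() includes text nodes. Below is a minimal sketch of the same walk on a plain DOM tree. It assumes the JDK's built-in parser in place of NekoHTML so that it runs without extra jars, and the class name MixedContentDom and the helper textPieces are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MixedContentDom {

    // Hypothetical helper: collects the trimmed, non-empty text pieces that
    // are direct children of the first SMALL element in the given XML.
    public static List<String> textPieces(String xml) throws Exception {
        // Assumption: the JDK parser stands in for NekoHTML in this sketch.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element small = (Element) doc.getElementsByTagName("SMALL").item(0);

        List<String> pieces = new ArrayList<String>();
        NodeList nodes = small.getChildNodes(); // all child nodes, text included
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.TEXT_NODE) {
                String text = node.getNodeValue().trim();
                if (!text.isEmpty()) { // skip whitespace-only nodes
                    pieces.add(text);
                }
            }
        }
        return pieces;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<TD><SMALL>"
                + "<IMG src=\"...\" /> some_text <BR />"
                + "<IMG src=\"...\" /> some_other_text <BR /> "
                + "</SMALL></TD>";
        System.out.println(textPieces(xml));
    }
}
```

Each piece is trimmed because the text nodes carry the spaces that surround them in the markup. Drop the trim() call if that whitespace matters.
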
>>Per Norrman wrote:
>>>getText() on the SMALL element should get you
>>>a concatenated string, " some_text  some_other_text ".
>>>If you want the pieces individually, you have to
>>>iterate through the content of the SMALL element and
>>>pick up the value of each Text node in question.
>>>>-----Original Message-----
>>>>From: jdom-interest-admin at
>>>>[mailto:jdom-interest-admin at] On Behalf Of Jacques-Albert De Blasio
>>>>Sent: 22 October 2003 11:06
>>>>To: jdom-interest at
>>>>Subject: [jdom-interest] JDOM and text outside tags
>>>>Hi all,
>>>>I have a problem with JDOM, and I am sure that one of you JDOM
>>>>gurus could help me out :)
>>>>In a program I'm writing, I first fetch HTML pages from the
>>>>web, tidy them with NekoHTML (JTidy was not sufficient, as it
>>>>could not parse HTML pages), and then transform the DOM output
>>>>by NekoHTML into JDOM.
>>>>My problem is the following: in a given page, I have tags such as
>>>><IMG src = "..." /> some_text <BR />
>>>><IMG src =" ..." /> some_other_text <BR />
>>>>How can I fetch the "some_text" and "some_other_text" ?
>>>>Thank you very much for your help,
>>>>To control your jdom-interest membership:
