SV: SV: [jdom-interest] JDOM and text outside tags

Per Norrman pernorrman at telia.com
Thu Oct 23 00:11:15 PDT 2003


Hi,

Here is a sample that iterates over the content of the SMALL element:

import java.io.StringReader;
import java.util.Iterator;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.Text;
import org.jdom.input.SAXBuilder;

public class MixedContent {

	private static String xml =
		"<TD> "
			+ "<SMALL>"
			+ "<IMG src = \"...\" /> some_text <BR />"
			+ "<IMG src =\" ...\" /> some_other_text <BR /> "
			+ "</SMALL>"
			+ "</TD> ";

	public static void main(String[] args) throws Exception {
		Document doc = new SAXBuilder().build(new StringReader(xml));
		Element small = doc.getRootElement().getChild("SMALL");

		// getContent() returns every child node (Text as well as Element),
		// unlike getChildren(), which returns only the child elements.
		for (Iterator i = small.getContent().iterator(); i.hasNext();) {
			Object node = i.next();
			if (node instanceof Text) {
				System.out.println("[" + ((Text) node).getText() + "]");
			} else if (node instanceof Element) {
				System.out.println("<" + ((Element) node).getName() + ">");
			}
		}
	}
}
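
Since NekoHTML hands you a W3C DOM before you convert it to JDOM, the same
walk could also be done at that stage. The following is only a rough sketch
using the DOM API that ships with the JDK, not JDOM; the class name
MixedContentDom and the helper textPieces are made up for illustration, and
it assumes the text you want sits directly inside the first matching
element. It trims each text node and skips the whitespace-only ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MixedContentDom {

    // Hypothetical helper: collects the trimmed, non-empty text nodes that are
    // direct children of the first element with the given tag name.
    public static List<String> textPieces(String xml, String tagName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> pieces = new ArrayList<>();
        Node parent = doc.getElementsByTagName(tagName).item(0);
        if (parent == null) {
            return pieces; // no such element in the document
        }
        // getChildNodes() includes text nodes, not just child elements.
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                String text = child.getNodeValue().trim();
                if (!text.isEmpty()) {
                    pieces.add(text);
                }
            }
        }
        return pieces;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<TD> <SMALL>"
                + "<IMG src = \"...\" /> some_text <BR />"
                + "<IMG src =\" ...\" /> some_other_text <BR /> "
                + "</SMALL></TD> ";
        System.out.println(textPieces(xml, "SMALL"));
    }
}
```

With the sample markup from this thread, the list should hold the two
strings "some_text" and "some_other_text" as separate entries.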

> -----Original Message-----
> From: jdom-interest-admin at jdom.org
> [mailto:jdom-interest-admin at jdom.org] On Behalf Of Jacques-Albert De Blasio
> Sent: 23 October 2003 08:53
> To: jdom-interest at jdom.org
> Subject: Re: SV: [jdom-interest] JDOM and text outside tags
> 
> 
> Thank you very much for your answers (including Stein Erik Berget).
> 
> As you said, I have a concatenated string which looks like
> "some_textsome_other_text" when I use getText() on the SMALL
> element. I may not have really understood what you meant by "iterate
> through the content of the SMALL element", because I do not get any
> Text node if I do so (I first get a list of all children of SMALL and
> search for a Text node).
> 
> Do you have any idea?
> 
> Thanks,
> 
> Jack
> 
> Per Norrman wrote:
> 
> >Hi,
> >
> >getText() on the SMALL element should get you
> >a concatenated string, " some_text  some_other_text ".
> >
> >If you want the pieces individually, you have to
> >iterate through the content of the SMALL element and
> >pick up the value of each Text node in question.
> >
> >/pmn
> >
> >  
> >
> >>-----Original Message-----
> >>From: jdom-interest-admin at jdom.org
> >>[mailto:jdom-interest-admin at jdom.org] On Behalf Of Jacques-Albert De Blasio
> >>Sent: 22 October 2003 11:06
> >>To: jdom-interest at jdom.org
> >>Subject: [jdom-interest] JDOM and text outside tags
> >>
> >>
> >>Hi all,
> >>
> >>I have a problem with JDOM and I am sure that one of you JDOM
> >>gurus could help me out :)
> >>
> >>In a program I'm writing, I first fetch HTML pages on the
> >>web, tidy them with NekoHTML (JTidy was not sufficient as it
> >>could not parse Japanese HTML pages) and then transform the DOM
> >>output by NekoHTML into JDOM documents.
> >>
> >>My problem is the following: in a given page, I have tags such as
> >>
> >><TD>
> >><SMALL>
> >><IMG src = "..." /> some_text <BR />
> >><IMG src =" ..." /> some_other_text <BR />
> >></SMALL>
> >></TD>
> >>
> >>How can I fetch "some_text" and "some_other_text"?
> >>
> >>Thank you very much for your help,
> >>
> >>Jack
> >>

_______________________________________________
To control your jdom-interest membership:
http://lists.denveronline.net/mailman/options/jdom-interest/youraddr@yourhost.com



