SV: [jdom-interest] JDOM and text outside tags
Jacques-Albert De Blasio
jdeblasi at isl.rdc.toshiba.co.jp
Thu Oct 23 00:37:43 PDT 2003
Thanks, it works perfectly!
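
The approach quoted below (walk every child node of SMALL and keep only the Text nodes) can also be applied one step earlier, to the W3C DOM that NekoHTML hands back before conversion to JDOM. What follows is only a rough sketch of that idea and is not JDOM code: it uses the JDK's built-in javax.xml.parsers and org.w3c.dom APIs instead. The class name MixedContentDom and the helper textPieces are hypothetical names chosen for illustration, and the sketch assumes well-formed input that contains a SMALL element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class: JDK W3C DOM, not JDOM.
class MixedContentDom {

    /** Collects the trimmed, non-empty Text nodes that sit directly under the first SMALL element. */
    static List<String> textPieces(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Assumption: the document contains at least one SMALL element.
        Node small = doc.getElementsByTagName("SMALL").item(0);

        List<String> pieces = new ArrayList<>();
        NodeList children = small.getChildNodes(); // all child nodes, not just elements
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                String text = child.getNodeValue().trim();
                if (!text.isEmpty()) {
                    pieces.add(text);
                }
            }
        }
        return pieces;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<TD><SMALL>"
                + "<IMG src=\"...\" /> some_text <BR />"
                + "<IMG src=\"...\" /> some_other_text <BR />"
                + "</SMALL></TD>";
        System.out.println(textPieces(xml)); // prints [some_text, some_other_text]
    }
}
```

The key point is the same in both APIs: a call that returns only child elements will never yield a Text node, so the loop has to run over the full node list.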
Per Norrman wrote:
>Hi,
>
>A sample:
>
>import java.io.StringReader;
>import java.util.Iterator;
>
>// JDOM 1.x package names
>import org.jdom.Document;
>import org.jdom.Element;
>import org.jdom.Text;
>import org.jdom.input.SAXBuilder;
>
>public class MixedContent {
>
>    private static String xml =
>        "<TD> "
>        + "<SMALL>"
>        + "<IMG src = \"...\" /> some_text <BR />"
>        + "<IMG src =\" ...\" /> some_other_text <BR /> "
>        + "</SMALL>"
>        + "</TD> ";
>
>    public static void main(String[] args) throws Exception {
>        // Parse the sample markup into a JDOM document.
>        Document doc = new SAXBuilder().build(new StringReader(xml));
>        Element small = doc.getRootElement().getChild("SMALL");
>
>        // getContent() returns every child node, Text and Element alike.
>        for (Iterator i = small.getContent().iterator(); i.hasNext();) {
>            Object node = i.next();
>            if (node instanceof Text) {
>                System.out.println("[" + ((Text)node).getText() + "]");
>            } else if (node instanceof Element) {
>                System.out.println("<" + ((Element)node).getName() + ">");
>            }
>        }
>    }
>}
>
>
>
>>-----Original Message-----
>>From: jdom-interest-admin at jdom.org
>>[mailto:jdom-interest-admin at jdom.org] On Behalf Of Jacques-Albert De Blasio
>>Sent: 23 October 2003 08:53
>>To: jdom-interest at jdom.org
>>Subject: Re: SV: [jdom-interest] JDOM and text outside tags
>>
>>
>>Thank you very much for your answers (including Stein Erik Berget).
>>
>>As you said, I have a concatenated string which looks like
>>"some_textsome_other_text" when I use getText() on the SMALL
>>element. I may not have really understood what you mean by "iterate
>>through the content of the SMALL element", because I do not get any
>>Text node if I do so (I first get a list of all children of SMALL
>>and search for a Text node).
>>
>>Do you have any idea?
>>
>>Thanks,
>>
>>Jack
>>
>>Per Norrman wrote:
>>
>>
>>
>>>Hi,
>>>
>>>getText() on the SMALL element should get you
>>>a concatenated string, " some_text some_other_text ".
>>>
>>>If you want the pieces individually, you have to
>>>iterate through the content of the SMALL element and
>>>pick up the value of each Text node in question.
>>>
>>>/pmn
>>>
>>>
>>>
>>>
>>>
>>>>-----Original Message-----
>>>>From: jdom-interest-admin at jdom.org
>>>>[mailto:jdom-interest-admin at jdom.org] On Behalf Of Jacques-Albert De Blasio
>>>>Sent: 22 October 2003 11:06
>>>>To: jdom-interest at jdom.org
>>>>Subject: [jdom-interest] JDOM and text outside tags
>>>>
>>>>
>>>>Hi all,
>>>>
>>>>I have a problem with JDOM and I am sure that one of you JDOM
>>>>gurus could help me out :)
>>>>
>>>>In a program I'm writing, I first fetch HTML pages from the web,
>>>>tidy them with NekoHTML (JTidy was not sufficient as it could not
>>>>parse Japanese HTML pages) and then transform the DOM output by
>>>>NekoHTML into JDOM documents.
>>>>
>>>>
>>>>My problem is the following: in a given page, I have tags such as
>>>>
>>>><TD>
>>>><SMALL>
>>>><IMG src = "..." /> some_text <BR />
>>>><IMG src =" ..." /> some_other_text <BR />
>>>></SMALL>
>>>></TD>
>>>>
>>>>How can I fetch the "some_text" and "some_other_text" ?
>>>>
>>>>Thank you very much for your help,
>>>>
>>>>Jack
>>>>
>>>>_______________________________________________
>>>>To control your jdom-interest membership:
>>>>http://lists.denveronline.net/mailman/options/jdom-interest/youraddr@yourhost.com