[jdom-interest] XML escaping and unescaping
David Wall
d.wall at computer.org
Fri Nov 19 18:24:12 PST 2004
Very cool. I'll give it a try!
David
----- Original Message -----
From: "Jason Hunter" <jhunter at xquery.com>
To: <d.wall at computer.org>
Cc: <jdom-interest at jdom.org>
Sent: Friday, November 19, 2004 4:58 PM
Subject: Re: [jdom-interest] XML escaping and unescaping
> When you call elt.getText() you get the decoded (semantic) form. Think
> of JDOM as representing the XML infoset and the &quot; or CDATA
> representation as just one way to encode the XML data when written as a
> stream of bytes. If you call elt.setText("This \"is\" a test") the
> outputter will write what you have below.
>
> In other words, it's not part of standard class libs since it's almost
> never needed by normal programmers. JDOM via the parsers handles the
> input and JDOM via XMLOutputter handles the output. You just deal with
> plain old strings and you don't mind which chars are special and which
> aren't.
>
> -jh-
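
To make the point above concrete, here is a minimal sketch of the same round trip. It is an illustration only: it uses the JDK's built-in DOM parser and Transformer rather than JDOM so that it runs without extra jars, and the class and method names (InfosetRoundTrip, fieldText, serializeWithText) are made up for this example. The behaviour it shows is the one described above: the parser hands back the decoded text, and the serializer does the escaping on the way out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; JDK DOM is used here in place of JDOM.
class InfosetRoundTrip {

    // Parse the XML and return the text of the first <field> element.
    // The parser resolves entity references and CDATA sections itself.
    static String fieldText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("field").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Build <field>text</field> from a plain string and serialize it.
    // The caller never escapes anything; the serializer does.
    static String serializeWithText(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element field = doc.createElement("field");
            field.setTextContent(text);
            doc.appendChild(field);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Entity form and CDATA form both come back as plain strings.
        System.out.println(fieldText(
                "<data><field>This &quot;is&quot; a test.</field></data>"));
        System.out.println(fieldText(
                "<data><field><![CDATA[4+3<4+4]]></field></data>"));
        // Going out, the '<' is escaped by the serializer.
        System.out.println(serializeWithText("4+3<4+4"));
    }
}
```

With JDOM the equivalent calls would be elt.getText() on the way in and XMLOutputter on the way out, as described above.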
>
> d.wall at computer.org wrote:
>
> > Thanks. I'll take a look at your escapers and compare. It's a bit
> > amazing that such functionality isn't just part of the standard class
> > libraries by now.
> >
> > As for coming back in, an XML parser won't decode a string for you, will
> > it? I mean, if my XML looks like:
> >
> > <data>
> > <field>This &quot;is&quot; a test.</field>
> > </data>
> >
> > I would expect that getting the data->field text value would return:
> > This &quot;is&quot; a test.
> >
> > Are you saying some XML parsers will return instead:
> > This "is" a test.
> >
> > My impression is that such an encoded element would return the String
> > still encoded.
> >
> > David
> >
> >
> > Jason Hunter wrote:
> >
> >> XMLOutputter has escapeElementEntities() and escapeAttributeEntities()
> >> that do what you want and have a pluggable EscapeStrategy to handle
> >> characters outside the selected output encoding. We don't have code
> >> to do the reverse as we rely on XML parsers for that.
> >>
> >> -jh-
> >>
> >> d.wall at computer.org wrote:
> >>
> >>> Does JDOM come with any utility routines that will take a String and
> >>> make it XML safe? And also a routine that takes an XML safe encoding
> >>> and converts it back to a regular String?
> >>>
> >>> i.e.
> >>>
> >>> String -> XML Safe string -> String
> >>>
> >>> "This" -> "This" -> "This" (no change needed)
> >>> "4+3<4+4" -> "4+3&lt;4+4" -> "4+3<4+4"
> >>>
> >>> I only ask because I have some basic routines that do this, but they
> >>> only map the following:
> >>>
> >>> > &gt;
> >>> < &lt;
> >>> & &amp;
> >>> ' &apos;
> >>> " &quot;
> >>>
> >>> It currently doesn't deal with escaped character codes like &#39;. It
> >>> seems that putting data into XML and getting it back from XML is so
> >>> common that there must be a general routine to do this rather than
> >>> having to rely on my own implementation.
> >>>
> >>> Thanks,
> >>> David
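
For reference, the kind of hand-rolled routine described above might look like the sketch below. It is a hypothetical helper (the XmlText class and its methods are not part of JDOM or the JDK) and it makes some assumptions: it handles only the five predefined entities plus decimal and hexadecimal character references, leaves unknown named entities untouched, and does not check for characters that are illegal in XML. Decoding is done in a single pass so that "&amp;lt;" becomes "&lt;" and is not decoded twice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a JDOM API.
final class XmlText {

    // Matches &name; as well as &#NNN; and &#xHHH; references.
    private static final Pattern REFERENCE =
            Pattern.compile("&(#x[0-9A-Fa-f]+|#[0-9]+|[A-Za-z]+);");

    // String -> XML-safe string, using the five predefined entities.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '\'': out.append("&apos;"); break;
                case '"':  out.append("&quot;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // XML-safe string -> String, in one left-to-right pass.
    static String unescape(String s) {
        Matcher m = REFERENCE.matcher(s);
        StringBuilder out = new StringBuilder(s.length());
        int last = 0;
        while (m.find()) {
            out.append(s, last, m.start());
            out.append(decode(m.group(1), m.group()));
            last = m.end();
        }
        out.append(s, last, s.length());
        return out.toString();
    }

    // Resolve one reference; return the original text if it is unknown.
    private static String decode(String name, String original) {
        switch (name) {
            case "amp":  return "&";
            case "lt":   return "<";
            case "gt":   return ">";
            case "apos": return "'";
            case "quot": return "\"";
            default:     break;
        }
        if (name.charAt(0) != '#') {
            return original; // unknown named entity, left as written
        }
        try {
            int codePoint = name.charAt(1) == 'x'
                    ? Integer.parseInt(name.substring(2), 16)
                    : Integer.parseInt(name.substring(1));
            return new String(Character.toChars(codePoint));
        } catch (IllegalArgumentException e) {
            return original; // out-of-range or malformed number
        }
    }

    public static void main(String[] args) {
        String safe = escape("4+3<4+4");
        System.out.println(safe);
        System.out.println(unescape(safe));
    }
}
```

As the replies in this thread point out, code like this is rarely needed when the data goes through a parser and an outputter, since both already do the conversion.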
> >>>
> >>> _______________________________________________
> >>> To control your jdom-interest membership:
> >>>
> >>> http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com
> >>>
> >>
> >