[jdom-interest] Outputting escaped entities in element text

David Moles david.moles at vykor.com
Wed Oct 2 12:45:00 PDT 2002


I'm using JDOM to generate an XSL-FO tree for conversion to PDF,
and I'd like to include some exotic characters (specifically
em dash, \u2014, and multiplication sign, \u00d7) in the text that
I'm outputting. The Apache FOP processor requires that these be
given as escaped entities (&#x2014;, &#xD7;); it then maps them
to the appropriate characters in the font that it's using.

With JDOM, though, I can't find a way to get those into the text.

If I just say:

  Element foo = new Element("inline", namespace);
  foo.setText("&#x2014;");

then, naturally, I get:

  <fo:inline>&amp;#x2014;</fo:inline>

in my XML, and in my output PDF the &amp; gets turned back into
a &, as you'd expect, and I get "&#x2014;" instead of character
\u2014.

I tried creating EntityRef objects and using those, but that of
course failed because "#x2014" is an invalid entity name, having 
"#" in it.

If I use the Java character literal instead:

  foo.setText("\u2014");

then in my XML I get

  <fo:inline>[Unicode character 2014]</fo:inline>

-- the character's not escaped, it's just encoded in UTF-8, as
you'd expect it to be. But then FOP doesn't know it needs to be
remapped, it looks for the character in the font it's using and
doesn't find it, and so in my PDF I get the "#" for "garbage
character".

(I know I'm probably looking at a FOP bug here, too -- I don't
know why they'd treat a character differently from a character
reference. But they do, and I need to work around that.)

So, is there any way to get JDOM to output these as escaped
character references, or am I going to have to do some post-
processing hack? :)
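
For reference, the post-processing hack I have in mind would look
roughly like the following. It is only a minimal sketch, assuming it
is run over the serialized XML string after JDOM has written it, and
that every character above 0x7F should become a hexadecimal character
reference. The names CharRefEscaper and escapeNonAscii are made up
for illustration; they are not part of JDOM or FOP.

```java
class CharRefEscaper {

    // Rewrites every code point above 0x7F as a hexadecimal character
    // reference (for example U+2014 becomes "&#x2014;"). ASCII text,
    // including existing markup, is copied through unchanged.
    static String escapeNonAscii(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        for (int i = 0; i < xml.length(); ) {
            int cp = xml.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.append((char) cp);
            }
            // Advance by one or two chars so surrogate pairs stay together.
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

The idea would be to serialize the document to a String first, pass
it through escapeNonAscii, and hand the result to FOP. Since the
result is pure ASCII, it should not matter which ASCII-compatible
encoding the XML declaration names.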

Any help appreciated,

David




