[jdom-interest] Outputting escaped entities in element text
Elliotte Rusty Harold
elharo at metalab.unc.edu
Thu Oct 3 06:24:10 PDT 2002
At 12:45 PM -0700 10/2/02, David Moles wrote:
>I'm using JDOM to generate an XSL:FO tree for conversion to PDF,
>and I'd like to include some exotic characters (specifically
>em dash, \u2014, and multiplication sign, \u00d7) in the text that
>I'm outputting. The Apache FOP processor requires that these be
>given as escaped entities (&#x2014;, &#xD7;); it then maps them
>to the appropriate characters in the font that it's using.
Your message indicates that you have some fundamental
misunderstandings about how XML (and FOP) work. If those are
corrected, the solution should become apparent.
1. &#x2014; etc. are not escaped entities. They are character references.
2. No conformant processor (including FOP) cares whether or not you
use the character references or the actual characters, provided that
they are representable in the chosen character encoding.
3. If they are not representable in the chosen character encoding
(e.g. Latin-1) then you need to use a character reference instead.
4. Java strings are always Unicode (UTF-16 internally), which can
represent such characters. Again, though, there's a non-XML escaping
mechanism using \u in the event your Java source file isn't saved in
an encoding that can hold those characters directly.
5. The XMLOutputter should be able to figure out which characters it
can output directly and which it must escape. You do not need to
concern yourself with this.
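Points 1 and 2 can be checked with nothing but the JAXP parser that
ships with the JDK. The following is a minimal sketch, not JDOM code:
the class name CharRefDemo and the helper rootText are made up for
illustration. It parses the same element once with a character
reference and once with the literal character, and compares what the
parser reports.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names here are illustrative only.
class CharRefDemo {
    // Parse a small XML string and return the text content of its root element.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Same em dash, written as a character reference and as the character itself.
        String viaReference = rootText("<inline>&#x2014;</inline>");
        String viaLiteral = rootText("<inline>\u2014</inline>");

        // Both forms reach the application as the single character U+2014.
        System.out.println(viaReference.equals(viaLiteral));
        System.out.println(Integer.toHexString(viaReference.codePointAt(0)));
    }
}
```

After parsing, the application has no way to tell which form was in
the document, which is why a conformant consumer should treat them
identically.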
What you need to do is this:
Element foo = new Element("inline", namespace);
foo.setText("\u2014");
>then in my XML I get
>
> <fo:inline>[Unicode character 2014]</fo:inline>
That's what you should get.
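Whether a raw character or a character reference belongs in the
output depends only on the output encoding. Here is a rough sketch of
that decision using java.nio.charset; the class EncodingCheck and its
two methods are hypothetical names, and this is not how XMLOutputter
is implemented. It also does not escape &, < or >, so it is not a
complete serializer.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; illustrates the encoding check, nothing more.
class EncodingCheck {
    // True if the charset can represent every character in the text.
    static boolean representable(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Replace each code point the charset cannot encode with a
    // hexadecimal character reference; leave everything else alone.
    static String escapeFor(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            String s = new String(Character.toChars(cp));
            if (encoder.canEncode(s)) {
                out.append(s);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "a\u2014b\u00d7";
        // UTF-8 can hold both characters, so nothing needs escaping.
        System.out.println(representable(text, StandardCharsets.UTF_8));
        // Latin-1 has the multiplication sign but not the em dash.
        System.out.println(representable("\u2014", StandardCharsets.ISO_8859_1));
        System.out.println(escapeFor(text, StandardCharsets.US_ASCII));
    }
}
```

With UTF-8 output nothing is escaped, which matches the unescaped
em dash in the result quoted above. Only a narrower encoding such as
US-ASCII or Latin-1 forces a reference.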
>-- the character's not escaped, it's just encoded in UTF-8, as
>you'd expect it to be. But then FOP doesn't know it needs to be
>remapped, it looks for the character in the font it's using and
>doesn't find it, and so in my PDF I get the "#" for "garbage
>character".
If this is true, and I still doubt it, then FOP is broken and needs
to be fixed. This is not a JDOM problem.
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML in a Nutshell, 2nd Edition (O'Reilly, 2002) |
| http://www.cafeconleche.org/books/xian2/ |
| http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.cafeconleche.org/ |
+----------------------------------+---------------------------------+