<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.0.8">
</HEAD>
<BODY>
Hi<BR>
<BR>
I need to write some sort of entity-handling routine that converts all non-US-ASCII characters<BR>
to their SGML entity references. There was some discussion on this subject a while back, but I am not sure<BR>
what came of it. All of the documents I need to produce have to comply with the following restriction:<BR>
<A HREF="http://www.ncbi.nlm.nih.gov/entrez/query/static/entities.html">http://www.ncbi.nlm.nih.gov/entrez/query/static/entities.html</A><BR>
<BR>
Which would be the better approach?<BR>
<BR>
a) create an EntityRef for each of these characters and then let the JDOM XMLOutputter do the conversion (I assume<BR>
it does)<BR>
<BR>
b) write my own String conversion utility that converts characters outside the 7-bit US-ASCII range (code points<BR>
above 127) to their entity references.<BR>
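<BR>
For (b), I have in mind something roughly like the sketch below. It is only an illustration: the class and method<BR>
names are made up, the three entity names are a tiny sample (the real table would have to be built from the list<BR>
linked above), and anything without a name falls back to a decimal character reference. It does not touch the<BR>
markup characters themselves, so it would have to run on text that has already been serialised.<BR>
<PRE>
```java
// Hypothetical helper, not part of JDOM: replaces every code point above 127
// with a named entity when one is known, otherwise with a decimal reference.
final class AsciiEntityEscaper {

    private AsciiEntityEscaper() {
    }

    // Illustrative subset only; a real table would cover the full entity list.
    static String entityName(int codePoint) {
        switch (codePoint) {
            case 0x00E9: return "eacute";
            case 0x00FC: return "uuml";
            case 0x00DF: return "szlig";
            default: return null;
        }
    }

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // Work on code points so surrogate pairs become a single reference.
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp > 127) {
                String name = entityName(cp);
                if (name != null) {
                    out.append('&').append(name).append(';');
                } else {
                    out.append("&#").append(cp).append(';');
                }
            } else {
                out.append((char) cp);
            }
        }
        return out.toString();
    }
}
```
</PRE>
Calling AsciiEntityEscaper.escape(...) on a String that is already pure ASCII should return it unchanged.<BR>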
<BR>
Actually, what I would really like to know is whether JDOM would convert a Unicode String to an XML String<BR>
that is valid for a particular encoding (e.g. US-ASCII) simply by registering an EntityRef for each of<BR>
the characters outside the range of the given encoding.<BR>
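<BR>
To make "outside the range for the given encoding" concrete, here is a rough sketch of the check I mean, using only<BR>
the standard java.nio.charset classes rather than anything from JDOM. The class and method names are hypothetical,<BR>
and it always emits decimal character references instead of named entities.<BR>
<PRE>
```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper: keeps whatever the target encoding can represent and
// replaces everything else with a decimal character reference.
final class EncodingAwareEscaper {

    private EncodingAwareEscaper() {
    }

    static String escapeFor(String text, String encodingName) {
        CharsetEncoder encoder = Charset.forName(encodingName).newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            String piece = text.substring(i, i + len);
            if (encoder.canEncode(piece)) {
                out.append(piece);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += len;
        }
        return out.toString();
    }
}
```
</PRE>
With "US-ASCII" as the encoding name, every code point above 127 gets replaced; with "ISO-8859-1", the Latin-1<BR>
characters pass through untouched and only the rest are replaced.<BR>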
<BR>
Best regards<BR>
<BR>
Benjamin<BR>
<TABLE CELLSPACING="0" CELLPADDING="0" WIDTH="100%">
<TR>
<TD>
<PRE>--
benjamin kopic
m: +44 (0)780 154 7643
t: +44 (0)20 7794 3090
e: benjamin.kopic@panContext.com
w: http://www.panContext.com/</PRE>
</TD>
</TR>
</TABLE>
</BODY>
</HTML>