[jdom-interest] Possible inconsistency in Verifier.isXMLCharacter()

Jason Hunter jhunter at acm.org
Thu Apr 10 15:05:15 PDT 2003


I'm not sure why it's there, but at least it's not hurting anything.
The highest check is reached only if none of the lower checks match,
and in Java today one of them always will.
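
For illustration only, here is a minimal sketch of what an int-based
check might look like, so the 0x10000 through 0x10FFFF range becomes
reachable. This is not JDOM code: the class XmlCharSketch and the
methods isXMLCodePoint() and toCodePoint() are hypothetical names, and
the char method just restates the quoted one for comparison.

```java
// Hypothetical sketch, not part of JDOM.
final class XmlCharSketch {

    // Restates the quoted char-based method; a char never exceeds 0xFFFF,
    // so the supplementary-range test is left out here.
    static boolean isXMLCharacter(char c) {
        if (c == '\n' || c == '\r' || c == '\t') return true;
        if (c < 0x20) return false;
        if (c <= 0xD7FF) return true;
        if (c < 0xE000) return false;
        return c <= 0xFFFD;
    }

    // Assumed int variant: same ranges, but the argument can hold a
    // full code point, so the last test is reachable.
    static boolean isXMLCodePoint(int cp) {
        if (cp == '\n' || cp == '\r' || cp == '\t') return true;
        if (cp < 0x20) return false;
        if (cp <= 0xD7FF) return true;
        if (cp < 0xE000) return false;
        if (cp <= 0xFFFD) return true;
        if (cp < 0x10000) return false;
        return cp <= 0x10FFFF;
    }

    // Combines a UTF-16 surrogate pair into a single code point.
    static int toCodePoint(char high, char low) {
        return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
    }

    public static void main(String[] args) {
        // 0xD800 followed by 0xDC00 is the UTF-16 encoding of U+10000.
        char high = (char) 0xD800;
        char low = (char) 0xDC00;
        System.out.println(isXMLCharacter(high));                   // false
        System.out.println(isXMLCodePoint(toCodePoint(high, low))); // true
    }
}
```

Each half of a surrogate pair is rejected on its own by the char
version, while the combined value passes the int version.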

-jh-

> Rolf Lear wrote:
> 
> This is the code from isXMLCharacter().
> 
>     public static boolean isXMLCharacter(char c) {
> 
>         if (c == '\n') return true;
>         if (c == '\r') return true;
>         if (c == '\t') return true;
> 
>         if (c < 0x20) return false;  if (c <= 0xD7FF) return true;
>         if (c < 0xE000) return false;  if (c <= 0xFFFD) return true;
>         if (c < 0x10000) return false;  if (c <= 0x10FFFF) return true;
> 
>         return false;
>     }
> 
> Now, according to Java spec, chars have value 0x0000 through 0xffff
> (http://java.sun.com/docs/books/jls/second_edition/html/typesValues.doc.html#9151)
> 
> Thus, the line:
> 
> if (c < 0x10000) return false;  if (c <= 0x10FFFF) return true;
> 
> is redundant until there is a Java with chars wider than 2 bytes.
> 
> So, whatever characters are meant to be in the range 0x10000 through
> 0x10FFFF, they will never validate.
> 
> Rolf


