[jdom-interest] jdom

Phill_Perryman at Mitel.COM Phill_Perryman at Mitel.COM
Thu Apr 1 00:40:24 PST 2004

If you use WordPad and save as a plain text document (not Unicode text), then
you don't get the 3 bytes added. A quick look with HexEdit shows what is
actually in the file.

IS Dept, Software Engineer.
phill_perryman at mitel.com
Tel: +44 1291 436023

Laurent Bihanic <laurent.bihanic at atosorigin.com>
Sent by: jdom-interest-admin at jdom.org
01/04/2004 08:45

        To:     "M.Novosselov" <novosselovm at 3web.net>
        cc:     jdom-interest at jdom.org
        Subject:        Re: [jdom-interest] jdom


M.Novosselov wrote:
> I got a few surprises while testing my program. I wrote test XML file in 
> notepad and saved it using UTF-8 encoding. To my surprise I got a 
> parsing exception thrown by SAXBuilder:
> root-element is missing. When I saved same file using other encodings - 
> everything worked fine (btw file with UTF-8 encoding had size 3 bytes 
> bigger than others).

When asked to save in a Unicode format (UTF-8 or UTF-16), Notepad adds a 
3-byte (UTF-8) or 2-byte (UTF-16) "Byte Order Mark" (BOM) header to the file 
data (for more information: http://www.unicode.org/faq/utf_bom.html).
Some parsers handle this header correctly (Xerces), some don't (Crimson).
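
If switching parsers or editors is not an option, one possible workaround is 
to strip the UTF-8 BOM (bytes EF BB BF) yourself before the parser sees the 
stream. The following is only a minimal sketch using plain JDK classes; the 
class name BomSkipper and the method skipUtf8Bom are made up for this 
example and are not part of JDOM. It handles the UTF-8 case only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, not part of JDOM: skips a leading UTF-8 BOM if present.
public class BomSkipper {

    /**
     * Wraps the given stream and consumes the 3-byte UTF-8 BOM (EF BB BF)
     * when the stream starts with it. Any other leading bytes are pushed
     * back, so the caller sees the data unchanged.
     */
    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];

        // Read up to 3 bytes; a single read() call may return fewer.
        int n = 0;
        while (n < 3) {
            int r = pin.read(head, n, 3 - n);
            if (r < 0) {
                break; // end of stream reached before 3 bytes
            }
            n += r;
        }

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // No BOM: put back whatever was read so nothing is lost.
        if (!hasBom && n > 0) {
            pin.unread(head, 0, n);
        }
        return pin;
    }
}
```

The returned stream could then be handed to SAXBuilder.build(InputStream) 
in place of the raw file stream, for example 
builder.build(BomSkipper.skipUtf8Bom(new FileInputStream(file))).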
