[jdom-interest] Detecting if file is XML
Mattias Jiderhamn
mj-lists at expertsystems.se
Wed Jul 14 06:00:02 PDT 2010
Thanks all for your suggestions. For some reason, the responses in this
thread didn't reach me until a month or so later.
In my particular case the file extension could not be trusted (or
rather, would never be .xml). At the point of file type detection I
wasn't interested in whether the XML was valid, but I did need to take
encodings and white space into account.
Anyway, I just shared the solution on ForkCan. See
http://www.forkcan.com/viewcode/196/Detect-if-file-contains-XML
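The code behind that link is not reproduced here. For readers who only want the idea, below is a minimal sketch of the kind of check described above: look at the first bytes of the file, guess the encoding from a byte order mark (or from the zero-byte pattern of UTF-16), skip XML white space, and see whether the first real character is '<'. The names XmlSniffer and looksLikeXml and the 512-byte window are assumptions made for illustration, not taken from the ForkCan snippet.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the code published on ForkCan.
final class XmlSniffer {

    /** Reads up to 512 bytes from the stream and sniffs them. */
    static boolean looksLikeXml(InputStream in) throws IOException {
        byte[] buf = new byte[512]; // window size is an arbitrary assumption
        int n = 0;
        int r;
        while (n < buf.length && (r = in.read(buf, n, buf.length - n)) > 0) {
            n += r;
        }
        return looksLikeXml(buf, n);
    }

    /** True if the first non-white-space character in buf[0..n) is '<'. */
    static boolean looksLikeXml(byte[] buf, int n) {
        Charset cs = StandardCharsets.UTF_8; // default when nothing else matches
        int skip = 0;                        // number of BOM bytes to drop
        if (n >= 3 && (buf[0] & 0xFF) == 0xEF && (buf[1] & 0xFF) == 0xBB
                && (buf[2] & 0xFF) == 0xBF) {
            skip = 3;                        // UTF-8 BOM
        } else if (n >= 2 && (buf[0] & 0xFF) == 0xFE && (buf[1] & 0xFF) == 0xFF) {
            cs = StandardCharsets.UTF_16BE;  // UTF-16 big-endian BOM
            skip = 2;
        } else if (n >= 2 && (buf[0] & 0xFF) == 0xFF && (buf[1] & 0xFF) == 0xFE) {
            cs = StandardCharsets.UTF_16LE;  // UTF-16 little-endian BOM
            skip = 2;
        } else if (n >= 2 && buf[0] == 0 && buf[1] != 0) {
            cs = StandardCharsets.UTF_16BE;  // no BOM, heuristic: 00 xx
        } else if (n >= 2 && buf[0] != 0 && buf[1] == 0) {
            cs = StandardCharsets.UTF_16LE;  // no BOM, heuristic: xx 00
        }
        String head = new String(buf, skip, n - skip, cs);
        for (int i = 0; i < head.length(); i++) {
            char c = head.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                continue;                    // XML white space, keep looking
            }
            return c == '<';
        }
        return false;                        // empty or white space only
    }
}
```

This only answers "does it start like XML"; it says nothing about whether the rest of the file is well-formed, and it does not handle UTF-32 or EBCDIC.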
</Mattias>
----- Original Message -----
Subject: Re: [jdom-interest] Detecting if file is XML
Date: Mon, 07 Jun 2010 07:39:41 -0400
From: Rolf <jdom at tuis.net>
Mattias Jiderhamn wrote:
> This is semi-off topic, but what is the best way - performance wise -
> to determine if a file is an XML file or not, from Java?
>
I have done this recently...
I can't share the code, but I created a 'simple' SAX Handler that throws
a custom exception once it has seen a configured amount ('X' amount) of
'valid' XML. The 'calling code' catches any exceptions from the parse.
If the parse completes with no exception, the file is small and valid.
If the special 'custom' exception is thrown, the file is larger and the
first 'X' amount of it is valid XML. Any other exception means the file
is not XML.
The SAX Handler does nothing with the content except count
'startElement()' and 'characters()' method calls.
Makes for a pretty fast, efficient, validating handler, without any
unneeded 'overhead'.
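Since the original code was not shared, here is a minimal sketch of how such a handler might look using the JDK's built-in SAX parser. The names XmlPrefixCheck, CountingHandler, EnoughParsedException and isXml are hypothetical, and measuring the 'X' amount as a count of startElement() and characters() calls is an assumption based on the description above.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical reconstruction of the idea, not the code from this thread.
final class XmlPrefixCheck {

    /** Marker exception: enough events were parsed without an error. */
    static final class EnoughParsedException extends SAXException {
        EnoughParsedException() {
            super("event limit reached");
        }
    }

    /** Ignores all content and only counts element and text callbacks. */
    static final class CountingHandler extends DefaultHandler {
        private final int limit;
        private int events;

        CountingHandler(int limit) {
            this.limit = limit;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            count();
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            count();
        }

        private void count() throws SAXException {
            if (++events >= limit) {
                throw new EnoughParsedException(); // stop parsing early
            }
        }
    }

    /** True if the whole stream, or its first 'limit' events, parse cleanly. */
    static boolean isXml(InputStream in, int limit) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.newSAXParser().parse(in, new CountingHandler(limit));
            return true;   // small file, parsed to the end
        } catch (EnoughParsedException e) {
            return true;   // larger file, prefix parsed without error
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;  // parse error or unreadable input
        }
    }
}
```

As written, this checks well-formedness rather than validity against a DTD or schema, and a file with a DOCTYPE may cause the parser to fetch an external DTD unless the factory is configured otherwise.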
Sure, there may be a better way, but this worked for me...
Rolf