[jdom-interest] malformed URL exception in saxbuilder.build due to unreachable URL

Rolf Lear jdom at tuis.net
Thu Feb 9 15:59:20 PST 2012


Hi Cliff.

I just ran a quick test ... :

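import org.jdom2.input.SAXBuilder;

public class TestParse {
	public static void main(String[] args) throws Exception {
		String xml = "<?xml version=\"1.0\" ?>\n<xmlchars />";
		SAXBuilder sb = new SAXBuilder();
		// build(String) treats its argument as a URL (system ID), not as XML text
		sb.build(xml);
	}
}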

and got:

Exception in thread "main" java.net.MalformedURLException: no protocol: 
<?xml version="1.0" ?>
<xmlchars />
	at java.net.URL.<init>(URL.java:567)
	at java.net.URL.<init>(URL.java:464)
	at java.net.URL.<init>(URL.java:413)
	at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
	at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
	at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:217)
	at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:327)
	at org.jdom2.input.SAXBuilder.build(SAXBuilder.java:1286)
	at net.tuis.debug.TestParse.main(TestParse.java:12)
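
The top frames of that trace are plain java.net.URL constructors, so the same message can be reproduced without JDOM or Xerces at all. Here is a minimal sketch, assuming only the JDK; the class and method names (NoProtocolDemo, tryAsUrl) are made up for illustration and are not part of any library:

```java
import java.net.MalformedURLException;
import java.net.URL;

class NoProtocolDemo {
    // Hypothetical helper: returns null if the string is accepted as a URL,
    // otherwise the message of the MalformedURLException.
    @SuppressWarnings("deprecation") // URL(String) is deprecated in recent JDKs but still present
    static String tryAsUrl(String candidate) {
        try {
            new URL(candidate); // only parses the string, no network access
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // XML text handed over where a URL is expected
        System.out.println(tryAsUrl("<?xml version=\"1.0\" ?>\n<xmlchars />"));
        // A well-formed URL string is accepted whether or not the host is reachable
        System.out.println(tryAsUrl("http://example.com/doc.xml"));
    }
}
```

In other words, the exception is raised while the string is being interpreted as an address, before any XML (and any xmlns declaration in it) has been read.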



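To show that parsing from a Reader sidesteps the problem even when the xmlns URI points nowhere, here is a rough sketch using the JDK's own JAXP DOM parser rather than JDOM, so it runs without extra jars. With JDOM the equivalent is the saxBuilder.build(reader) call quoted below. The class name, method name and the namespace URI are placeholders I invented for the example:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class StringParseSketch {
    // Hypothetical helper: parses XML held in a String and returns
    // { root local name, root namespace URI }.
    static String[] rootNameAndNamespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Wrapping the text in a Reader tells the parser "this is the content",
            // rather than "this is the address of the content".
            Element root = builder
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return new String[] { root.getLocalName(), root.getNamespaceURI() };
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML string", e);
        }
    }

    public static void main(String[] args) {
        // The namespace URI below is a made-up, unreachable address.
        String xml = "<?xml version=\"1.0\" ?>\n"
                + "<msg xmlns=\"http://unreachable.invalid/ns\"><a/></msg>";
        String[] result = rootNameAndNamespace(xml);
        System.out.println(result[0] + " in namespace " + result[1]);
    }
}
```

The namespace URI is carried along as an identifier and is not fetched during the parse, so an unreachable xmlns address should not by itself cause a failure.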
On 09/02/2012 6:44 PM, Rolf Lear wrote:
> Hi Cliff.
>
> I think the problem is that the content of the column 'xml_contents' of
> the table 'xml_table' is actual XML data.... not a file name/URI.
>
>
> I think what you want to do is:
>
> String xmlcontent = rs.getString(2);
> StringReader reader = new StringReader(xmlcontent);
> xmlDoc = saxBuilder.build(reader);
>
> See http://jdom.org/docs/faq.html#a0210
>
> Rolf
>
> On 09/02/2012 6:26 PM, cliff palmer wrote:
>> code below:
> .....
>> rs = null;
>> try {
>> String xmlQuery = "select xml_id, xml_contents from xml_table";
>> rs = stmt.executeQuery(xmlQuery);
>> } catch (SQLException e) {
>> logLogger.debug("Exception executing query");
>> e.printStackTrace();
>> }
>> saxBuilder = new SAXBuilder();
>> xmlFmt = Format.getPrettyFormat().setEncoding("UTF-8");
>> xmlOutputter = new XMLOutputter(xmlFmt);
>> structureString = new ArrayList<String>();
>> tagMap = new HashMap<String, Integer>();
> ...
>> while(rs.next()) {
>> xmlDoc = null;
>> rowsRead ++;
>> msgID = rs.getString(1);
>> try {
>> xmlDoc = saxBuilder.build(rs.getString(2));
>> } catch (JDOMException e) {
>> e.printStackTrace();
>> } catch (IOException e) {
>> goodXML = false;
>> e.printStackTrace();
>> }
>> try {
>> rootElement = xmlDoc.getRootElement();
>> doProcess(rootElement);
>> } catch (Exception e) {
>> e.printStackTrace();
>> }
>> }
>
>
>
>
>>
>> On 2/9/12, cliff palmer<palmercliff at gmail.com> wrote:
>>> Hi Rolf
>>> I will post the code later, (sorry late for a meeting) but to answer
>>> your questions:
>>> - this error occurs when there is an "xmlns" declaration. Since this
>>> is the first instance of an "xmlns" declaration I've encountered with
>>> JDOM and all of the URLs in the "xmlns" declaration that I have found
>>> point to the same bad address, I don't know if the problem is related
>>> to lookup of the URL or just the presence of an "xmlns" declaration.
>>> - the problem is predictable and occurs for each xml document that
>>> uses this bad URL in an "xmlns" declaration.
>>> - I've used the code (I will post it, I promise) to parse over 3
>>> million xml documents, passing a string containing the xml document
>>> (not a URL). The value I pass to saxbuilder.build is the returned
>>> string from the JDBC call ResultSet.getString using a column number
>>> parameter. I haven't been altering or converting the string returned
>>> from JDBC.
>>>
>>> Thanks Rolf and I will post the code as soon as the suits are done
>>> with me.
>>>
>>> Cliff
>>>
>>> On Thu, Feb 9, 2012 at 5:40 PM, Rolf Lear<jdom at tuis.net> wrote:
>>>> Hi Cliff.
>>>>
>>>> I think there have been some good pointers already, but just to make
>>>> things crystal clear... can you perhaps post the relevant code snippet
>>>> you are using to parse the document, and perhaps the first few lines
>>>> of the actual XML too.
>>>>
>>>> Also, does this problem happen with *all* xml documents (the first
>>>> one), or with just some of them?
>>>>
>>>> My guess is that Oliver has the right idea with parsing the wrong
>>>> string.... remember that the SAXBuilder.build(String) method expects
>>>> the String to be a URL, not the actual XML content..... Your stack
>>>> trace indicates you are calling this method...
>>>>
>>>> See the code here:
>>>> https://github.com/hunterhacker/jdom/blob/jdom-1.x/core/src/java/org/jdom/input/SAXBuilder.java#L986
>>>>
>>>>
>>>> Anyway, seeing your code would help....
>>>>
>>>> Rolf
>>>>
>>>>
>>>> On 09/02/2012 3:54 PM, cliff palmer wrote:
>>>>> I'm reading through several hundred thousand existing XML documents
>>>>> building counts of XML tags and have encountered a
>>>>> java.net.MalformedURLException raised by saxBuilder.build because the
>>>>> xmlns points to a URL that cannot be reached.
>>>>> I am using JDOM 1.1.2.
>>>>> Is there a call or parameter setting that will cause saxBuilder to
>>>>> ignore namespaces when parsing?
>>>>> Thanks!
>>>>> Cliff
>>>>> _______________________________________________
>>>>> To control your jdom-interest membership:
>>>>> http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com
>>>>>
>>>>>
>


