[jdom-interest] malformed URL exception in saxbuilder.build due to unreachable URL

Rolf Lear jdom at tuis.net
Thu Feb 9 15:44:10 PST 2012


Hi Cliff.

I think the problem is that the column 'xml_contents' of the table
'xml_table' holds the actual XML data, not a file name or URI, while
SAXBuilder.build(String) treats its argument as a URI.


I think what you want to do is:

     String xmlcontent = rs.getString(2);
     StringReader reader = new StringReader(xmlcontent);
     xmlDoc = saxBuilder.build(reader);

See http://jdom.org/docs/faq.html#a0210
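
To make the String-versus-Reader distinction concrete, here is a minimal sketch that runs without the JDOM jar. It assumes the JDK's own DocumentBuilder, which has the same split (a String argument is a location, a Reader carries content); it is not JDOM code. The class name, method names, and the sample namespace URI are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; uses the JDK parser as a stand-in for JDOM.
public class StringVersusUri {

    /** Parses XML text held in memory by wrapping it in a Reader. */
    public static Document parseContent(DocumentBuilder builder, String xml)
            throws SAXException, IOException {
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Reports whether the parser could open the string as a URI / system ID. */
    public static boolean opensAsUri(DocumentBuilder builder, String text) {
        try {
            // The String overload means "where the document lives",
            // not "what the document contains".
            builder.parse(text);
            return true;
        } catch (IOException | SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Made-up sample document with an xmlns declaration.
        String xml = "<root xmlns=\"http://example.invalid/ns\"><a/></root>";

        Document doc = parseContent(builder, xml);
        System.out.println("Parsed root element: "
                + doc.getDocumentElement().getTagName());
        System.out.println("Opens as a URI: " + opensAsUri(builder, xml));
    }
}
```

With JDOM the corresponding call is the saxBuilder.build(reader) shown above. The namespace URI in the document is never fetched in the Reader case; it is only a name.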

Rolf

On 09/02/2012 6:26 PM, cliff palmer wrote:
> code below:
.....
> 		rs = null;
> 		try {
> 			String xmlQuery = "select xml_id, xml_contents from xml_table";
> 			rs = stmt.executeQuery(xmlQuery);
> 		} catch (SQLException e) {
> 			logLogger.debug("Exception executing query");
> 			e.printStackTrace();
> 		}
> 		saxBuilder = new SAXBuilder();
> 		xmlFmt =  Format.getPrettyFormat().setEncoding("UTF-8");
> 		xmlOutputter = new XMLOutputter(xmlFmt);
> 		structureString = new ArrayList<String>();
> 		tagMap = new HashMap<String, Integer>();
...
> 		while(rs.next()) {
> 			xmlDoc = null;
> 			rowsRead ++;
> 			msgID = rs.getString(1);
> 			try {
> 				xmlDoc = saxBuilder.build(rs.getString(2));
> 			} catch (JDOMException e) {
> 				e.printStackTrace();
> 			} catch (IOException e) {
> 				goodXML = false;
> 				e.printStackTrace();
> 			}
> 			try {
> 				rootElement = xmlDoc.getRootElement();
> 				doProcess(rootElement);
> 			} catch (Exception e) {
> 				e.printStackTrace();
> 			}
> 		}
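
A side note on the loop above: when build() throws, xmlDoc stays null, so the second try block fails with a NullPointerException that is only printed. Below is a rough sketch of one way to structure the loop so that a failed parse skips the row. It assumes the JDK DOM parser in place of JDOM so that it runs standalone, and a plain list of strings in place of the ResultSet. TagCounter, processAll and count are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch: counts element names across many XML strings.
public class TagCounter {
    private final Map<String, Integer> tagMap = new HashMap<>();
    private int rowsRead = 0;
    private int rowsFailed = 0;

    public void processAll(List<String> xmlRows) throws ParserConfigurationException {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        for (String xml : xmlRows) {
            rowsRead++;
            Document doc;
            try {
                // Parse the text itself, not a location named by the text.
                doc = builder.parse(new InputSource(new StringReader(xml)));
            } catch (SAXException | IOException e) {
                rowsFailed++;
                continue; // no document for this row, so skip processing
            }
            count(doc.getDocumentElement());
        }
    }

    // Recursively tallies the tag name of this element and its descendants.
    private void count(Element element) {
        tagMap.merge(element.getTagName(), 1, Integer::sum);
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child instanceof Element) {
                count((Element) child);
            }
        }
    }

    public Map<String, Integer> tagCounts() { return tagMap; }
    public int rowsRead() { return rowsRead; }
    public int rowsFailed() { return rowsFailed; }

    public static void main(String[] args) throws Exception {
        TagCounter counter = new TagCounter();
        counter.processAll(Arrays.asList("<a><b/><b/></a>", "<broken", "<a/>"));
        System.out.println("rows read: " + counter.rowsRead()
                + ", failed: " + counter.rowsFailed());
        System.out.println("tag counts: " + counter.tagCounts());
    }
}
```

The point is only the control flow: parse, skip on failure, process otherwise. The same shape applies with saxBuilder.build(new StringReader(...)) and JDOMException in the catch clause.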




>
> On 2/9/12, cliff palmer<palmercliff at gmail.com>  wrote:
>> Hi Rolf
>> I will post the code later, (sorry late for a meeting) but to answer
>> your questions:
>> - this error occurs when there is an "xmlns" declaration.  Since this
>> is the first instance of an "xmlns" declaration I've encountered with
>> JDOM and all of the URLs in the "xmlns" declaration that I have found
>> point to the same bad address, I don't know if the problem is related
>> to lookup of the URL or just the presence of an "xmlns" declaration.
>> - the problem is predictable and occurs for each xml document that
>> uses this bad URL in an "xmlns" declaration.
>> - I've used the code (I will post it, I promise) to parse over 3
>> million xml documents, passing a string containing the xml document
>> (not a URL).  The value I pass to saxbuilder.build is the returned
>> string from the JDBC call ResultSet.getString using a column number
>> parameter.  I haven't been altering or converting the string returned
>> from JDBC.
>>
>> Thanks Rolf and I will post the code as soon as the suits are done with me.
>>
>> Cliff
>>
>> On Thu, Feb 9, 2012 at 5:40 PM, Rolf Lear<jdom at tuis.net>  wrote:
>>> Hi Cliff.
>>>
>>> I think there have been some good pointers already, but just to make
>>> things crystal clear... can you perhaps post the relevant code snippet
>>> you are using to parse the document, and perhaps the first few lines of
>>> the actual XML too?
>>>
>>> Also, does this problem happen with *all* XML documents (the first one),
>>> or with just some of them?
>>>
>>> My guess is that Oliver has the right idea about parsing the wrong
>>> string... remember that the SAXBuilder.build(String) method expects the
>>> String to be a URL, not the actual XML content. Your stack trace
>>> indicates you are calling this method...
>>>
>>> See the code here:
>>> https://github.com/hunterhacker/jdom/blob/jdom-1.x/core/src/java/org/jdom/input/SAXBuilder.java#L986
>>>
>>> Anyway, seeing your code would help....
>>>
>>> Rolf
>>>
>>>
>>> On 09/02/2012 3:54 PM, cliff palmer wrote:
>>>> I'm reading through several hundred thousand existing XML documents
>>>> building counts of XML tags and have encountered a
>>>> java.net.MalformedURLException raised by saxBuilder.build because the
>>>> xmlns points to a URL that can not be reached.
>>>> I am using JDOM 1.1.2.
>>>> Is there a call or parameter setting that will cause saxBuilder to
>>>> ignore namespaces when parsing?
>>>> Thanks!
>>>> Cliff


