[jdom-interest] Re: setters in Format

Mike Brenner mikeb at mitre.org
Fri May 30 06:32:18 PDT 2003


WHITE SPACE AND EMPTY ELEMENTS AND NULLS

As the authors change the interfaces in preparation
for JDOM-beta10, we still need to clear the haziness
away from some of the basics.

One good thing to do before JDOM-beta10 would be to combine 
the lessons from the many (and sometimes contradictory)
emails on this list having to do with WHITE SPACE and 
EMPTY ELEMENTS into one place.

I think we need sample code using the new interfaces
to show how to do the following.

CLONING (PARSING THEN GENERATING A COPY OF) XML FILES:

	a. How to keep empty elements versus eliminating them.
	FOR EXAMPLE, how to use setExpandEmptyElements, which 
	is completely orthogonal to the new enumeration for
	the 4 kinds of white space management.

	b. How to get rid of some or all white space in the XML
	itself and preserve the data integrity of the CDATA.
	FOR EXAMPLE the 4 enumerations, and:
	setEncoding, setIndent, setNewlines, setTrimAllWhite,
	space="preserve", normalizing versus trimming,
	 Namespace xml = Namespace.getNamespace("xml", 
                "http://www.w3.org/XML/1998/namespace");    
	text.setAttribute("space","preserve",xml);
	text.setAttribute("space","preserve",Namespace.XML_NAMESPACE);
	outputter = new XMLOutputter(" ", true, "ISO-8859-1");
	outputter.setTextTrim(true);
	outputter.preserveDataIntegrityOfCDATA(true);

	c. How to prevent white space from being added.

	d. How to copy the file exactly.

	e. Limitations and their workarounds for using InputStreams 
	versus Readers with varying encodings of Unicode and ASCII.

	f. Limitations and workarounds for using OutputStreams
	versus Writers.

	g. How to do non-English languages. At least one of the
	Asian scripts: Hindi, Chinese, Thai, etc.
	At least one of the right-to-left languages: Arabic, Hebrew,
	Biblical Hebrew, Farsi, Aramaic.
	At least one of the Latin-script alphabets with accents: Polish,
	Vietnamese, etc.
		
	h. Encoding Names with Readers vs. InputStreams.
	
	i. Unicode versus ASCII.

	j. Generating SVG Files (when the interfaces stabilize, I'll
	provide the example, but with GET CHILDREN deprecated, I will
	need to recode it in the non-deprecated way).

	k. Reading an XML file into a tree of HashMaps of HashMaps of ...
	Again, I'll provide an example, after I learn the new JDOM-beta10
	interfaces.
	
	l. How to use NULL elements (whatever that means).

	m. What would be needed to handle the HTML language (closed tags
	and escape characters).

	n. What would be needed to handle the Water language (closed tags,
	escape characters, abbreviations, and XML slang).
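
For items e, h and i, here is a minimal sketch of why the choice between
an InputStream and a Reader matters. It uses only the JDK's DOM parser
rather than JDOM, because the beta10 interfaces are still moving; the
class and method names (EncodingSketch, textViaStream, textViaReader) are
made up for illustration. The assumption being shown: given raw bytes, the
parser works out the encoding itself, but given a Reader it has to trust
whatever charset the Reader was built with.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EncodingSketch {
    // "cafe" with an acute accent on the e, written as an escape so the
    // encoding of this source file does not matter.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><word>caf\u00e9</word>";

    // The document as it would sit on disk: UTF-8 bytes.
    public static byte[] utf8Bytes() {
        return XML.getBytes(StandardCharsets.UTF_8);
    }

    private static DocumentBuilder builder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    // Byte stream: the parser detects the encoding from the bytes
    // and the XML declaration.
    public static String textViaStream(byte[] bytes) {
        try {
            Document doc = builder().parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Character stream: the bytes are decoded by the Reader before the
    // parser sees them, so the caller must pick the right charset.
    public static String textViaReader(byte[] bytes, Charset charset) {
        try {
            Reader reader =
                new InputStreamReader(new ByteArrayInputStream(bytes), charset);
            Document doc = builder().parse(new InputSource(reader));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = utf8Bytes();
        String expected = "caf\u00e9";
        // Stream: encoding detected by the parser.
        System.out.println(textViaStream(bytes).equals(expected));
        // Reader with the matching charset: same text.
        System.out.println(
            textViaReader(bytes, StandardCharsets.UTF_8).equals(expected));
        // Reader with the wrong charset: the accented letter is garbled.
        System.out.println(
            textViaReader(bytes, StandardCharsets.ISO_8859_1).equals(expected));
    }
}
```

The same caution presumably applies on the output side (item f): a Writer
fixes the charset outside the XML library, so the encoding named in the
XML declaration has to be kept in step with it by hand.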

MOST IMPORTANT: how to keep the empty elements.
The others are stylistic, but empty elements are functional.

One can't reliably create SVG text nodes in an SVG JDOM tree, but 
it is easy to fill in the text in an already existing text node.

Therefore, keeping the empty elements is critical to the creation
and dynamic modification of SVG files.
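
To make the empty-element point concrete, here is a small sketch. It uses
the JDK's DOM parser as a stand-in, not JDOM, and the names
(EmptyElementSketch, parse, first) are hypothetical. It shows two things
under that assumption: the collapsed and expanded spellings of an empty
element parse to the same tree, and an empty element that survives in the
tree can be filled in later.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class EmptyElementSketch {
    // Parse an XML string into a DOM tree.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // First element with the given tag name, or null if there is none.
    public static Element first(Document doc, String name) {
        return (Element) doc.getElementsByTagName(name).item(0);
    }

    public static void main(String[] args) {
        Document collapsed = parse("<svg><text id=\"label\"/></svg>");
        Document expanded = parse("<svg><text id=\"label\"></text></svg>");

        // Both spellings give the same tree.
        System.out.println(collapsed.getDocumentElement()
            .isEqualNode(expanded.getDocumentElement()));   // true

        // The empty element is present and has no children yet.
        Element label = first(collapsed, "text");
        System.out.println(label.hasChildNodes());          // false

        // Filling in an already existing element.
        label.setTextContent("Hello");
        System.out.println(label.getTextContent());         // Hello
    }
}
```

If that holds for JDOM as well, expanding versus collapsing is only a
choice of spelling on output, while dropping the element altogether
changes the tree.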



> Jason Hunter writes:
>> Good point, but we'd need to add another method to turn off all trimming
>> behaviors.  Otherwise you could setTextTrim() and not undo it!
>>
>> setTextPreserve()?
 
"Bradley S. Huffman" wrote:
> +1 by me.
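
As a starting point for the trimming discussion, here is a plain-Java
sketch of four text behaviours. It assumes the four kinds of white space
management are: preserve, trim, normalize, and trim only when the text is
entirely white space. The class and method names are illustrative and are
not the JDOM API.

```java
public class WhitespaceModesSketch {
    // XML white space is space, tab, line feed and carriage return.
    private static boolean isXmlWhite(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // Leave the text exactly as it is.
    public static String preserve(String s) {
        return s;
    }

    // Remove leading and trailing white space only.
    public static String trim(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isXmlWhite(s.charAt(start))) {
            start++;
        }
        while (end > start && isXmlWhite(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    // Trim, then collapse each internal run of white space to one space.
    public static String normalize(String s) {
        return trim(s).replaceAll("[ \\t\\n\\r]+", " ");
    }

    // Drop the text only if it is all white space; otherwise keep it as is.
    public static String trimFullWhite(String s) {
        return trim(s).isEmpty() ? "" : s;
    }

    public static void main(String[] args) {
        String sample = "  a  b \n";
        System.out.println("[" + trim(sample) + "]");          // [a  b]
        System.out.println("[" + normalize(sample) + "]");     // [a b]
        System.out.println("[" + trimFullWhite(" \n ") + "]"); // []
        System.out.println("[" + trimFullWhite(" a ") + "]");  // [ a ]
    }
}
```

Seen this way, a setTextPreserve() would simply select the first behaviour
and switch the other three off.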



