<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Robert.<br>
<br>
OK. I have spent some time going through this and, admittedly, it
is confusing: working through all the combinations of formatting
options is liable to end in a headache.<br>
<br>
I think there are a number of separate issues at play in your
case:<br>
1. JDOM2 behaves differently from JDOM1.<br>
2. JDOM1 is probably doing the wrong thing in this case.<br>
3. JDOM2 is also probably doing the wrong thing but, in fairness,
changing the 'TextMode' of a PrettyPrint format is a 'dangerous'
thing to do ... not by design, but because of the choices the
formatter implementation makes for the pretty format.<br>
4. If whitespace is significant for certain elements of an XML
document then you should not rely on the whim of JDOM to get it
right; you should use the xml:space="preserve" mechanism, which is
designed for this purpose.<br>
<br>
So, here are a few 'answers'.<br>
<br>
Answer 0:<br>
=====================================================<br>
The output you were getting from JDOM 1.x is broken. If you have a
'preserve' text mode then there should be no whitespace between
any elements (indenting/newlines) because that is not 'preserved'
space (it's 'invented' whitespace).<br>
<br>
In other words, the output you have been relying on is the result
of a bug in JDOM 1.x.<br>
<br>
Answer 1:<br>
=====================================================<br>
The "right" thing for you to do is to add the xml:space="preserve"
to the sub2 elements:<br>
<br>
// Needs org.jdom2.Attribute and org.jdom2.Namespace in addition to the imports in your test program.<br>
public static void main(String[] argv) throws Exception {<br>
    Document document = new Document();<br>
    // Template attribute xml:space="preserve"; cloned because an Attribute can only have one parent.<br>
    <b>Attribute cloneme = new Attribute("space", "preserve", Namespace.XML_NAMESPACE);</b><br>
    Element root = new Element("root");<br>
    document.addContent(root);<br>
    Element sub1 = new Element("sub1");<br>
    root.addContent(sub1);<br>
    sub1.addContent(new Element("sub2").setText("Some text")<b>.setAttribute(cloneme.clone())</b>);<br>
    sub1.addContent(new Element("sub2").setText(" text with left and right whitespace ")<b>.setAttribute(cloneme.clone())</b>);<br>
    Format fmt = Format.getPrettyFormat();<br>
    XMLOutputter xout = new XMLOutputter(fmt);<br>
    xout.output(document, System.out);<br>
}<br>
<br>
Gives the output:<br>
<br>
<root><br>
  <sub1><br>
    <sub2 xml:space="preserve">Some text</sub2><br>
    <sub2 xml:space="preserve"> text with left and right whitespace </sub2><br>
  </sub1><br>
</root><br>
<br>
Answer 2:<br>
=====================================================<br>
The "OK" thing for you to do is to use
TextMode.TRIM_FULL_WHITE instead of TextMode.PRESERVE. The
default TextMode for PrettyPrint is TextMode.TRIM, which removes
whitespace from both ends of the text, whereas TRIM_FULL_WHITE
removes text only when it consists entirely of whitespace, and
leaves it untouched if it contains any non-whitespace characters.
Be aware that other tools (JDOM, xmllint) have the right to mess
with the whitespace (
<a class="moz-txt-link-freetext" href="http://www.w3.org/TR/REC-xml/#sec-white-space">http://www.w3.org/TR/REC-xml/#sec-white-space</a> ). It is only by
convention that the following works in JDOM (I still recommend
preserving whitespace correctly with xml:space="preserve"):<br>
<br>
public static void main(String[] argv) throws Exception {<br>
    Document document = new Document();<br>
    Element root = new Element("root");<br>
    document.addContent(root);<br>
    Element sub1 = new Element("sub1");<br>
    root.addContent(sub1);<br>
    sub1.addContent(new Element("sub2").setText("Some text"));<br>
    sub1.addContent(new Element("sub2").setText(" text with left and right whitespace "));<br>
    Format fmt = Format.getPrettyFormat();<br>
    // Only drop text that is nothing but whitespace; leave all other text untouched.<br>
    fmt.setTextMode(Format.TextMode.TRIM_FULL_WHITE);<br>
    XMLOutputter xout = new XMLOutputter(fmt);<br>
    xout.output(document, System.out);<br>
}<br>
<br>
Gives the output:<br>
<br>
<root><br>
  <sub1><br>
    <sub2>Some text</sub2><br>
    <sub2> text with left and right whitespace </sub2><br>
  </sub1><br>
</root><br>
<br>
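To make the difference between the text modes concrete, here is a
rough sketch in plain Java. It is not JDOM's implementation: the
Mode enum and the apply() method are made-up stand-ins that only
mimic the documented behaviour of Format.TextMode on a single piece
of text.<br>
<br>
```java
class TextModeSketch {

    /** Hypothetical stand-in for JDOM's Format.TextMode; not the library's code. */
    enum Mode { PRESERVE, TRIM, NORMALIZE, TRIM_FULL_WHITE }

    /** XML whitespace is space, tab, carriage return and line feed. */
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    static boolean isAllWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!isXmlWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Removes XML whitespace from both ends of the text. */
    static String trim(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isXmlWhitespace(s.charAt(start))) {
            start++;
        }
        while (end > start && isXmlWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    /** Returns the text as it would be written under the given mode. */
    static String apply(Mode mode, String text) {
        switch (mode) {
            case TRIM:
                return trim(text);
            case NORMALIZE:
                // Trim, then collapse internal runs of whitespace to one space.
                return trim(text).replaceAll("[ \\t\\r\\n]+", " ");
            case TRIM_FULL_WHITE:
                // Drop the text only if it is nothing but whitespace.
                return isAllWhitespace(text) ? "" : text;
            default:
                return text; // PRESERVE
        }
    }

    public static void main(String[] args) {
        String text = "  text with left and right whitespace  ";
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> [" + apply(mode, text) + "]");
        }
    }
}
```
<br>
With text like the second sub2 element, only TRIM and NORMALIZE
change anything; TRIM_FULL_WHITE only matters for text that is pure
whitespace, such as indenting between elements.<br>
<br>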
Answer 3:<br>
=====================================================<br>
JDOM 2.x uses a different (faster and more flexible) algorithm
for output handling. This algorithm has two major triggers: the
TextMode and the Indent. PrettyPrint sets the TextMode to TRIM and
the Indent to two spaces "  ". The TRIM mode tells JDOM it can
mess with whitespace in Text. The Indent tells JDOM it can mess
with the formatting of the XML structure (setting it to null tells
JDOM not to do any indenting).<br>
You have been changing the TextMode to PRESERVE and, now that I
think about it, JDOM should never mess with the indenting when the
mode is PRESERVE. JDOM has code to make sure that it manages the
Indent and the TextMode correctly when they need to change
internally, but you are basically creating an invalid combination
by setting an Indent and PRESERVE at the same time. JDOM should
handle that better.<br>
<br>
But the right behaviour is that, when you set PRESERVE, JDOM2
should output the following:<br>
<root><sub1><sub2>Some text</sub2><sub2> text with left and right whitespace </sub2></sub1></root><br>
<br>
So, I think there's a bug in JDOM2 and, given the input you have
(Format.getPrettyFormat().setTextMode(TextMode.PRESERVE)), it
should be outputting the above (which is not what you want).<br>
<br>
Answer 4:<br>
=====================================================<br>
You can use the Raw format (Format.getRawFormat()) and output the
spaces yourself by adding your own indenting and newlines to the
document as text.<br>
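<br>
As a rough sketch of that idea: walk the tree, and wherever an
element contains only child elements, insert newline-plus-indent
text nodes yourself; elements that hold text are left alone. To
keep the sketch runnable without extra jars it uses the JDK's
built-in org.w3c.dom and Transformer instead of JDOM, so the class
and method names (ManualIndentSketch, pad, indent, render) are
purely illustrative. With JDOM you would add Text content in the
same places and write the result with the Raw format.<br>
<br>
```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ManualIndentSketch {

    /** Returns a newline followed by two spaces per nesting level. */
    static String pad(int depth) {
        StringBuilder sb = new StringBuilder("\n");
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        return sb.toString();
    }

    /** Indents element-only content; elements that hold text are not touched. */
    static void indent(Element parent, int depth) {
        List<Element> children = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                return; // text or other content: leave exactly as it is
            }
            children.add((Element) n);
        }
        if (children.isEmpty()) {
            return;
        }
        Document doc = parent.getOwnerDocument();
        for (Element child : children) {
            parent.insertBefore(doc.createTextNode(pad(depth + 1)), child);
            indent(child, depth + 1);
        }
        parent.appendChild(doc.createTextNode(pad(depth)));
    }

    static Element textElement(Document doc, String name, String text) {
        Element e = doc.createElement(name);
        e.setTextContent(text);
        return e;
    }

    /** Builds the sample document, indents it by hand and writes it with no automatic formatting. */
    static String render() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            Element sub1 = doc.createElement("sub1");
            root.appendChild(sub1);
            sub1.appendChild(textElement(doc, "sub2", "Some text"));
            sub1.appendChild(textElement(doc, "sub2", "  text with left and right whitespace  "));
            indent(root, 0);

            Transformer raw = TransformerFactory.newInstance().newTransformer();
            raw.setOutputProperty(OutputKeys.INDENT, "no"); // the serializer adds no whitespace of its own
            raw.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            raw.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```
<br>
The downside is that the indenting becomes part of the document
itself, so you need to add it just before writing (or strip it
again) if the same tree is processed further.<br>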
<br>
On 06/10/2013 1:09 PM, Robert Krüger wrote:<br>
</div>
<blockquote
cite="mid:CAEnqZEXy1aQX=fvC39T8sFY+3eJYHYyQTTFdkhcmaWMPitTLKA@mail.gmail.com"
type="cite">
<pre wrap="">Hi Rolf,
On Sat, Oct 5, 2013 at 2:08 AM, Rolf <a class="moz-txt-link-rfc2396E" href="mailto:jdom@tuis.net"><jdom@tuis.net></a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Hi Robert.
Just so we are on the same page, when I run the code, I get the following
output:
with the setTextMode(...):
new XMLOutputter(Format.getPrettyFormat().setTextMode(Format.TextMode.PRESERVE)).output(document, System.out);
<?xml version="1.0" encoding="UTF-8"?>
<root>
<sub1>
<sub2>
Some text
</sub2><sub2>
text with left and right whitespace
</sub2>
</sub1>
</root>
without the setTextMode(...):
new XMLOutputter(Format.getPrettyFormat()).output(document, System.out);
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <sub1>
    <sub2>Some text</sub2>
    <sub2>text with left and right whitespace</sub2>
  </sub1>
</root>
The plain "Pretty" format is the way I think you want the output, and it is
right, right?
</pre>
</blockquote>
<pre wrap="">
Yes, except for whitespace being trimmed. I do not want that but want
indenting and no whitespace trimming for text-only elements (that was
the behaviour of JDOM1). The use case is that I use xml to store data
(e.g. user input of a content management system) and removing
whitespace modifies the data, which I do not want to happen but I do
want indenting.
</pre>
<blockquote type="cite">
<pre wrap="">
It is very unusual for someone using the PrettyFormat to modify the
TextMode.... I wonder why you have the setTextMode() at all...
</pre>
</blockquote>
<pre wrap="">
see above.
</pre>
<blockquote type="cite">
<pre wrap="">
Rolf
</pre>
</blockquote>
<pre wrap="">
Robert
</pre>
<blockquote type="cite">
<pre wrap="">
On 30/09/2013 9:43 AM, Robert Krüger wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
forgot to reply to the list
---------- Forwarded message ----------
From: Robert Krüger <a class="moz-txt-link-rfc2396E" href="mailto:krueger@lesspain.de"><krueger@lesspain.de></a>
Date: Mon, Sep 30, 2013 at 3:42 PM
Subject: Re: [jdom-interest] Formatting differences after migrating to
JDOM2
To: Rolf <a class="moz-txt-link-rfc2396E" href="mailto:jdom@tuis.net"><jdom@tuis.net></a>
This reproduces the behaviour:
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;

public class JDOMOutput {
    public static void main(String[] argv) throws Exception {
        Document document = new Document();
        Element root = new Element("root");
        document.addContent(root);
        Element sub1 = new Element("sub1");
        root.addContent(sub1);
        sub1.addContent(new Element("sub2").setText("Some text"));
        sub1.addContent(new Element("sub2").setText(" text with left and right whitespace "));
        // Pretty format with the text mode switched to PRESERVE.
        new XMLOutputter(Format.getPrettyFormat().setTextMode(Format.TextMode.PRESERVE))
                .output(document, System.out);
    }
}
Try with and without the setTextMode(Format.TextMode.PRESERVE). Neither
of them does what I need.
On Sun, Sep 29, 2013 at 7:10 PM, Robert Krüger <a class="moz-txt-link-rfc2396E" href="mailto:krueger@lesspain.de"><krueger@lesspain.de></a>
wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
Hi,
it is part of a large application. I will try to build a simple test
program that demonstrates the effect.
Cheers,
Robert
On Sun, Sep 29, 2013 at 5:26 PM, Rolf <a class="moz-txt-link-rfc2396E" href="mailto:jdom@tuis.net"><jdom@tuis.net></a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
Hi Robert.
This is surprising indeed, and I agree it should not be different from
JDOM 1.x.
Can you get me a copy of the input file and the relevant parts of Java
code?
You don't need to CC the whole list it is large...
Thanks
Rolf
On 29/09/2013 10:42 AM, Robert Krüger wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
Hi,
I just migrated my code from JDOM to JDOM2 and noticed that some of our
unit tests failed. The reason is different formatting. I used
Format.getPrettyFormat().setTextMode(PRESERVE) for the formatting and
with JDOM this produced output like the following:
<av-container format-version="0.3.4">
<container-format>MP4</container-format>
<bitrate>646448</bitrate>
<duration>2002002</duration>
<start-time>0</start-time>
<acquisition-timestamp>1340887741000</acquisition-timestamp>
<stream>
<type>VIDEO</type>
<codec>H.264</codec>
...
after replacing the imports by jdom2 I got
<av-container format-version="0.3.4">
<container-format>
MP4
</container-format><bitrate>
646448
</bitrate><duration>
2002002
</duration><start-time>
0
</start-time><acquisition-timestamp>
1340887741000
</acquisition-timestamp><stream>
<type>
VIDEO
</type><codec>
H.264
</codec>...
This looks rather broken as it does not preserve the original data at
all with all those added newlines. Removing the setTextMode(PRESERVE)
restored the format to what is shown above but the reason I added
setTextMode(PRESERVE) was that without it, whitespace was trimmed and
I do not want that for elements with text content.
Is this a bug? How can I achieve what I want, i.e. have a "pretty",
i.e. indented format and have text-only elements preserve whitespace?
Thanks in advance,
Robert
_______________________________________________
To control your jdom-interest membership:
<a class="moz-txt-link-freetext" href="http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com">http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com</a>
</pre>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<pre wrap="">
</pre>
</blockquote>
<pre wrap="">
</pre>
</blockquote>
<br>
</body>
</html>