From jdom at tuis.net Fri Oct 4 17:08:48 2013
From: jdom at tuis.net (Rolf)
Date: Fri, 04 Oct 2013 20:08:48 -0400
Subject: [jdom-interest] Fwd: Formatting differences after migrating to JDOM2
In-Reply-To:
References: <52484699.6040002@tuis.net>
Message-ID: <524F5890.40204@tuis.net>

Hi Robert.

Just so we are on the same page, when I run the code, I get the following
output:

with the setTextMode(...):
    new XMLOutputter(Format.getPrettyFormat().setTextMode(Format.TextMode.PRESERVE)).output(document, System.out);

Some text


text with left and right whitespace

without the setTextMode(...):
    new XMLOutputter(Format.getPrettyFormat()).output(document, System.out);

Some text
text with left and right whitespace

The plain "Pretty" format is the way I think you want the output, and it is
right, right?

It is very unusual for someone using the PrettyFormat to modify the
TextMode.... I wonder why you have the setTextMode() at all...

Rolf

On 30/09/2013 9:43 AM, Robert Krüger wrote:
> forgot to reply to the list
>
>
> ---------- Forwarded message ----------
> From: Robert Krüger
> Date: Mon, Sep 30, 2013 at 3:42 PM
> Subject: Re: [jdom-interest] Formatting differences after migrating to JDOM2
> To: Rolf
>
>
> This reproduces the behaviour:
>
> import org.jdom2.Document;
> import org.jdom2.Element;
> import org.jdom2.output.Format;
> import org.jdom2.output.XMLOutputter;
>
> public class JDOMOutput {
>
>     public static void main(String argv[]) throws Exception{
>         Document document = new Document();
>         Element root = new Element("root");
>         document.addContent(root);
>         Element sub1 = new Element("sub1");
>         root.addContent(sub1);
>         sub1.addContent(new Element("sub2").setText("Some text"));
>         sub1.addContent(new Element("sub2").setText(" text with left and right whitespace "));
>         new XMLOutputter(Format.getPrettyFormat().setTextMode(Format.TextMode.PRESERVE)).output(document, System.out);
>     }
>
> }
>
> Try with and without the setTextMode(Format.TextMode.PRESERVE). None
> of them does what I need.
>
> On Sun, Sep 29, 2013 at 7:10 PM, Robert Krüger wrote:
>> Hi,
>>
>> it is part of a large application. I will try to build a simple test
>> program that demonstrates the effect.
>>
>> Cheers,
>>
>> Robert
>>
>> On Sun, Sep 29, 2013 at 5:26 PM, Rolf wrote:
>>> Hi Robert.
>>>
>>> This is surprising indeed, and I agree it should not be different from JDOM
>>> 1.x
>>>
>>> Can you get me a copy of the input file and the relevant parts of Java code?
>>> You don't need to CC the whole list, it is large...
>>>
>>> Thanks
>>>
>>> Rolf
>>>
>>>
>>> On 29/09/2013 10:42 AM, Robert Krüger wrote:
>>>> Hi,
>>>>
>>>> I just migrated my code from JDOM to JDOM2 and noticed some of our
>>>> unit tests failed. The reason is different formatting. I used
>>>> Format.getPrettyFormat().setTextMode(PRESERVE) for the formatting and
>>>> with jdom this produced output like the following
>>>>
>>>>
>>>> MP4
>>>> 646448
>>>> 2002002
>>>> 0
>>>> 1340887741000
>>>>
>>>> VIDEO
>>>> H.264
>>>> ...
>>>>
>>>> after replacing the imports by jdom2 I got
>>>>
>>>>
>>>>
>>>> MP4
>>>>
>>>> 646448
>>>>
>>>> 2002002
>>>>
>>>> 0
>>>>
>>>> 1340887741000
>>>>
>>>>
>>>> VIDEO
>>>>
>>>> H.264
>>>> ...
>>>>
>>>> This looks rather broken as it does not preserve the original data at
>>>> all with all those added newlines. Removing the setTextMode(PRESERVE)
>>>> restored the format to what is shown above but the reason I added
>>>> setTextMode(PRESERVE) was that without it, whitespace was trimmed and
>>>> I do not want that for elements with text content.
>>>>
>>>> Is this a bug? How can I achieve what I want, i.e. have a "pretty",
>>>> i.e. indented format and have text-only elements preserve whitespace?
>>>>
>>>> Thanks in advance,
>>>>
>>>> Robert
>>>> _______________________________________________
>>>> To control your jdom-interest membership:
>>>> http://www.jdom.org/mailman/options/jdom-interest/youraddr at yourhost.com

From krueger at lesspain.de Sun Oct 6 10:09:33 2013
From: krueger at lesspain.de (Robert Krüger)
Date: Sun, 6 Oct 2013 19:09:33 +0200
Subject: [jdom-interest] Fwd: Formatting differences after migrating to JDOM2
In-Reply-To: <524F5890.40204@tuis.net>
References: <52484699.6040002@tuis.net> <524F5890.40204@tuis.net>
Message-ID:

Hi Rolf,

On Sat, Oct 5, 2013 at 2:08 AM, Rolf wrote:
> Hi Robert.
>
> Just so we are on the same page, when I run the code, I get the following
> output:
>
> with the setTextMode(...):
>     new XMLOutputter(Format.getPrettyFormat().setTextMode(Format.TextMode.PRESERVE)).output(document, System.out);
>
> Some text
>
>
> text with left and right whitespace
>
> without the setTextMode(...):
>     new XMLOutputter(Format.getPrettyFormat()).output(document, System.out);
>
> Some text
> text with left and right whitespace
>
> The plain "Pretty" format is the way I think you want the output, and it is
> right, right?

Yes, except for whitespace being trimmed. I do not want that but want
indenting and no whitespace trimming for text-only elements (that was
the behaviour of JDOM1). The use case is that I use xml to store data
(e.g. user input of a content management system) and removing
whitespace modifies the data, which I do not want to happen but I do
want indenting.

> It is very unusual for someone using the PrettyFormat to modify the
> TextMode.... I wonder why you have the setTextMode() at all...

see above.
> Rolf

Robert

From jdom at tuis.net Sun Oct 6 13:05:59 2013
From: jdom at tuis.net (Rolf)
Date: Sun, 06 Oct 2013 16:05:59 -0400
Subject: [jdom-interest] Fwd: Formatting differences after migrating to JDOM2
In-Reply-To:
References: <52484699.6040002@tuis.net> <524F5890.40204@tuis.net>
Message-ID: <5251C2A7.4090804@tuis.net>

Hi Robert.

OK. I have spent some time going through things, and, admittedly, this is
confusing, and working through the combinations/permutations for formatting
is liable to end in a headache.

So, I think I have resolved that there are a number of issues at hand in
your case:
1. JDOM2 is doing different things than JDOM1
2. JDOM1 is probably doing the wrong thing in this case
3. JDOM2 is also probably doing the wrong thing, but, in fairness, changing
   the 'TextMode' of a PrettyPrint format is a 'dangerous' thing .... not by
   design, but because of the actual implementation and choices the formatter
   makes with the pretty format.
4. If whitespace is significant for certain members of an XML document then
   you should not be relying on the whim of JDOM to make things right, but you
   should be using the xml:space="preserve" mechanism that is designed for this
   purpose.

So, here are a few 'answers'.

Answer 0:
=====================================================
The output you are getting from JDOM 1.x is broken. If you have a 'preserve'
text mode then there should be no whitespace between any elements
(indenting/newlines) because that is not 'preserved' space (it's 'invented'
whitespace).

The JDOM output you currently get is relying on a bug in JDOM 1.x

Answer 1:
=====================================================
The "right" thing for you to do is to add the xml:space="preserve" to the
sub2 elements:

    // needs org.jdom2.Attribute and org.jdom2.Namespace in addition to the
    // imports of the test program
    public static void main(String argv[]) throws Exception{
        Document document = new Document();
        Attribute cloneme = new Attribute("space", "preserve", Namespace.XML_NAMESPACE);

        Element root = new Element("root");
        document.addContent(root);
        Element sub1 = new Element("sub1");
        root.addContent(sub1);
        sub1.addContent(new Element("sub2").setText("Some text").setAttribute(cloneme.clone()));
        sub1.addContent(new Element("sub2").setText(" text with left and right whitespace ").setAttribute(cloneme.clone()));
        Format fmt = Format.getPrettyFormat();
        XMLOutputter xout = new XMLOutputter(fmt);
        xout.output(document, System.out);
    }

Gives the output:

Some text
text with left and right whitespace

Answer 2:
=====================================================
The "OK" thing for you to do is to use the TextMode.TRIM_FULL_WHITE instead
of TextMode.PRESERVE... the default TextMode for PrettyPrint is
TextMode.TRIM, which removes whitespace from either end of the text, but
TRIM_FULL_WHITE will remove whitespace only when there is nothing but
whitespace, and will do nothing if there are any non-whitespace characters.
I want you to be aware that other tools (JDOM, xmllint) have the right to
mess with the whitespace ( http://www.w3.org/TR/REC-xml/#sec-white-space ).
It is only by convention that the following will work in JDOM (I recommend
preserving whitespace correctly with xml:space="preserve"):

    public static void main(String argv[]) throws Exception{
        Document document = new Document();
        Element root = new Element("root");
        document.addContent(root);
        Element sub1 = new Element("sub1");
        root.addContent(sub1);
        sub1.addContent(new Element("sub2").setText("Some text"));
        sub1.addContent(new Element("sub2").setText(" text with left and right whitespace "));
        Format fmt = Format.getPrettyFormat();
        fmt.setTextMode(Format.TextMode.TRIM_FULL_WHITE);
        XMLOutputter xout = new XMLOutputter(fmt);
        xout.output(document, System.out);
    }

Gives the output:

Some text
text with left and right whitespace

Answer 3:
=====================================================
JDOM 2.x uses a different (faster, and more flexible) algorithm for output
handling. This algorithm has two major triggers: the TextMode and the
Indent. PrettyPrint sets the TextMode to TRIM and the Indent to two spaces
"  ". The TRIM mode tells JDOM it can mess with whitespace in Text. The
INDENT tells JDOM it can mess with the formatting of the XML structure
(setting it to null tells JDOM not to mess with any indenting).
You have been changing the TextMode to PRESERVE, and, as I think about that,
JDOM should never mess with the indenting when the mode is PRESERVE. JDOM
has code to make sure that it manages the INDENT and the TextMode correctly
when they need to change internally, but you are basically setting an
invalid situation by setting INDENT and PRESERVE at the same time.
JDOM should handle that better.

But the right thing is that, when you set PRESERVE, JDOM2 should output
the following:

Some text text with left and right whitespace

So, I think there's a bug in JDOM2, and, given the input you have
(Format.getPrettyFormat().setTextMode(TextMode.PRESERVE)), it should be
outputting the above (which is not what you want).

Answer 4:
=====================================================
You can use the Raw format, and output the spaces yourself by adding your
own indenting and newlines.
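
Answer 4 can be tried out without any particular output library. What
follows is only a sketch under stated assumptions: it uses the JDK's
built-in DOM classes rather than JDOM, and ManualIndentSketch with its
methods write, sample and escape are made-up names for illustration. It
indents elements that contain child elements by hand and writes text-only
elements verbatim, so leading and trailing whitespace survives.
Attributes and mixed content are deliberately left out.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of JDOM): hand-made indenting that never
// touches the text of text-only elements.
class ManualIndentSketch {

    // Serialises 'e' and its descendants, one 'indent' unit per nesting level.
    static String write(Element e, String indent) {
        StringBuilder sb = new StringBuilder();
        write(e, indent, 0, sb);
        return sb.toString();
    }

    private static void write(Element e, String indent, int depth, StringBuilder sb) {
        StringBuilder pad = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            pad.append(indent);
        }
        sb.append(pad).append('<').append(e.getTagName()).append('>');
        if (hasChildElements(e)) {
            // Element-only content: newlines and indenting are invented here.
            sb.append('\n');
            NodeList kids = e.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                Node n = kids.item(i);
                if (n instanceof Element) {
                    write((Element) n, indent, depth + 1, sb);
                }
                // Text between child elements is ignored in this sketch.
            }
            sb.append(pad);
        } else {
            // Text-only content: written exactly as stored, nothing trimmed.
            sb.append(escape(e.getTextContent()));
        }
        sb.append("</").append(e.getTagName()).append(">\n");
    }

    private static boolean hasChildElements(Element e) {
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i) instanceof Element) {
                return true;
            }
        }
        return false;
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds the same root/sub1/sub2 structure as the test program in this thread.
    static Element sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            Element sub1 = doc.createElement("sub1");
            root.appendChild(sub1);
            String[] texts = { "Some text", " text with left and right whitespace " };
            for (String t : texts) {
                Element sub2 = doc.createElement("sub2");
                sub2.setTextContent(t);
                sub1.appendChild(sub2);
            }
            return root;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.print(write(sample(), "  "));
    }
}
```

The only decision the writer makes is whether an element has child
elements. Only in that case does it add newlines and indentation, which is
the "pretty but text-preserving" behaviour asked for at the start of the
thread.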
From krueger at lesspain.de Mon Oct 7 00:40:12 2013
From: krueger at lesspain.de (Robert Krüger)
Date: Mon, 7 Oct 2013 09:40:12 +0200
Subject: [jdom-interest] Fwd: Formatting differences after migrating to JDOM2
In-Reply-To: <5251C2A7.4090804@tuis.net>
References: <52484699.6040002@tuis.net> <524F5890.40204@tuis.net> <5251C2A7.4090804@tuis.net>
Message-ID:

On Sun, Oct 6, 2013 at 10:05 PM, Rolf wrote:
> Hi Robert.
>
> OK. I have spent some time going through things, and, admittedly, this is
> confusing, and working through the combinations/permutations for formatting
> is liable to end in a headache.
>
> So, I think I have resolved that there are a number of issues at hand in
> your case:
> 1. JDOM2 is doing different things than JDOM1
> 2. JDOM1 is probably doing the wrong thing in this case
> 3. JDOM2 is also probably doing the wrong thing, but, in fairness, changing
> the 'TextMode' of a PrettyPrint format is a 'dangerous' thing .... not by
> design, but because of the actual implementation and choices the formatter
> makes with the pretty format.
> 4. If whitespace is significant for certain members of an XML document then
> you should not be relying on the whim of JDOM to make things right, but you
> should be using the xml:space="preserve" mechanism that is designed for this
> purpose.
>
> So, here are a few 'answers'.
>
> Answer 0:
> =====================================================
> The output you are getting from JDOM 1.x is broken. If you have a 'preserve'
> text mode then there should be no whitespace between any elements
> (indenting/newlines) because that is not 'preserved' space (it's 'invented'
> whitespace).

Yes, after thinking about it that was more or less the answer I expected.
Thanks a lot for your in-depth answer! It helps a lot.

Robert
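
For reference, the three text modes discussed in this thread can be
emulated in a few lines of plain Java. This is a sketch of the behaviour
as described in Answer 2, not JDOM's implementation: TextModeSketch, its
Mode enum and apply are hypothetical names, and String.trim() is used as
an approximation of XML whitespace handling.

```java
// Hypothetical emulation of the text modes described in the thread.
// It mirrors the prose description only; it is not JDOM code.
class TextModeSketch {

    enum Mode { PRESERVE, TRIM, TRIM_FULL_WHITE }

    static String apply(Mode mode, String text) {
        switch (mode) {
            case TRIM:
                // Whitespace is removed from both ends of the text.
                return text.trim();
            case TRIM_FULL_WHITE:
                // Text is dropped only if it is nothing but whitespace.
                return text.trim().isEmpty() ? "" : text;
            default:
                // PRESERVE: the text is left exactly as it is.
                return text;
        }
    }

    public static void main(String[] args) {
        String padded = " text with left and right whitespace ";
        String blank = "   ";
        for (Mode m : Mode.values()) {
            System.out.println(m + ": [" + apply(m, padded) + "] [" + apply(m, blank) + "]");
        }
    }
}
```

With the padded sample string, only TRIM changes the text. With a string
of spaces, TRIM and TRIM_FULL_WHITE both return an empty string while
PRESERVE returns it unchanged. That is why TRIM_FULL_WHITE keeps the data
in text-only elements intact while still letting the formatter discard
whitespace-only text between elements.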