[jdom-interest] End-of-line sequence.

Rolf jdom at tuis.net
Mon Nov 14 18:22:55 PST 2011


I was anticipating the JDOM2 branch, yes.

The 'significance' for JDOM2 is that I am comparing its output with other 
'standard' tools (xmllint, DOM, etc.) and inspecting the differences, 
and this is coming up as one of them.

I currently view it as a low-risk, easy win... except for people 
who have baseline/regression tests that expect a particular format (and 
for the moment, I think it is only JDOM's own regression/JUnit tests that 
expect the EOL sequence to be any particular value...).
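
For anyone who does have such baselines, here is a minimal sketch of one 
way to make the comparison independent of the EOL sequence. It is plain 
JDK code, not part of JDOM; the class and method names (BaselineCompare, 
normalizeEol, sameIgnoringEol) are made up for illustration.

```java
// Hypothetical helper, not a JDOM API: compares XML text while ignoring
// whether lines end in "\r\n", "\r" or "\n".
class BaselineCompare {

    // Collapse "\r\n" and any remaining lone "\r" to a single "\n".
    static String normalizeEol(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    // True when the two texts differ at most in their EOL sequences.
    static boolean sameIgnoringEol(String expected, String actual) {
        return normalizeEol(expected).equals(normalizeEol(actual));
    }

    public static void main(String[] args) {
        // A baseline recorded with "\r\n" and new output using "\n".
        String baseline = "<root>\r\n  <child />\r\n</root>\r\n";
        String current = "<root>\n  <child />\n</root>\n";
        System.out.println(sameIgnoringEol(baseline, current));
    }
}
```

A test that compares through something like sameIgnoringEol would keep 
passing whichever default line separator the outputter uses.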

I have already broken that (in JDOM2), though, for people using other 
'pretty' formats, because JDOM was issuing a double newline sequence at 
the end of the file, and it now issues only one.

As for the random, hard-to-figure-out pain, I am not sure... it all 
depends on how you look at it: people who 'depend' on the \r\n sequence 
are just as likely to have those sorts of issues regardless of the JDOM 
setting... *Nothing* should depend on the EOL sequence, so it *should* 
be safe to change...
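
As a rough illustration of why a consumer that goes through an XML parser 
should not see the difference, the sketch below parses the same content 
with \r\n and with \n using the JDK's built-in DOM parser and compares 
the resulting text. The class and method names (EolNormalizationDemo, 
parseText) are made up for this example, and it assumes the parser 
applies the spec's end-of-line handling.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: checks what text a parser reports for
// element content written with different EOL sequences.
class EolNormalizationDemo {

    // Parse the XML string and return the root element's text content.
    static String parseText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String crlf = "<root>line1\r\nline2</root>";
        String lf = "<root>line1\nline2</root>";
        // Do both inputs yield the same text after parsing?
        System.out.println(parseText(crlf).equals(parseText(lf)));
        // Position of any '\r' left in the parsed text (-1 means none).
        System.out.println(parseText(crlf).indexOf('\r'));
    }
}
```

Anything that reads the raw bytes instead of parsing them (a byte-for-byte 
diff, for example) would still see the difference, which is the case the 
baseline concern is about.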

... and again, it comes down to 'why is \r\n better than \n'? I can 
think of reasons why \n is better than \r\n, but not the other way 
around....

Rolf

On 14/11/2011 9:02 PM, Jason Hunter wrote:
> I can see this causing people some random, hard-to-figure-out pain.  I'd never want to do a change like this on the 1.x branch.  But on the 2.x branch?  It's a possibility.
>
> -jh-
>
> On Nov 14, 2011, at 5:29 PM, Rolf wrote:
>
>> Hi all.
>>
>> JDOM has been merrily using "\r\n" as an end-of-line sequence in the XMLOutputter since 'forever'. The XML Spec indicates that all end-of-line sequences should be normalized to a single '\n': http://www.w3.org/TR/REC-xml/#sec-line-ends The wording is such that XML parsers should clear out any extra '\r' characters if there are any, so it is not as if the code is completely broken.
>>
>> But, I think it makes sense to follow the spec, and avoid having different XML compared to other systems.
>>
>> I propose changing the line separator to follow the spec, but this has a very large impact on anyone who expects JDOM to use a particular line terminator, even though they shouldn't...
>>
>> I have filed https://github.com/hunterhacker/jdom/issues/53
>>
>> The original decision was made by Elliotte: http://markmail.org/message/gv7m3xjgrkomrfe7   (it's worth noting that it was changed from the 'platform default' to the constant '\r\n' to create some consistency too).
>>
>> vvv quote vvv
>>
>> The one open question in this version is what to use for a line separator. Right now I'm using \r\n since that's most cross-platform compatible and friendliest to various network protocols. However, \n alone might be slightly friendlier to XML parsers. Another possibility is to ask for System.getProperty("line.separator"). However, I'm loathe to make the output platform dependent. What do people think?
>>
>> ^^^ quote ^^^
>>
>> Also, the commit introducing this has interesting comments: https://github.com/hunterhacker/jdom/commit/958fb22a4c7088b82f0d48a933bdf4e5c6806151#L0R173
>>
>> Two issues I see:
>> 1. "\r\n" was chosen for 'Network protocol' friendliness... is this still a valid argument?
>>
>> 2. is it OK to change the standard format of all the XML that JDOM produces? (I have been really careful (so far), for the most part, to ensure all whitespace (including indents and EOL/EOF) is not changed.)
>>
>> I see changing the default EOL as being an easy decision, especially since users can still change it back easily on their Format instance.
>>
>> advantages:
>> 1. Most XML tools do not use "\r" values - better compatibility?
>> 2. XML output will be slightly smaller - ;-)
>> 3. XML produced by 'other' outputters (currently the StAX outputters) can be compared directly with XMLOutputter for testing/compatibility
>>
>> disadvantages:
>> 1. people may have 'baselines' that contain \r\n terminators, which will then be different from JDOM's default output.
>> 2. there may be some (obscure) protocols that require \r\n terminators and users of JDOM2 will have to override the EOL to be '\r\n' for those.
>>
>> Anyone have comments/suggestions?
>>
>> Rolf
>


