[jdom-interest] End-of-line sequence.
Rolf
jdom at tuis.net
Mon Nov 14 17:29:00 PST 2011
Hi all.
JDOM has been merrily using "\r\n" as an end-of-line sequence in the
XMLOutputter since 'forever'. The XML Spec indicates that all end-of-line
sequences should be normalized to a single '\n':
http://www.w3.org/TR/REC-xml/#sec-line-ends The wording is such that XML
parsers should clear out any extra '\r' characters if there are any, so
it is not as if the code is completely broken.
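As a rough illustration of what that normalization amounts to, here is a
minimal Java sketch. The class and method names (LineEnds, normalize) are
made up for this example and are not part of JDOM or of any parser API; it
only mirrors the rule in the spec section linked above, where a "\r\n"
pair or a lone "\r" is translated to a single "\n".

```java
public final class LineEnds {

    private LineEnds() {
        // static helper only
    }

    /** Translate every "\r\n" pair and every lone "\r" to a single "\n". */
    public static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                out.append('\n');
                // Skip the '\n' that directly follows, so "\r\n" yields one '\n'.
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String crlf = "<root>\r\n  <child/>\r\n</root>\r\n";
        String lf = "<root>\n  <child/>\n</root>\n";
        System.out.println(normalize(crlf).equals(lf)); // prints "true"
    }
}
```

So a conforming parser sees the same character data whether the document
was written with "\r\n" or with "\n".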
But, I think it makes sense to follow the spec, and avoid having
different XML compared to other systems.
I propose changing the line separator to follow the spec, but this could
have a large impact on anyone who expects JDOM to use a particular
line terminator, even though they shouldn't rely on that...
I have filed https://github.com/hunterhacker/jdom/issues/53
The original decision was made by Elliotte:
http://markmail.org/message/gv7m3xjgrkomrfe7 (it's worth noting that
it was changed from the 'platform default' to the constant '\r\n' to
create some consistency too).
vvv quote vvv
The one open question in this version is what to use for a line
separator. Right now I'm using \r\n since that's most cross-platform
compatible and friendliest to various network protocols. However, \n
alone might be slightly friendlier to XML parsers. Another possibility
is to ask for System.getProperty("line.separator"). However, I'm loathe
to make the output platform dependent. What do people think?
^^^ quote ^^^
Also, the commit introducing this has interesting comments:
https://github.com/hunterhacker/jdom/commit/958fb22a4c7088b82f0d48a933bdf4e5c6806151#L0R173
Two issues I see:
1. "\r\n" was chosen for 'Network protocol' friendliness... is this
still a valid argument?
2. is it OK to change the standard format of all the XML that JDOM
produces? (So far I have been really careful, for the most part, to
ensure that no whitespace, including indents and EOL/EOF, is changed.)
I see changing the default EOL as being an easy decision, especially
since users can still change it back easily on their Format instance.
advantages:
1. Most XML tools do not use "\r" values - better compatibility?
2. XML output will be slightly smaller - ;-)
3. XML produced by 'other' outputters (currently the StAX outputters)
can be compared directly with XMLOutputter for testing/compatibility
disadvantages:
1. people may have 'baselines' that contain \r\n terminators, which will
then be different from JDOM's default output.
2. there may be some (obscure) protocols that require \r\n terminators
and users of JDOM2 will have to override the EOL to be '\r\n' for those.
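To make the two disadvantages a bit more concrete, here is a small Java
sketch of how a test suite or a protocol adapter could cope with either
default. It is only an assumption about how someone might handle it, not
anything JDOM ships: BaselineCheck, sameIgnoringEol and withEol are
hypothetical names, and the code works on plain strings rather than on
the JDOM API.

```java
public final class BaselineCheck {

    private BaselineCheck() {
        // static helper only
    }

    /** Compare two XML strings while ignoring which end-of-line sequence they use. */
    public static boolean sameIgnoringEol(String expected, String actual) {
        return toLf(expected).equals(toLf(actual));
    }

    /** Rewrite all line ends in the text to the requested sequence, e.g. "\r\n". */
    public static String withEol(String text, String eol) {
        return toLf(text).replace("\n", eol);
    }

    /** Collapse "\r\n" and lone "\r" to "\n" so both inputs share one form. */
    private static String toLf(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String baseline = "<root>\r\n  <child/>\r\n</root>\r\n"; // old-style baseline
        String output = "<root>\n  <child/>\n</root>\n";         // output with "\n"
        System.out.println(sameIgnoringEol(baseline, output));    // prints "true"
        System.out.println(withEol(output, "\r\n").equals(baseline)); // prints "true"
    }
}
```

With something like this, existing baselines would not need to be
regenerated, and the few callers that really need "\r\n" on the wire could
convert explicitly.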
Anyone have comments/suggestions?
Rolf