[jdom-interest] XMLOutputter and newlines after declaration/doctype
Alex Rosen
arosen at novell.com
Fri Dec 20 09:03:30 PST 2002
Yup, I was talking about text editors.
That is a good point about not handling newlines in Vadim's case (which
is separate from the case that I'm talking about). Although, what if he
used a FilterOutputStream to post-process the output of XMLOutputter,
and replaced all newline characters with &#10; or &#13; as appropriate?
Are character references allowed outside of the root element (e.g. right
after the XML declaration)?
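
A minimal sketch of the post-processing idea, to make the question concrete.
The class name NewlineEscapingOutputStream and the escape() helper are
hypothetical, not part of JDOM; the sketch assumes an ASCII-compatible
encoding such as UTF-8 and would be wrapped around whatever stream is
handed to XMLOutputter.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical filter: rewrites raw LF and CR bytes as character references. */
class NewlineEscapingOutputStream extends FilterOutputStream {
    private static final byte[] LF_REF = "&#10;".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] CR_REF = "&#13;".getBytes(StandardCharsets.US_ASCII);

    NewlineEscapingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        // Assumption: the encoding is ASCII-compatible (e.g. UTF-8), so the
        // byte values 10 and 13 only ever stand for LF and CR.
        if (b == '\n') {
            out.write(LF_REF);
        } else if (b == '\r') {
            out.write(CR_REF);
        } else {
            out.write(b);
        }
    }
    // FilterOutputStream's array-based write methods delegate to write(int),
    // so bulk writes pass through the same replacement.

    /** Demonstration helper: runs a string through the filter. */
    static String escape(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream filter = new NewlineEscapingOutputStream(sink)) {
            filter.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints <a>line one&#10;line two</a>
        System.out.println(escape("<a>line one\nline two</a>"));
    }
}
```

Note that this filter is blind to context: it also rewrites newlines that
fall inside tags or between the XML declaration and the root element. The
XML 1.0 grammar only provides for character references in element content
and attribute values, so output escaped this way is not well-formed if a
newline occurs anywhere else.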
Alex
>>> Elliotte Rusty Harold <elharo at metalab.unc.edu> 12/20/02 05:03AM
>>>
At 5:21 PM -0700 12/19/02, Alex Rosen wrote:
>Regardless of this particular case, I don't think that being an XML
>Nazi (pardon the expression) is the way to go in general. Plenty of
>people use non-XML tools on their XML documents, if only to look at
>them. This won't change any time soon. So, things outside the spec do
>matter. Maybe in an ideal world they wouldn't, but in the real world
>they do. I don't think there should be any hard and fast rule of XML
>infoset good, any other syntax bad.
The primary non-XML tool used on such data is a text editor. The
prevalence of that completely swamps all other non-XML use cases.
That's where JDOM needs to default to when it has a choice.
Beyond text editors, most non-parser based tools fail when presented
with XML documents sooner or later, normally sooner. Regular
expressions can't handle markup embedded in CDATA sections, comments
and processing instructions. Using XML legal characters such as line
feeds as document delimiters in a single file (as Vadim wants to do)
fails as soon as some of your data contains that character, even if
only in a comment or a tag.
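
Both failure modes can be sketched in a few lines. This is an
illustration only: the class RegexVersusParser and the sample data are
made up for the example, and it uses the JAXP DOM parser bundled with the
JDK rather than JDOM so that it runs without extra libraries.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RegexVersusParser {
    // One real <item> element, plus look-alikes in a comment and a CDATA section.
    static final String XML =
        "<items>"
        + "<!-- <item>commented out</item> -->"
        + "<item>real</item>"
        + "<note><![CDATA[<item>just text</item>]]></note>"
        + "</items>";

    // Two documents separated by a line feed; the second contains a
    // perfectly legal line feed inside a comment.
    static final String TWO_DOCS = "<a>one</a>\n<b><!-- multi\nline --></b>";

    /** Naive approach: count every textual occurrence of the start tag. */
    static int countWithRegex(String xml) {
        Matcher m = Pattern.compile("<item>").matcher(xml);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    /** Parser approach: count actual item elements in the parsed tree. */
    static int countWithParser(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("item").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Naive approach: treat each line feed as a document boundary. */
    static int splitOnLineFeed(String stream) {
        return stream.split("\n").length;
    }

    public static void main(String[] args) {
        System.out.println("regex count:  " + countWithRegex(XML));       // 3
        System.out.println("parser count: " + countWithParser(XML));      // 1
        System.out.println("pieces:       " + splitOnLineFeed(TWO_DOCS)); // 3, for 2 documents
    }
}
```

The regular expression counts the commented-out and CDATA look-alikes as
elements, and the line-feed split cuts the second document in half, while
the parser reports the single real element.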
Sooner or later everyone who tries to do this learns this lesson: you
need a parser to handle XML. Nothing less will do. If a parser is too
heavyweight for you (which is rarely true), then you need to use
something other than XML.
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML in a Nutshell, 2nd Edition (O'Reilly, 2002) |
| http://www.cafeconleche.org/books/xian2/ |
| http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.cafeconleche.org/ |
+----------------------------------+---------------------------------+