[jdom-interest] XML Multilanguage support
Murray Altheim
Murray.Altheim at eng.sun.com
Fri Aug 4 15:00:58 PDT 2000
tsasala at hifusion.com wrote:
>
> What's the easiest way to support multiple languages with
> XML/JDOM? We have people entering Spanish content and it looks
> like the parser is losing the Spanish characters. Would changing
> the character set to UTF-16 help? I'm totally in the dark on
> this one. Any suggestions would be greatly appreciated.
You can use the scoped attribute 'xml:lang', whose values are specified
in the XML 1.0 Recommendation as language codes, optionally followed by
country codes. You don't need to change the character set: XML's document
character set is Unicode (ISO/IEC 10646), so it can represent almost all
of the world's written languages.
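As a rough illustration of both points, here is a minimal sketch. It
assumes the JDK's built-in JAXP DOM parser rather than JDOM, so that it
runs without extra libraries; the class name, the sample document and the
helper methods are made up for the example. It shows that 'xml:lang' is
inherited by child elements unless overridden, and that accented
characters survive parsing when the bytes match the declared encoding.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class XmlLangDemo {
    // Hypothetical sample: English by default, one Spanish element.
    // Non-ASCII characters are written as Unicode escapes so this source stays ASCII.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<doc xml:lang=\"en\">"
        + "<greeting>Good morning</greeting>"
        + "<greeting xml:lang=\"es\">Buenos d\u00edas, se\u00f1or</greeting>"
        + "</doc>";

    // Parse the XML as UTF-8 bytes (matching the declaration) and return
    // the index-th element with the given tag name.
    static Element nthElement(String xml, String tagName, int index) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
            return (Element) doc.getElementsByTagName(tagName).item(index);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // xml:lang is scoped: walk up the ancestors until a value is found.
    static String effectiveLang(Element element) {
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            Element e = (Element) n;
            if (e.hasAttributeNS(XMLConstants.XML_NS_URI, "lang")) {
                return e.getAttributeNS(XMLConstants.XML_NS_URI, "lang");
            }
        }
        return ""; // no language declared anywhere in scope
    }

    static String langOf(String xml, String tagName, int index) {
        return effectiveLang(nthElement(xml, tagName, index));
    }

    static String textOf(String xml, String tagName, int index) {
        return nthElement(xml, tagName, index).getTextContent();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 2; i++) {
            System.out.println(langOf(SAMPLE, "greeting", i) + ": " + textOf(SAMPLE, "greeting", i));
        }
    }
}
```

If characters still go missing in practice, one thing worth checking is
whether the encoding named in the XML declaration matches the bytes that
were actually written to the file.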
Murray
...........................................................................
Murray Altheim, SGML/XML Grease Monkey <mailto:altheim@eng.sun.com>
XML Technology Center
Sun Microsystems, 1601 Willow Rd., MS UMPK17-102, Menlo Park, CA 94025
In the evening
The rice leaves in the garden
Rustle in the autumn wind
That blows through my reed hut. -- Minamoto no Tsunenobu
Date: Fri, 4 Aug 2000 17:41:59 -0400 (EDT)
From: Jun Wan <junwan at cs.umb.edu>
To: jdom-interest at jdom.org
Subject: [jdom-interest] predefined namespace
I was testing the following lang.xml
===========================
<?xml version="1.0" ?>
<lang xml:lang="en">
  <elm1>elm1</elm1>
  <elm2>
    <subelm1>subelm1</subelm1>
    <subelm2>subelm2</subelm2>
  </elm2>
</lang>
===========================
using JDOM to process it:
===========================
import org.jdom.*;
import java.io.*;

public class Test {
    public static void main(String[] args) {
        org.jdom.input.SAXBuilder builder = new org.jdom.input.SAXBuilder();
        try {
            Document doc = builder.build(new File("lang.xml"));
            System.out.println(doc.getRootElement().getChild("elm1").getContent());
        } catch (org.jdom.JDOMException je) {
            System.out.println(je);
        }
    }
}
===========================
It compiled fine, but when I ran it, it gave the following error message:
===========================
org.jdom.JDOMException: The name "" is not legal for JDOM/XML namespaces:
Namespace URIs must be non-null and non-empty Strings..: The name "" is not
legal for JDOM/XML namespaces: Namespace URIs must be non-null and non-empty
Strings..
===========================
After I changed lang.xml to declare the namespace, as follows:
===========================
<?xml version="1.0" ?>
<lang xmlns:xml="something" xml:lang="en">
  <elm1>elm1</elm1>
  <elm2>
    <subelm1>subelm1</subelm1>
    <subelm2>subelm2</subelm2>
  </elm2>
</lang>
===========================
Then it works.
So it seems that JDOM doesn't recognize 'xml' as a predefined namespace
prefix (unlike xmlns) and needs an explicit declaration in the XML file.
Since the 'xml' namespace is reserved and unchangeable, wouldn't it be
nice if JDOM accepted xml: without an extra declaration?
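For comparison, the Namespaces in XML recommendation says the prefix
'xml' is by definition bound to http://www.w3.org/XML/1998/namespace, so
a conforming namespace-aware parser should accept the first lang.xml
as is. The sketch below checks this with the JDK's built-in JAXP DOM
parser rather than JDOM (an assumption made so the example runs without
extra libraries); the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;

class XmlPrefixDemo {
    // The first lang.xml from the message, with no xmlns:xml declaration.
    static final String LANG_XML =
        "<?xml version=\"1.0\" ?>"
        + "<lang xml:lang=\"en\">"
        + "<elm1>elm1</elm1>"
        + "<elm2><subelm1>subelm1</subelm1><subelm2>subelm2</subelm2></elm2>"
        + "</lang>";

    // Parse with namespace processing on and return the root's xml:lang attribute node.
    static Attr langAttribute(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getAttributeNode("xml:lang");
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Namespace URI the parser assigned to the undeclared 'xml' prefix.
    static String langNamespace(String xml) {
        return langAttribute(xml).getNamespaceURI();
    }

    static String langValue(String xml) {
        return langAttribute(xml).getValue();
    }

    public static void main(String[] args) {
        System.out.println("namespace: " + langNamespace(LANG_XML));
        System.out.println("value: " + langValue(LANG_XML));
    }
}
```

The same recommendation also implies that binding 'xml' to any other
URI, as in the workaround, is not something a document should rely on.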
|\ _,,,---,,_
WWwwww /,`.-'`' -. ;-;;,_
|,4- ) )-,_. ,\ ( `'-'
'---''(_/--' `-'\_)
Jun Wan
Dept. of Math. & Computer Science
Univ. of Massachusetts Boston
Date: Fri, 04 Aug 2000 14:14:52 -0700
From: Murray Altheim <Murray.Altheim at eng.sun.com>
Organization: Sun Microsystems, Inc.
To: Wesley Biggs <wbiggs at elite.com>
CC: "'Will Glozer'" <will.glozer at jda.com>,
"'jdom-interest at jdom.org'" <jdom-interest at jdom.org>
Subject: Re: [jdom-interest] XMLOutputter NPE
Wesley Biggs wrote:
>
> Will,
>
> The XML specification states that there should be no semantic difference
> between
> <tag></tag>
> and
> <tag />
> So if you rely on that to indicate a null (and it's tempting, I've been down
> that path) you're probably going to get thrown to the wolves on the whim of
> whatever parser/outputter you choose.
>
> What you're probably looking for is
> <tag xsl:null="true" />
>
> Which JDOM doesn't directly support (you have to define the XSL namespace,
> etc.), but it at least gives you a clear and reliable semantic distinction
> between null and empty.
Are you talking about xsi:null="true" that is currently part of the XML
Schema spec, or using something from XSL? I don't see any such thing in
XSL, so I'm guessing the former. If so, I wouldn't rely on 'xsi:null' even
being in the final XML Schema Structures specification. Things are still
very much in flux, and the question of how nulls are represented is to my
knowledge up in the air, although I'm not on the WG itself.
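To make the quoted point concrete, here is a small sketch showing that
<tag></tag> and <tag /> are indistinguishable once parsed, and that an
explicit marker attribute is what carries the null/empty distinction. It
assumes the JDK's JAXP DOM parser instead of JDOM, and the marker
namespace and the 'ex' prefix are purely hypothetical stand-ins, not
taken from XSL or from any XML Schema draft.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class EmptyElementDemo {
    // Hypothetical namespace for an explicit null marker; not from any spec.
    static final String MARKER_NS = "http://example.org/null-marker";

    // Parse the XML with namespace processing on and return its root element.
    static Element root(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Number of child nodes of the root element.
    static int childCount(String xml) {
        return root(xml).getChildNodes().getLength();
    }

    // True only if the root carries the hypothetical marker attribute set to "true".
    static boolean isMarkedNull(String xml) {
        return "true".equals(root(xml).getAttributeNS(MARKER_NS, "null"));
    }

    public static void main(String[] args) {
        String marked = "<tag xmlns:ex=\"" + MARKER_NS + "\" ex:null=\"true\" />";
        System.out.println(childCount("<tag></tag>") + " " + childCount("<tag />"));
        System.out.println(isMarkedNull("<tag />") + " " + isMarkedNull(marked));
    }
}
```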
Murray
Date: Thu, 03 Aug 2000 13:28:29 -0700
From: Murray Altheim <Murray.Altheim at eng.sun.com>
Organization: Sun Microsystems, Inc.
To: Brett McLaughlin <brett.mclaughlin at lutris.com>
CC: Bernhard Boser <boser at eecs.berkeley.edu>,
"'jdom-interest at jdom.org'" <jdom-interest at jdom.org>
Subject: Re: [jdom-interest] problem with JDOM / Namespaces
Brett McLaughlin wrote:
>
> Bernhard Boser wrote:
> >
> > This XML document is produced with jdom (see code below).
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> >
> > <root xmlns="http://ns1">
> > <ns2:ns2elementA xmlns:ns2="http://ns2" />
> > <ns2:ns2elementB />
> > </root>
>
> It is technically proper XML - the name of the first nested element is
> ns2elementA in the http://ns2 namespace, the name of the second element
> is ns2:ns2elementB. XML 1.0 allows names with colons, although it
> clearly says that the colon is reserved for later use. In JDOM, we don't allow
> names with colons (at least in Beta 4) - that's why you are getting the
> error. But we need to fix this, so we'll take a look - are you in JDOM
> Beta 4, or the latest CVS? Try the latest CVS if you aren't there, it
> may be fixed by now (we've had this reported before).
It's technically proper XML 1.0 but not technically proper XML 1.0 plus
Namespaces. If you're going to put something up that looks like a toilet,
it ought to be a toilet, otherwise things get messy. If it has an xmlns
attribute, you're using XML Namespaces, so you should comply with that
spec.
The error arises because the parser is namespace-aware and the prefix on
the second element (<ns2:ns2elementB />) is not bound to any namespace: 'ns2'
is declared only in the scope of the first element. You should move the
xmlns:ns2 attribute declaration up to the document element to solve this
one, as in:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://ns1"
      xmlns:ns2="http://ns2">
  <ns2:ns2elementA />
  <ns2:ns2elementB />
</root>
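The scoping rule can be checked with a short sketch. It assumes the
JDK's JAXP DOM parser in namespace-aware mode (not JDOM), and the class
and method names are made up for illustration. The first document is the
one JDOM produced; the second is the corrected version with xmlns:ns2
declared on the document element.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class NamespaceScopeDemo {
    // ns2 is declared on elementA only, so it is out of scope on elementB.
    static final String BROKEN =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<root xmlns=\"http://ns1\">"
        + "<ns2:ns2elementA xmlns:ns2=\"http://ns2\" />"
        + "<ns2:ns2elementB />"
        + "</root>";

    // ns2 is declared on the document element, so it is in scope everywhere.
    static final String FIXED =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<root xmlns=\"http://ns1\" xmlns:ns2=\"http://ns2\">"
        + "<ns2:ns2elementA />"
        + "<ns2:ns2elementB />"
        + "</root>";

    static Document parse(String xml) throws SAXException {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the namespace-aware parser accepts the document.
    static boolean parsesWithNamespaces(String xml) {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            return false; // e.g. an unbound prefix
        }
    }

    // Namespace URI of the first element with the given local name.
    static String namespaceOf(String xml, String localName) {
        try {
            return parse(xml).getElementsByTagNameNS("*", localName).item(0).getNamespaceURI();
        } catch (SAXException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("broken parses: " + parsesWithNamespaces(BROKEN));
        System.out.println("fixed parses: " + parsesWithNamespaces(FIXED));
        System.out.println("ns2elementB namespace: " + namespaceOf(FIXED, "ns2elementB"));
    }
}
```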
Murray
Date: Thu, 3 Aug 2000 13:49:33 -0700 (PDT)
From: Srikanth Rao <gsr88 at yahoo.com>
To: jdom-interest at jdom.org
Subject: [jdom-interest] Problem using getChildren() of Element object
Hi,
I am using JDOM to parse an XML document. According to
the JDOM API, the code
List childElements = documentElement.getChildren();
System.out.println(childElements.size());
should print the number of child elements one level
deep within this element, where documentElement is the
root element of the document.
But to my surprise, it printed twice the value it
should have returned.
Is this a bug in JDOM, or is there some problem
with my code? Please reply ASAP.
Thanks
srikanth