Posted to j-dev@xerces.apache.org by Toby H Ferguson <to...@sun.com> on 2001/01/02 15:35:51 UTC
RE: About XML encoding
Suk Tae Kyung,
if all of your languages are going to be in the same file, you *must* use
an encoding that can represent every one of them. Alternatively, you can
break the file up into separate files, each with its own encoding, and
that way get around your problem with UTF-8.
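To make the first point concrete, here is a minimal Python sketch (not Xerces code; the sample strings and the `fits` helper are made up for illustration). It assumes Python's `euc_kr` codec is a fair stand-in for the "euc-kr" encoding in the question. It puts Korean, Chinese and Japanese text into one UTF-8 document, parses it back, and reports which samples would also fit in EUC-KR:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample strings, chosen only for illustration.
samples = {
    "ko": "안녕하세요",  # Korean
    "zh": "简体中文",    # Chinese (simplified)
    "ja": "こんにちは",  # Japanese
}

def fits(text, encoding):
    """Return True if every character of text can be encoded in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Build one document holding all three languages and encode it as UTF-8.
body = "".join(
    '<text lang="%s">%s</text>' % (lang, text)
    for lang, text in samples.items()
)
document = '<?xml version="1.0" encoding="UTF-8"?><doc>%s</doc>' % body
utf8_bytes = document.encode("utf-8")

# Parsing the bytes gives back exactly the text that went in.
parsed = ET.fromstring(utf8_bytes)
round_trip = {node.get("lang"): node.text for node in parsed}

# Report, per language, whether the sample fits in each encoding.
for lang, text in samples.items():
    print(lang, "utf-8:", fits(text, "utf-8"), "euc-kr:", fits(text, "euc_kr"))
```

Every sample fits in UTF-8, while the simplified Chinese sample should be rejected by the EUC-KR codec, which is why a single EUC-KR file cannot hold arbitrary mixed text.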
Why do you think the UTF-8 data is broken? Not many editors understand
UTF-8; I normally read UTF-8 documents in a browser, since browsers are
UTF-8 aware.
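One likely reason the data "seems to be broken" is that the viewer is decoding UTF-8 bytes as some other encoding. The following Python sketch illustrates that guess; the sample string is made up, and treating the viewer's encoding as EUC-KR is an assumption:

```python
# Hypothetical sample text; any Korean string would do.
text = "안녕하세요"

# The bytes that would be stored in a UTF-8 encoded file.
utf8_bytes = text.encode("utf-8")

# A UTF-8 aware viewer decodes the bytes back to the original text.
as_utf8 = utf8_bytes.decode("utf-8")

# A viewer that assumes EUC-KR misreads the same bytes; undecodable
# sequences are replaced instead of raising an error.
as_euc_kr = utf8_bytes.decode("euc_kr", errors="replace")

print("read as UTF-8: ", as_utf8)
print("read as EUC-KR:", as_euc_kr)
```

The bytes on disk are identical in both cases; only the interpretation differs, so the file itself is intact.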
Toby
-----Original Message-----
From: Suk Tae Kyung [mailto:tkstone@penta.co.kr]
Sent: Tuesday, December 26, 2000 10:50 PM
To: Xerces
Subject: Q:About XML encoding
Hi, Xerces developers. I have a question about XML encoding.
Although it is not strictly a Xerces question, it is related to Xerces.
To specify the encoding, one writes the XML like this:
<?xml version="1.0" encoding="some encoding" ?>
<~~~>
</~~~>
If I use Korean data, I should specify the "euc-kr" encoding. But if I
use Korean, Chinese and Japanese in one XML document, which encoding
should I set?
It would be possible to convert all the data to UTF-8 and specify "UTF-8"
as the encoding. But this approach is not good, because all the data must
be converted to UTF-8 first, and the result is hard to read (the data
seems to be broken).
Does anyone have a good idea?
Thanks in advance.
Regards.
Suk Tae Kyung
Re: About XML encoding
Posted by Andy Clark <an...@apache.org>.
It's nice that Notepad in Win2K now can write its output as UTF-8,
as well as UTF-16.
--
Andy Clark * IBM, TRL - Japan * andyc@apache.org