Posted to j-users@xerces.apache.org by "Amthauer, Heiner" <He...@t-systems.com> on 2002/12/04 08:30:01 UTC
FAQ, encodingtool and demo required!?
Hi Folks!
Approx. 80% of the last 50 questions (just a guess, not mathematically
correct) have been about encoding and the errors around it. Shouldn't there
be a FAQ and demosources, as well as a tool for converting text into some
encoding (e.g. using escape codes)? I would write one, if I knew anything
about it.
greetings
---------------------------------------------------------------
Dipl. Ing. Heiner Amthauer
T-Systems GEI GmbH
Street address: Magirusstr. 39/1, 89077 Ulm
Postal address: Postfach 20 64, 89010 Ulm
Phone: +49 (731) 9344-4422
Fax: +49 (731) 9344-4409
Mobile: +49 (178) 4269335
E-Mail: heiner.amthauer@t-systems.com
Internet: http://www.t-systems.com
---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org
Re: FAQ, encodingtool and demo required!?
Posted by Joseph Kesselman <ke...@us.ibm.com>.
On Wednesday, 12/04/2002 at 08:30 CET, "Amthauer, Heiner"
<He...@t-systems.com> wrote:
> a FAQ and demosources
There probably should be, somewhere. But encoding is a general Unicode
question, and escaping characters not supported by the encoding is a
general XML question, so there may be quibbles about who should be
responsible for writing that FAQ and where it should live...
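
As a rough illustration of the "escaping characters not supported by the
encoding" part, here is a minimal Java sketch. It assumes the usual XML
approach of replacing such characters with numeric character references
(&#x...;). The class and method names are made up for this example and are
not part of Xerces, and StandardCharsets needs a newer JDK than the ones
current when this thread was written. It does not escape markup characters
such as & or <, which a real serializer must also handle.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Xerces API.
final class CharRefEscaper {

    // Replace every character the target encoding cannot represent
    // with an XML numeric character reference such as &#x20AC;
    public static String escapeUnencodable(String text, Charset target) {
        CharsetEncoder encoder = target.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int width = Character.charCount(codePoint); // 2 for surrogate pairs
            String ch = text.substring(i, i + width);
            if (encoder.canEncode(ch)) {
                out.append(ch);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(codePoint).toUpperCase())
                   .append(';');
            }
            i += width;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00E9 is e-acute, \u20AC is the euro sign; neither fits in US-ASCII.
        String escaped = escapeUnencodable("caf\u00E9 \u20AC5", StandardCharsets.US_ASCII);
        System.out.println(escaped); // prints caf&#xE9; &#x20AC;5
    }
}
```

With ISO-8859-1 as the target, only the euro sign would be escaped, since
e-acute exists in that encoding.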
> as well as a tool for converting text into some encoding (e.g. using
> escape codes)
Any decent Unicode support library can do this (read using one encoding,
write using another) -- but before you can convert text to a specific
encoding, you need to know what encoding you're converting it from, which
takes us back to the FAQ.
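
A minimal sketch of "read using one encoding, write using another", using
only the standard Java library. The class and method names are invented for
illustration, and the source encoding is passed in explicitly because, as
noted above, it has to be known in advance and cannot be guessed reliably
from the bytes alone.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Xerces API.
final class Transcoder {

    // Decode the bytes with the encoding they were written in,
    // then encode the resulting characters with the target encoding.
    public static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from); // bytes -> Unicode characters
        return text.getBytes(to);              // Unicode characters -> bytes
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 }; // e-acute (U+00E9) in ISO-8859-1
        byte[] utf8 = transcode(latin1,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        for (byte b : utf8) {
            System.out.printf("%02X ", b & 0xFF); // prints C3 A9
        }
        System.out.println();
    }
}
```

If the wrong source encoding is passed in, the decode step silently produces
the wrong characters, which is the usual origin of the garbled text people
report.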
Conceptually, there really isn't much to encodings. Unicode
can represent a HUGE set of characters, basically covering every natural
human language's character set, with space reserved to support
"unnatural" ones (Klingon, anyone?). All the encoding does is say how
those characters are mapped to and from what actually appears in your data
stream -- which may involve a varying number of bytes per character,
and/or may involve mapping a Unicode character number to a different
character number in that specific encoding.
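
To make the "varying number of bytes per character" point concrete, here is
a small Java sketch that encodes three single characters in UTF-8 and in
UTF-16BE and counts the resulting bytes. The class and method names are
invented for this example. It covers only the byte-count half of the
description, not the remapping of character numbers done by some legacy
encodings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Xerces API.
final class EncodedLength {

    // Number of bytes the given text occupies in the given encoding.
    public static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Latin capital A, e-acute, euro sign (written as escapes so the
        // source file's own encoding does not matter).
        String[] samples = { "A", "\u00E9", "\u20AC" };
        for (String s : samples) {
            System.out.printf("U+%04X: UTF-8=%d bytes, UTF-16BE=%d bytes%n",
                    s.codePointAt(0),
                    byteCount(s, StandardCharsets.UTF_8),
                    byteCount(s, StandardCharsets.UTF_16BE));
        }
        // prints:
        // U+0041: UTF-8=1 bytes, UTF-16BE=2 bytes
        // U+00E9: UTF-8=2 bytes, UTF-16BE=2 bytes
        // U+20AC: UTF-8=3 bytes, UTF-16BE=2 bytes
    }
}
```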
______________________________________
Joe Kesselman / IBM Research