Posted to jaxme-dev@ws.apache.org by Jochen Wiedmann <jo...@gmail.com> on 2005/10/04 23:18:20 UTC
Re: encoding: how does this work
Hi, Dean,
sorry for the late reply to your various mails. I was on vacation
until today and am now working through my old mails. Expect replies to
everything by tomorrow.
Dean Hiller wrote:
> So what is the point of the XML spec having the encoding attribute if
> you have to figure out the encoding before you even get to the specified
> encoding?
Detecting the encoding from the first bytes is not always possible.
There are a great many encodings that are upward compatible with ASCII
in the range 0..127, for example US-ASCII itself (obviously),
ISO-8859-1, and UTF-8. The first bytes of an XML document look the same
in all of them.
In those cases, the encoding attribute is required to tell them apart
(without it, a parser has to assume UTF-8).
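
To illustrate the point, here is a small Python sketch (not taken from JaxMe or any parser; the helper name encode_all is made up for this example). It shows that an ASCII-only XML declaration produces identical bytes in the three encodings mentioned above, while the same non-ASCII bytes read differently depending on which family member is assumed.

```python
def encode_all(text, encodings=("ascii", "iso-8859-1", "utf-8")):
    """Return the set of distinct byte strings obtained by encoding text
    in each of the given encodings (hypothetical helper for this sketch)."""
    return {text.encode(name) for name in encodings}


# An XML declaration uses only characters in the range 0..127, so all
# three encodings yield exactly the same bytes.
declaration = "<?xml version='1.0'?>"
print(len(encode_all(declaration)))

# Outside that range the encodings disagree: these bytes are the UTF-8
# form of a four-letter word ending in U+00E9, but a reader assuming
# ISO-8859-1 sees five characters instead.
raw = "caf\u00e9".encode("utf-8")
print(raw.decode("utf-8"))
print(raw.decode("iso-8859-1"))
```

So the leading bytes alone cannot say which of the three is in use; only the declared encoding (or a default) settles it.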
Jochen
---------------------------------------------------------------------
To unsubscribe, e-mail: jaxme-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: jaxme-dev-help@ws.apache.org
Re: encoding: how does this work
Posted by Jochen Wiedmann <jo...@gmail.com>.
Dean Hiller wrote:
> "In the above cases"... but isn't that a catch-22? How can you read
> what encoding it is, if you don't know what encoding it is?
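I have never written an XML parser, but I would assume that the basic
rules are: Read the first four or so bytes to identify an encoding family
(upward compatible with US-ASCII, EBCDIC, UTF-16, ...). These first bytes
are also sufficient to detect whether an XML declaration is present.
(A declaration is always needed for EBCDIC; UTF-16 documents can instead
be recognized by their byte order mark.)
If an XML declaration is present, continue to read the declaration,
including the optional "encoding" attribute, which specifies the family
member. This is possible because the declaration itself uses only
characters that every member of the family encodes identically, so it can
be read before the exact member is known.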
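
The two steps above could be sketched roughly as follows in Python. This is only an illustration under assumptions, not how JaxMe or any particular parser does it: the byte patterns follow the autodetection appendix of the XML 1.0 specification, while the names _SIGNATURES, sniff and declared_encoding, the provisional codecs, and the 200-byte window are choices made for this example.

```python
import re

# (leading bytes, encoding family, provisional Python codec), checked in order.
# The four-byte UCS-4 marks must come before the two-byte UTF-16 marks.
_SIGNATURES = [
    (b"\x00\x00\xfe\xff", "UCS-4", "utf-32"),
    (b"\xff\xfe\x00\x00", "UCS-4", "utf-32"),
    (b"\xef\xbb\xbf", "ASCII-compatible", "utf-8-sig"),
    (b"\xfe\xff", "UTF-16", "utf-16"),
    (b"\xff\xfe", "UTF-16", "utf-16"),
    (b"\x00\x00\x00<", "UCS-4", "utf-32-be"),
    (b"<\x00\x00\x00", "UCS-4", "utf-32-le"),
    (b"\x00<\x00?", "UTF-16", "utf-16-be"),
    (b"<\x00?\x00", "UTF-16", "utf-16-le"),
    (b"<?xm", "ASCII-compatible", "latin-1"),
    (b"\x4c\x6f\xa7\x94", "EBCDIC", "cp037"),  # "<?xm" in EBCDIC
]

# Matches an XML declaration at the start of the text and captures the
# value of its encoding attribute.
_DECL = re.compile(
    r"""^<\?xml\s[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff(data):
    """Step 1: identify the encoding family from the first bytes."""
    for prefix, family, codec in _SIGNATURES:
        if data.startswith(prefix):
            return family, codec
    # No byte order mark and no declaration: the XML default is UTF-8.
    return "ASCII-compatible", "utf-8"


def declared_encoding(data):
    """Step 2: read the declaration with a provisional codec of the family
    and return (family, declared encoding name or None)."""
    family, codec = sniff(data)
    head = data[:200].decode(codec, errors="replace")
    match = _DECL.match(head)
    return family, (match.group(1) if match else None)


print(declared_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
```

The provisional codec only has to get the declaration right. Since the declaration contains nothing but characters shared by the whole family, any family member will do for that purpose, and the real decoder is chosen afterwards from the declared name.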
Jochen
Re: encoding: how does this work
Posted by Dean Hiller <de...@xsoftware.biz>.
on vacation...no problem
"In the above cases"... but isn't that a catch-22? How can you read
what encoding it is, if you don't know what encoding it is?
thanks,
dean