Posted to dev@httpd.apache.org by Zvi Har'El <rl...@math.technion.ac.il> on 2003/12/14 14:13:34 UTC
AddCharset filename extensions
The default httpd.conf includes the lines
AddCharset ISO-8859-1 .iso8859-1 .latin1
AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
AddCharset ISO-8859-3 .iso8859-3 .latin3
AddCharset ISO-8859-4 .iso8859-4 .latin4
AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
However, a quick look at http://www.iana.org/assignments/character-sets shows
that calling the non-Latin charsets ISO-8859-N by the name latinN is wrong.
For example, latin8 is ISO-8859-14, or iso-celtic, and certainly not
ISO-8859-8, which is just Hebrew! Similarly, latin6 is ISO-8859-10, and not
ISO-8859-6, which is Arabic! Finally, latin5 is ISO-8859-9, Turkish, and not
ISO-8859-5, which is Cyrillic. latin1-4 are OK, and I didn't find latin7 in
this reference at all. I suggest httpd.conf should be fixed accordingly.
--
Dr. Zvi Har'El mailto:rl@math.technion.ac.il Department of Mathematics
tel:+972-54-227607 icq:179294841 Technion - Israel Institute of Technology
fax:+972-4-8293388 http://www.math.technion.ac.il/~rl/ Haifa 32000, ISRAEL
"If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
Sunday, 19 Kislev 5764, 14 December 2003, 3:03PM
Re: AddCharset filename extensions
Posted by Zvi Har'El <rl...@math.technion.ac.il>.
To make my point clearer, here is the patch:
--- httpd-2.0.48/docs/conf/httpd-std.conf.in.~20031011014743~ 2003-10-11 03:47:43.000000000 +0200
+++ httpd-2.0.48/docs/conf/httpd-std.conf.in 2003-12-15 18:47:07.000000000 +0200
@@ -797,11 +797,15 @@
 AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
 AddCharset ISO-8859-3 .iso8859-3 .latin3
 AddCharset ISO-8859-4 .iso8859-4 .latin4
-AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
-AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
-AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
-AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
-AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
+AddCharset ISO-8859-5 .iso8859-5 .cyr .iso-ru
+AddCharset ISO-8859-6 .iso8859-6 .arb
+AddCharset ISO-8859-7 .iso8859-7 .grk
+AddCharset ISO-8859-8 .iso8859-8 .heb
+AddCharset ISO-8859-9 .iso8859-9 .latin5 .trk
+AddCharset ISO-8859-10 .iso8859-10 .latin6
+AddCharset ISO-8859-13 .iso8859-13 .latin7
+AddCharset ISO-8859-14 .iso8859-14 .latin8
+AddCharset ISO-8859-15 .iso8859-15 .latin9
 AddCharset ISO-2022-JP .iso2022-jp .jis
 AddCharset ISO-2022-KR .iso2022-kr .kis
 AddCharset ISO-2022-CN .iso2022-cn .cis
I have also included latin7 and latin9, which are for some reason absent from
IANA, but appear as standard in the FSF's "free recode". BTW, instead of
inventing new charset abbreviations like .cyr, .arb, .grk, .heb, I would
personally prefer using the IANA (RFC 1345) aliases: .cyrillic, .arabic,
.greek, .hebrew, in the same way we use .latin1, .latin2, etc., but this is a
matter of opinion, not bug-fix patching.
--
Monday, 21 Kislev 5764, 15 December 2003, 6:58PM