Posted to docs@httpd.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2004/03/19 18:31:44 UTC

Re: AddCharset filename extensions (again)

Representing a huge palette of code pages - I'd recommend our docs
folk consider this and comment or commit.

Bill

At 10:59 AM 3/19/2004, Zvi Har'El wrote:
>Dear Apache developers,
>
>I sent the following three months ago, but since I got no response, and now
>2.0.49 has been rolled without the patch, I resubmit it for your attention:
>
>
>The default httpd.conf includes the lines
>
>AddCharset ISO-8859-1  .iso8859-1  .latin1
>AddCharset ISO-8859-2  .iso8859-2  .latin2 .cen
>AddCharset ISO-8859-3  .iso8859-3  .latin3
>AddCharset ISO-8859-4  .iso8859-4  .latin4
>AddCharset ISO-8859-5  .iso8859-5  .latin5 .cyr .iso-ru
>AddCharset ISO-8859-6  .iso8859-6  .latin6 .arb
>AddCharset ISO-8859-7  .iso8859-7  .latin7 .grk
>AddCharset ISO-8859-8  .iso8859-8  .latin8 .heb
>AddCharset ISO-8859-9  .iso8859-9  .latin9 .trk
>
>However, a quick look at http://www.iana.org/assignments/character-sets shows
>that calling the non-latin charsets ISO8859-N by the name latinN is wrong.
>For example, latin8 is ISO-8859-14, or iso-celtic, and certainly not
>ISO-8859-8, which is just hebrew! Similarly, latin6 is ISO-8859-10, and not
>ISO-8859-6, which is arabic! Finally, latin5 is ISO-8859-9, turkish, and not
>ISO-8859-5, which is cyrillic. latin1-4 are ok, and I didn't find latin7 in
>this reference at all. I suggest httpd.conf should be fixed accordingly.
>
>To make my point clearer, here is the patch:
>
>
>--- httpd-2.0.48/docs/conf/httpd-std.conf.in.~20031011014743~   2003-10-11 03:47:43.000000000 +0200
>+++ httpd-2.0.48/docs/conf/httpd-std.conf.in    2003-12-15 18:47:07.000000000 +0200
>@@ -797,11 +797,15 @@
> AddCharset ISO-8859-2  .iso8859-2  .latin2 .cen
> AddCharset ISO-8859-3  .iso8859-3  .latin3
> AddCharset ISO-8859-4  .iso8859-4  .latin4
>-AddCharset ISO-8859-5  .iso8859-5  .latin5 .cyr .iso-ru
>-AddCharset ISO-8859-6  .iso8859-6  .latin6 .arb
>-AddCharset ISO-8859-7  .iso8859-7  .latin7 .grk
>-AddCharset ISO-8859-8  .iso8859-8  .latin8 .heb
>-AddCharset ISO-8859-9  .iso8859-9  .latin9 .trk
>+AddCharset ISO-8859-5  .iso8859-5  .cyr .iso-ru
>+AddCharset ISO-8859-6  .iso8859-6  .arb
>+AddCharset ISO-8859-7  .iso8859-7  .grk
>+AddCharset ISO-8859-8  .iso8859-8  .heb
>+AddCharset ISO-8859-9  .iso8859-9  .latin5 .trk
>+AddCharset ISO-8859-10  .iso8859-10  .latin6 
>+AddCharset ISO-8859-13  .iso8859-13  .latin7 
>+AddCharset ISO-8859-14  .iso8859-14  .latin8 
>+AddCharset ISO-8859-15  .iso8859-15  .latin9 
> AddCharset ISO-2022-JP .iso2022-jp .jis
> AddCharset ISO-2022-KR .iso2022-kr .kis
> AddCharset ISO-2022-CN .iso2022-cn .cis
>
>
>
>
>I have also included latin7 and latin9, which are for some reason absent from
>IANA but appear as standard in the FSF's "free recode". BTW, instead of
>inventing new charset abbreviations like .cyr, .arb, .grk, .heb, I would
>personally prefer using the IANA (RFC 1345) aliases: .cyrillic, .arabic,
>.greek, .hebrew, in the same way we use .latin1, .latin2, etc., but this is a
>matter of opinion, not bug fix patching.
>
>Best,
>
>Zvi.
>
>-- 
>Dr. Zvi Har'El     mailto:rl@math.technion.ac.il    Department of Mathematics
>tel:+972-54-227607 icq:179294841     Technion - Israel Institute of Technology
>fax:+972-4-8293388 http://www.math.technion.ac.il/~rl/    Haifa 32000, ISRAEL
>"If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
>                                  Friday, 27 Adar 5764, 19 March 2004,  6:53PM
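
For anyone weighing the second suggestion in the quoted message, the
alias-based naming might look roughly like the fragment below. This is only
a sketch and not part of the submitted patch: it assumes the patched lines
as a starting point and simply swaps the short abbreviations for the IANA
aliases named above (.cyrillic, .arabic, .greek, .hebrew). Whether to keep
the old abbreviations alongside them for compatibility is left open here.

```apache
# Hypothetical variant of the patched lines, using IANA alias names
# as extensions in place of .cyr, .arb, .grk and .heb.
AddCharset ISO-8859-5  .iso8859-5  .cyrillic .iso-ru
AddCharset ISO-8859-6  .iso8859-6  .arabic
AddCharset ISO-8859-7  .iso8859-7  .greek
AddCharset ISO-8859-8  .iso8859-8  .hebrew
```

With mod_mime, a file named for example index.html.hebrew would then be
served with charset ISO-8859-8, the same way index.html.heb is today.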


