Posted to users@httpd.apache.org by Martin Strand <do...@gmail.com> on 2007/09/10 02:10:59 UTC

[users@httpd] utf-8 encoded htaccess?

I'm using non-ascii characters for AuthName but they show up as garbage in  
firefox.
Changing the encoding for .htaccess from utf-8 to iso-8859-1 solves the  
problem but I'd like to support utf-8.

I'm running RHEL 5 and LC_CTYPE is en_US.UTF-8 so I guess utf-8 should be  
working fine.
Perhaps it's not an Apache issue but I don't know of anything else I could  
change in my OS to fix this. Help?
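
One plausible reading of the symptom, sketched in Python. It assumes the browser interprets the raw realm bytes as ISO-8859-1, which is consistent with the RFC 2616 wording quoted later in this thread but is not confirmed by anything in it. The function name is made up for illustration.

```python
def as_seen_by_latin1_client(realm: str) -> str:
    """Show what a client sees if it reads UTF-8 bytes as ISO-8859-1.

    Assumption: the server copies the bytes from .htaccess into the
    header unchanged, and the client decodes them as ISO-8859-1.
    """
    raw = realm.encode("utf-8")       # bytes as stored in a UTF-8 .htaccess
    return raw.decode("iso-8859-1")   # every byte maps to one Latin-1 character


# A single accented character turns into two unrelated characters.
garbled = as_seen_by_latin1_client("\u00e9")   # e with acute accent
plain = as_seen_by_latin1_client("Staff")      # pure ASCII is unaffected
```

Under that assumption, an ISO-8859-1 encoded .htaccess works because each character is already a single byte that the client reads back correctly.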

Thanks,
Martin

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] utf-8 encoded htaccess?

Posted by Martin Strand <do...@gmail.com>.
On Mon, 10 Sep 2007 15:48:45 +0200, Vincent Bray <no...@gmail.com> wrote:

> On 10/09/2007, Martin Strand <do...@gmail.com> wrote:
>> It really seems to be a UA problem. According to those RFCs, both 'realm'
>> and 'filename' are supposed to be 'quoted-string' and should be treated
>> exactly the same. It just seems so weird that three major browser vendors
>> would miss this.
>> Frustrating... :\
>
> Indeed. I'm reaching a bit here but how about checking with a UA
> that's meant to be a standards fascist? I'm thinking the likes of svn
> (via libneon) or links or wget, those kinds of command line clients.
>
> Googling this issue doesn't shed much light :(
>
> It would help to see a dump of the actual headers sent on the wire,
> just in case apache's somehow mangling it. There was a link to the
> wiki page containing several ways to watch those in an earlier post.

Alright, I just tried svn but no luck - it doesn't decode the realm name  
and neither does elinks.
I couldn't find a way to show the realm with wget.
Letting wget dump the response shows that the header looks just fine:

WWW-Authenticate: Basic realm="=?utf-8?B?5pel5pys6aaZ5aCC?="

As far as I understand those two RFCs, that's exactly how it's supposed to  
look.
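
To check what that header actually carries, here is a minimal Python sketch that pulls the realm out of a WWW-Authenticate value and decodes any RFC 2047 encoded-word in it. The function name and the simple regular expression are illustrative assumptions, not how any browser parses the header. The decoding uses the standard library's email.header.decode_header.

```python
import re
from email.header import decode_header


def decode_realm(header_value: str):
    """Return the decoded realm from a WWW-Authenticate value, or None.

    Assumption: the realm is a simple quoted-string without escaped quotes.
    """
    match = re.search(r'realm="([^"]*)"', header_value)
    if match is None:
        return None
    pieces = []
    for chunk, charset in decode_header(match.group(1)):
        if isinstance(chunk, bytes):
            # Unlabelled bytes fall back to ISO-8859-1, the HTTP/1.1 default.
            pieces.append(chunk.decode(charset or "iso-8859-1"))
        else:
            pieces.append(chunk)
    return "".join(pieces)


realm = decode_realm('Basic realm="=?utf-8?B?5pel5pys6aaZ5aCC?="')
# realm is now a four-character Japanese string (U+65E5 U+672C U+9999 U+5802).
```

So the encoded-word itself is well formed; the open question in this thread is only whether clients decode it.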

Martin



Re: [users@httpd] utf-8 encoded htaccess?

Posted by Vincent Bray <no...@gmail.com>.
On 10/09/2007, Martin Strand <do...@gmail.com> wrote:
> It really seems to be a UA problem. According to those RFCs, both 'realm'
> and 'filename' are supposed to be 'quoted-string' and should be treated
> exactly the same. It just seems so weird that three major browser vendors
> would miss this.
> Frustrating... :\

Indeed. I'm reaching a bit here but how about checking with a UA
that's meant to be a standards fascist? I'm thinking the likes of svn
(via libneon) or links or wget, those kinds of command line clients.

Googling this issue doesn't shed much light :(

It would help to see a dump of the actual headers sent on the wire,
just in case apache's somehow mangling it. There was a link to the
wiki page containing several ways to watch those in an earlier post.
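
As one more way to look at the header as the server sends it, here is a small Python sketch using the standard library's http.client. It is only an illustration: the function name is made up, and the host in the usage comment is a placeholder, not a server from this thread.

```python
import http.client


def fetch_challenge(host: str, port: int = 80, path: str = "/"):
    """Send one GET request and return (status, WWW-Authenticate value).

    The header value is None if the server sent no challenge.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status, response.getheader("WWW-Authenticate")
    finally:
        conn.close()


# Hypothetical usage against a protected directory:
#   status, challenge = fetch_challenge("www.example.org", 80, "/private/")
#   print(status, challenge)
```

Note that http.client itself decodes header bytes as ISO-8859-1, so a realm sent as raw UTF-8 would show up garbled here too, while an RFC 2047 encoded-word passes through as plain ASCII.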

-- 
noodl



Re: [users@httpd] utf-8 encoded htaccess?

Posted by Martin Strand <do...@gmail.com>.
On Mon, 10 Sep 2007 04:40:44 +0200, Vincent Bray <no...@gmail.com> wrote:

> On 10/09/2007, Martin Strand <do...@gmail.com> wrote:
>> This must be a UA problem - none of IE/FF/Opera decodes the name, they
>> must've forgotten to read rfc2617. :)
>
> Or quite possibly an apache problem, given that this string must be
> shoehorned in to the WWW-Authenticate header along with the other
> tokens. Does the header look reasonable? I'm not sure if apache
> supports this at all.

I did some experimenting with PHP. This script outputs an RFC compliant  
header (as far as I can tell) but IE/FF/Opera won't decode the realm name:

<?php
// Replace any earlier WWW-Authenticate header and respond with status 401.
header("WWW-Authenticate: Basic realm=\"=?utf-8?B?5pel5pys6aaZ5aCC?=\"", true, 401);
?>


The same thing works fine for a filename so there's nothing wrong with the  
actual encoding:

<?php
// The header value must stay on a single line.
header("Content-Disposition: attachment; filename=\"=?utf-8?B?5pel5pys6aaZ5aCC?=\"");
?>


It really seems to be a UA problem. According to those RFCs, both 'realm'  
and 'filename' are supposed to be 'quoted-string' and should be treated  
exactly the same. It just seems so weird that three major browser vendors  
would miss this.
Frustrating... :\

Martin



Re: [users@httpd] utf-8 encoded htaccess?

Posted by Vincent Bray <no...@gmail.com>.
On 10/09/2007, Martin Strand <do...@gmail.com> wrote:
> This must be a UA problem - none of IE/FF/Opera decodes the name, they
> must've forgotten to read rfc2617. :)

Or quite possibly an apache problem, given that this string must be
shoehorned in to the WWW-Authenticate header along with the other
tokens. Does the header look reasonable? I'm not sure if apache
supports this at all.

-- 
noodl



Re: [users@httpd] utf-8 encoded htaccess?

Posted by Martin Strand <do...@gmail.com>.
On Mon, 10 Sep 2007 02:49:04 +0200, Vincent Bray <no...@gmail.com> wrote:

> On 10/09/2007, Martin Strand <do...@gmail.com> wrote:
>> I'm using non-ascii characters for AuthName but they show up as garbage in
>> firefox.
>> Changing the encoding for .htaccess from utf-8 to iso-8859-1 solves the
>> problem but I'd like to support utf-8.
>
> It takes a bit of backtracing through all the RFCs to find an
> authoritative answer to this.
>
> RFC2616#14.47: References RFC2617 (via a footnote reference)
> RFC2617#1.2: realm-value = quoted-string
>
> .. so back to 2616 for a definition of quoted-string ..
>
> RFC2616#2.2
>
> quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
> qdtext         = <any TEXT except <">>
>
> .. and ..
>
> Words
>    of *TEXT MAY contain characters from character sets other than ISO-
>    8859-1 [22] only when encoded according to the rules of RFC 2047
>    [14].
>
> Gah, another RFC: http://www.faqs.org/rfcs/rfc2047.html
>
> I'm not familiar with that one, and it doesn't look pretty, but I
> hope this helps to get you closer to an answer.

Thanks, but unfortunately MIME encoding the realm name doesn't help:

AuthName =?utf-8?B?5pel5pys6aaZ5aCC?=

This must be a UA problem - none of IE/FF/Opera decodes the name, they  
must've forgotten to read rfc2617. :)
I know most UAs understand MIME encoding in UTF-8 filenames, which is a  
quoted-string too:

Content-Disposition: attachment; filename="=?utf-8?B?5pel5pys6aaZ5aCC?="
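
For reference, a value like the one above can be produced with a few lines of Python. This is a minimal sketch of the RFC 2047 "B" (base64) form only. The function name is made up, and it does not enforce RFC 2047's 75-character limit on an encoded-word.

```python
import base64


def encode_word(text: str, charset: str = "utf-8") -> str:
    """Build an RFC 2047 encoded-word using the "B" (base64) encoding."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


# Four Japanese characters, written as escapes to keep this file ASCII-only.
word = encode_word("\u65e5\u672c\u9999\u5802")
auth_name_line = f"AuthName {word}"
disposition = f'Content-Disposition: attachment; filename="{word}"'
```

The result is pure ASCII, so it survives regardless of how the .htaccess file or the header is encoded.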

Martin



Re: [users@httpd] utf-8 encoded htaccess?

Posted by Vincent Bray <no...@gmail.com>.
On 10/09/2007, Martin Strand <do...@gmail.com> wrote:
> I'm using non-ascii characters for AuthName but they show up as garbage in
> firefox.
> Changing the encoding for .htaccess from utf-8 to iso-8859-1 solves the
> problem but I'd like to support utf-8.

It takes a bit of backtracing through all the RFCs to find an
authoritative answer to this.

RFC2616#14.47: References RFC2617 (via a footnote reference)
RFC2617#1.2: realm-value = quoted-string

.. so back to 2616 for a definition of quoted-string ..

RFC2616#2.2

quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
qdtext         = <any TEXT except <">>

.. and ..

Words
   of *TEXT MAY contain characters from character sets other than ISO-
   8859-1 [22] only when encoded according to the rules of RFC 2047
   [14].

Gah, another RFC: http://www.faqs.org/rfcs/rfc2047.html

I'm not familiar with that one, and it doesn't look pretty, but I
hope this helps to get you closer to an answer.
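
Read literally, the rule quoted above means a realm can go into the header unencoded only if every character exists in ISO-8859-1. A rough Python sketch of that test follows. The function name is invented for illustration, and it checks the character set only, not the other quoted-string restrictions such as the double quote.

```python
def realm_needs_rfc2047(realm: str) -> bool:
    """Return True if the realm has characters outside ISO-8859-1.

    Per the RFC 2616 text above, such characters may appear in TEXT
    only when encoded according to RFC 2047.
    """
    try:
        realm.encode("iso-8859-1")
    except UnicodeEncodeError:
        return True
    return False


latin_ok = realm_needs_rfc2047("Omr\u00e5det")       # Latin-1 letters only
cjk_needs = realm_needs_rfc2047("\u65e5\u672c")      # CJK is outside Latin-1
```

This matches the original observation: a realm limited to Latin-1 characters works once the file itself is saved as ISO-8859-1.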

-- 
noodl
