Posted to dev@tomcat.apache.org by Mark Thomas <ma...@apache.org> on 2014/01/01 17:59:55 UTC

Re: 8-bit text in cookie values

On 26/12/2013 19:23, Jeremy Boynes wrote:
> On Dec 26, 2013, at 2:47 AM, Mark Thomas <ma...@apache.org> wrote:
> 
> Focusing on the 8-bit issue addressed by the patch, leaving the other
> RFC6265 thread for broader discussion ...
> 
>>> The change only allows these characters in values if version ==
>>> 0 where Netscape’s rather than RFC2109’s syntax applies (per
>>> the Servlet spec). The Netscape spec is vague in that it does
>>> not define “OPAQUE_STRING” at all and defines “VALUE” as
>>> containing equally undefined “characters” although
>>> historically[1] those have been taken to be OCTETs as permitted
>>> by RFC2616’s “*TEXT” variant of “field-content.” The change
>>> will continue to reject these characters in names and in
>>> unquoted values when version != 0 (RFC2109’s “word” rule)
>>> 
>>> [1] based on comments by Fielding et al. on http-state and
>>> what I’ve seen in the wild
>> 
>> Can you provide references for [1]?
> 
> This is the mail in the run up to RFC6265 that triggered the
> discussion: 
> http://www.ietf.org/mail-archive/web/http-state/current/msg01232.html

Thanks for that reference. What a complete mess. RFC6265 really
dropped the ball on this. The grammar for cookie-value is a disaster.
So far the issues include:
- no support for 0x80 to 0xFF
- no support for \" sequences
- no support for using whitespace, comma, semi-colon, backslash
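
For illustration, the three limitations listed above can be seen in a minimal Java sketch of the RFC6265 Section 4.1.1 cookie-value grammar. This is not Tomcat's parser; the class and method names are hypothetical, and the String overload assumes ISO-8859-1 only so that each char maps to one octet.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Tomcat code: checks a value against the
// RFC6265 Section 4.1.1 grammar for cookie-value.
final class Rfc6265CookieValueSketch {
    private Rfc6265CookieValueSketch() {}

    // cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
    // i.e. US-ASCII excluding CTLs, whitespace, DQUOTE, comma,
    // semicolon and backslash. Nothing at 0x80 or above is allowed.
    static boolean isCookieOctet(int b) {
        return b == 0x21
                || (b >= 0x23 && b <= 0x2B)
                || (b >= 0x2D && b <= 0x3A)
                || (b >= 0x3C && b <= 0x5B)
                || (b >= 0x5D && b <= 0x7E);
    }

    // cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
    static boolean isValidValue(byte[] value) {
        int start = 0;
        int end = value.length;
        // One optional pair of surrounding double quotes is permitted.
        if (end >= 2 && value[0] == '"' && value[end - 1] == '"') {
            start = 1;
            end--;
        }
        for (int i = start; i < end; i++) {
            if (!isCookieOctet(value[i] & 0xFF)) {
                return false;
            }
        }
        return true;
    }

    // Convenience overload; the ISO-8859-1 encoding is an assumption
    // made here so that chars up to U+00FF become single octets.
    static boolean isValidValue(String value) {
        return isValidValue(value.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        System.out.println(isValidValue("abc123"));      // plain token
        System.out.println(isValidValue("\"abc\""));     // quoted value
        System.out.println(isValidValue("caf\u00E9"));   // 0xE9 is outside the grammar
        System.out.println(isValidValue("\"a\\\"b\""));  // \" escape is not supported
        System.out.println(isValidValue("a b,c;d"));     // whitespace and delimiters
    }
}
```

In this sketch only the first two values pass; the 8-bit value, the escaped quote and the value with whitespace and delimiters are all rejected.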

I was beginning to think that factoring out the cookie generation /
parsing and then providing different implementations (one for Netscape
+ RFC2109 - roughly what we have now with a few fixes, one for RFC6265
and maybe one very relaxed) would be the way to go. Having looked at
the first issue that plan already looks like it needs a re-think.

I'm still hoping that by documenting all the various issues in one
place we will be able to come up with a solution that both addresses
all the issues you have raised and is better than the handful of
system properties we have currently.

Mark


Re: 8-bit text in cookie values

Posted by Jeremy Boynes <jb...@apache.org>.
Adding more confusion to the pile, HTML5[1] now specifies that JavaScript can set Unicode characters through document.cookie and that they must be encoded as UTF-8 in the header. Quick testing with Chrome shows it does just that (i.e. U+00E1 is sent as 0xC3 0xA1). If client-side and server-side application code are going to interoperate, then we would need to accept such characters in a Cookie header and allow them to be sent in a Set-Cookie header. However, this is ambiguous when compared to Netscape and its implicit assumption of ISO-8859-1: the same octets decode to different text under each assumption.

[1] http://www.w3.org/html/wg/drafts/html/master/single-page.html#cookie
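
To illustrate the ambiguity, here is a small Java sketch (hypothetical class and method names, not part of Tomcat). It encodes U+00E1 as UTF-8, matching what Chrome was observed to send, then decodes the same octets under both assumptions.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the same cookie octets read as UTF-8 vs ISO-8859-1.
final class CookieCharsetAmbiguity {
    private CookieCharsetAmbiguity() {}

    // Encodes a value as UTF-8, as HTML5 specifies for document.cookie.
    static byte[] encodeUtf8(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes header octets assuming the sender used UTF-8.
    static String decodeAsUtf8(byte[] wire) {
        return new String(wire, StandardCharsets.UTF_8);
    }

    // Decodes header octets assuming ISO-8859-1, one char per octet.
    static String decodeAsLatin1(byte[] wire) {
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Formats octets as space-separated hex for display.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] wire = encodeUtf8("\u00E1");
        System.out.println(hex(wire)); // 0xC3 0xA1
        // Round-trips only if the server also assumes UTF-8.
        System.out.println(decodeAsUtf8(wire).equals("\u00E1"));
        // Under ISO-8859-1 the two octets become U+00C3 U+00A1 instead.
        System.out.println(decodeAsLatin1(wire).equals("\u00C3\u00A1"));
    }
}
```

Both decodes succeed without error, which is the problem: nothing in the octets themselves tells the server which interpretation the client intended.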


Re: 8-bit text in cookie values

Posted by Jeremy Boynes <jb...@apache.org>.
On Jan 1, 2014, at 8:59 AM, Mark Thomas <ma...@apache.org> wrote:

> On 26/12/2013 19:23, Jeremy Boynes wrote:
> > On Dec 26, 2013, at 2:47 AM, Mark Thomas <ma...@apache.org> wrote:
> >
> > Focusing on the 8-bit issue addressed by the patch, leaving the other
> > RFC6265 thread for broader discussion ...
> >
> >>> The change only allows these characters in values if version ==
> >>> 0 where Netscape’s rather than RFC2109’s syntax applies (per
> >>> the Servlet spec). The Netscape spec is vague in that it does
> >>> not define “OPAQUE_STRING” at all and defines “VALUE” as
> >>> containing equally undefined “characters” although
> >>> historically[1] those have been taken to be OCTETs as permitted
> >>> by RFC2616’s “*TEXT” variant of “field-content.” The change
> >>> will continue to reject these characters in names and in
> >>> unquoted values when version != 0 (RFC2109’s “word” rule)
> >>>
> >>> [1] based on comments by Fielding et al. on http-state and
> >>> what I’ve seen in the wild
> >>
> >> Can you provide references for [1]?
> >
> > This is the mail in the run up to RFC6265 that triggered the
> > discussion:
> > http://www.ietf.org/mail-archive/web/http-state/current/msg01232.html
> 
> Thanks for that reference. What a complete mess. RFC6265 really
> dropped the ball on this. The grammar for cookie-value is a disaster.
> So far the issues include:
> - no support for 0x80 to 0xFF
> - no support for \" sequences
> - no support for using whitespace, comma, semi-colon, backslash
> 
> I was beginning to think that factoring out the cookie generation /
> parsing and then providing different implementations (one for Netscape
> + RFC2109 - roughly what we have now with a few fixes, one for RFC6265
> and maybe one very relaxed) would be the way to go. Having looked at
> the first issue that plan already looks like it needs a re-think.
> 
> I'm still hoping that by documenting all the various issues in one
> place we will be able to come up with a solution that both addresses
> all the issues you have raised and is better than the handful of
> system properties we have currently.

I think they did a reasonable job given the mess cookies are in the wild today. They summarize this in the preamble:
> The recommendations for cookie generation provided in Section 4 represent a preferred subset of current server behavior, and even the more liberal cookie processing algorithm provided in Section 5 does not recommend all of the syntactic and semantic variations in use today.

Section 4 recommends guidelines for servers generating cookies. I interpret that as being “if you follow these guidelines, you have a good chance of actually getting back the value you tried to set.” The rules above (no 8-bit, no escaping, no Netscape delimiters) reflect that principle. A server application can step outside those guidelines but "thar ther be dragons."

—
Jeremy