Posted to dev@tomcat.apache.org by bu...@apache.org on 2014/01/04 22:34:01 UTC
[Bug 55951] New: HTML5 specifies UTF-8 encoding for cookie values
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
Bug ID: 55951
Summary: HTML5 specifies UTF-8 encoding for cookie values
Product: Tomcat 8
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Connectors
Assignee: dev@tomcat.apache.org
Reporter: jboynes@apache.org
The HTML5 specification is specifying that cookie values may contain characters
that are not part of US-ASCII or ISO-8859-1 and that those codepoints should be
UTF-8 encoded for display.
http://www.w3.org/html/wg/drafts/html/master/single-page.html#cookie
This will result in octets with the high bit set (values above 0x7F) in
cookies, which need to be accepted and set.
This will also require explicit decoding and encoding to convert between those
octets and the UTF-16 code units used by the Java String that represents the
value in the Cookie class.
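As a rough illustration of the conversion described above, the sketch below round-trips a value between a Java String and UTF-8 octets. The class and method names are hypothetical and are not taken from Tomcat; this only shows the JDK charset calls involved, not how Tomcat handles it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat.
final class CookieUtf8RoundTrip {

    // Encode a cookie value to the octets that would go on the wire.
    static byte[] toWire(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Decode wire octets back into a Java String (UTF-16 code units internally).
    static String fromWire(byte[] octets) {
        return new String(octets, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String value = "caf\u00e9";
        byte[] wire = toWire(value);
        // The last character needs two octets, both with the high bit set.
        System.out.println(wire.length);
        System.out.println((wire[3] & 0xFF) > 0x7F);
        System.out.println(fromWire(wire).equals(value));
    }
}
```

Note that `new String(bytes, charset)` silently replaces malformed input, so a real implementation would probably want stricter decoding.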
--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
Mark Thomas <ma...@apache.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution|--- |FIXED
--- Comment #7 from Mark Thomas <ma...@apache.org> ---
This has now been fixed in 8.0.x for 8.0.15 onwards.
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
Jeremy Boynes <jb...@apache.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Depends on| |55917
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
--- Comment #5 from Mark Thomas <ma...@apache.org> ---
Here is a patch that adds support for sending HTTP headers in character sets
other than ISO-8859-1 and then uses that support for sending Set-Cookie
headers.
Both AJP and HTTP needed changes. SPDY didn't, as it already used the approach
the patch uses.
I still have some work to do to restore the filtering of CTLs.
http://people.apache.org/~markt/patches/2014-10-02-bug55951-tc8-v1.patch
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
Bug 55951 depends on bug 55917, which changed state.
Bug 55917 Summary: Cookie parsing fails hard with ISO-8859-1 values
https://issues.apache.org/bugzilla/show_bug.cgi?id=55917
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
Mark Thomas <ma...@apache.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|FIXED |---
--- Comment #2 from Mark Thomas <ma...@apache.org> ---
Re-opening as the unit test didn't cover the end-to-end process and there are
still some issues to resolve.
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
--- Comment #6 from Mark Thomas <ma...@apache.org> ---
This is the completed patch:
http://people.apache.org/~markt/patches/2014-10-06-bug55951-tc8-v2.patch
I'll give folks a day or so to review and comment and then commit it.
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
--- Comment #3 from Konstantin Kolinko <kn...@gmail.com> ---
(In reply to Jeremy Boynes from comment #0)
> The HTML5 specification is specifying that cookie values may contain
> characters that are not part of US-ASCII or ISO-8859-1 and that those
> codepoints should be UTF-8 encoded for display.
>
> http://www.w3.org/html/wg/drafts/html/master/single-page.html#cookie
>
What is the exact wording?
The above link is broken - there is no "cookie" anchor in the current version
of that document. All I see are references to [COOKIES] document (#refsCOOKIES
anchor) = RFC 6265.
http://tools.ietf.org/html/rfc6265
RFC 6265 does not allow non-ASCII characters in the cookie value of a
Set-Cookie header. Citing from its Section 4.1.1, Set-Cookie / Syntax:
    set-cookie-header = "Set-Cookie:" SP set-cookie-string
    set-cookie-string = cookie-pair *( ";" SP cookie-av )
    cookie-pair       = cookie-name "=" cookie-value
    cookie-name       = token
    cookie-value      = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
    cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                          ; US-ASCII characters excluding CTLs,
                          ; whitespace DQUOTE, comma, semicolon,
                          ; and backslash
The cookie-value is limited to US-ASCII, even when quoted.
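To make the grammar concrete, here is a minimal sketch of a checker for the cookie-value production quoted above. The class and method names are hypothetical; this is not Tomcat's validation code, just a direct transcription of the ABNF ranges.

```java
// Hypothetical validator, transcribed from the RFC 6265 ABNF; not Tomcat code.
final class CookieOctets {

    // cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
    static boolean isCookieOctet(int b) {
        return b == 0x21
                || (b >= 0x23 && b <= 0x2B)
                || (b >= 0x2D && b <= 0x3A)
                || (b >= 0x3C && b <= 0x5B)
                || (b >= 0x5D && b <= 0x7E);
    }

    // cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
    static boolean isValidCookieValue(String value) {
        int start = 0;
        int end = value.length();
        // Strip one pair of surrounding double quotes, if present.
        if (end >= 2 && value.charAt(0) == '"' && value.charAt(end - 1) == '"') {
            start = 1;
            end--;
        }
        for (int i = start; i < end; i++) {
            if (!isCookieOctet(value.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Under this check a value containing a non-ASCII character is rejected whether or not it is quoted, which matches the point made above.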
At the same time, attributes (cookie-av) do not have such limitation and as
such may be UTF-8:
    path-av    = "Path=" path-value
    path-value = <any CHAR except CTLs or ";">
For reference, the place where UTF-8 is mentioned in RFC 6265 is in Section
5.4, The Cookie Header. Citing:

    NOTE: Despite its name, the cookie-string is actually a sequence of
    octets, not a sequence of characters. To convert the cookie-string
    (or components thereof) into a sequence of characters (e.g., for
    presentation to the user), the user agent might wish to try using the
    UTF-8 character encoding [RFC3629] to decode the octet sequence.
    This decoding might fail, however, because not every sequence of
    octets is valid UTF-8.
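The "decoding might fail" caveat in that note can be seen with the JDK's strict decoder. The sketch below is only an illustration with a hypothetical class name; it reports malformed input instead of substituting replacement characters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper; illustrates strict UTF-8 decoding with the JDK only.
final class StrictUtf8 {

    // Returns the decoded text, or empty if the octets are not valid UTF-8.
    static Optional<String> tryDecode(byte[] octets) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return Optional.of(decoder.decode(ByteBuffer.wrap(octets)).toString());
        } catch (CharacterCodingException e) {
            return Optional.empty();
        }
    }
}
```

For example, the single octet 0xE9 (an ISO-8859-1 "é") is not valid UTF-8 on its own, so `tryDecode` returns an empty result for it, while the two-octet sequence 0xC3 0xA9 decodes to the same character.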
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
Mark Thomas <ma...@apache.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #1 from Mark Thomas <ma...@apache.org> ---
This will be available in 8.0.15 onwards via the Rfc6265CookieProcessor.
[Bug 55951] HTML5 specifies UTF-8 encoding for cookie values
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=55951
--- Comment #4 from Mark Thomas <ma...@apache.org> ---
(In reply to Konstantin Kolinko from comment #3)
> The cookie-value is limited to US-ASCII, even when quoted.
Agreed.
> At the same time, attributes (cookie-av) do not have such limitation and as
> such may be UTF-8:
>
> path-av = "Path=" path-value
> path-value = <any CHAR except CTLs or ";">
Nope. CHAR is limited to US-ASCII. See the definition in section 2.2 of RFC
6265.
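A small sketch of what that restriction means for path-value, assuming the RFC 5234 core rules that RFC 6265 includes by reference (CHAR = %x01-7F, CTL = %x00-1F / %x7F). The class name is hypothetical and this is not Tomcat code.

```java
// Hypothetical check for path-value = <any CHAR except CTLs or ";">.
final class PathValue {

    static boolean isValidPathValue(String path) {
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            boolean isChar = c >= 0x01 && c <= 0x7F; // RFC 5234 CHAR
            boolean isCtl = c <= 0x1F || c == 0x7F;  // RFC 5234 CTL
            if (!isChar || isCtl || c == ';') {
                return false;
            }
        }
        return true;
    }
}
```

With these definitions a path such as "/app" passes, while one containing a character above 0x7F does not.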