Posted to dev@hc.apache.org by "Oleg Kalnichevski (JIRA)" <ji...@apache.org> on 2019/06/21 09:04:00 UTC

[jira] [Commented] (HTTPCLIENT-1995) Percent-encoded ampersand in URI path not preserved

    [ https://issues.apache.org/jira/browse/HTTPCLIENT-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869305#comment-16869305 ] 

Oleg Kalnichevski commented on HTTPCLIENT-1995:
-----------------------------------------------

RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax
3.3. Path Component
{noformat}
The path component contains data, specific to the authority (or the
scheme if there is no authority component), identifying the resource
within the scope of that scheme and authority.
path = [ abs_path | opaque_part ]
path_segments = segment *( "/" segment )
segment = *pchar *( ";" param )
param = *pchar
pchar = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
{noformat}

You are confusing the encoding rules of the path component with those of the query component.
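
To make the grammar above concrete, here is a minimal sketch that checks whether a character may appear literally (unescaped) in a path segment under the RFC 2396 pchar rule as quoted. The names PcharCheck and isLiteralPchar are hypothetical and are not part of HttpClient; the escaped alternative ("%" followed by two hex digits) is deliberately left out.

```java
// Sketch only: literal (unescaped) pchar test following the RFC 2396 grammar quoted above.
// PcharCheck and isLiteralPchar are hypothetical names, not HttpClient API.
final class PcharCheck {

    // RFC 2396 "mark" characters; together with alphanumerics they form "unreserved".
    private static final String MARK = "-_.!~*'()";

    // Characters that pchar permits in addition to unreserved and escaped.
    private static final String PCHAR_EXTRA = ":@&=+$,";

    static boolean isLiteralPchar(final char c) {
        final boolean alphanum = (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9');
        return alphanum || MARK.indexOf(c) >= 0 || PCHAR_EXTRA.indexOf(c) >= 0;
    }

    public static void main(final String[] args) {
        // '&' is listed in pchar, so it may appear unescaped in a path segment.
        System.out.println("'&' literal in path segment: " + isLiteralPchar('&'));
        // '?' starts the query component and is not a pchar.
        System.out.println("'?' literal in path segment: " + isLiteralPchar('?'));
        // A space is not a pchar and has to be escaped as %20.
        System.out.println("' ' literal in path segment: " + isLiteralPchar(' '));
    }
}
```

Under this grammar the ampersand is accepted as a literal path character, while "?" and the space are not.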

Oleg

> Percent-encoded ampersand in URI path not preserved
> ---------------------------------------------------
>
>                 Key: HTTPCLIENT-1995
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1995
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient (classic)
>    Affects Versions: 4.5.8, 4.5.9
>         Environment: Linux Mint 19, OpenJDK 8
>            Reporter: Hartmut Arlt
>            Priority: Major
>
> Starting with HttpClient 4.5.8, percent-encoded ampersand characters in URI path segments are no longer preserved; they are written to the wire in decoded form because of the path normalization performed by URIUtils.rewriteURI(URI, HttpHost).
>  
> According to RFC 3986 (page 11+), the ampersand character is a delimiter and thus needs to be percent-encoded when not used for this purpose. Path normalization, as performed by HttpClient v4.5.8+, creates a new URI that is not equivalent to the original URI and thus leads to misinterpretation on the server/receiver side.
> ??URIs that differ in the replacement of a reserved character with its??
> ??corresponding percent-encoded octet are not equivalent. Percent-??
> ??encoding a reserved character, or decoding a percent-encoded octet??
> ??that corresponds to a reserved character, will change how the URI is??
> ??interpreted by most applications??.
>   
> A very simple test case is as follows:
> {code:java}
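> import java.net.URI;
> import org.apache.http.client.utils.URIUtils;
> import org.junit.Assert;
> import org.junit.Test;
>
> // Wrapper class name is illustrative; JUnit 4 is assumed from the annotations used.
> public class AmpersandTest
> {
>     @Test
>     public void testAmpersand() throws Throwable
>     {
>         final URI uri = new URI("http://example.org/some/path%26with%20percent/encoded/segments");
>         final URI uri2 = URIUtils.rewriteURI(uri, null);
>         Assert.assertEquals(uri, uri2);
>     }
> }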
> {code}
>  
>  
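
The equivalence argument in the quoted description can be illustrated with plain java.net.URI, independent of HttpClient. The sketch below compares the reporter's URI with a variant in which %26 has been decoded to "&". It does not call URIUtils.rewriteURI and does not reproduce what HttpClient writes to the wire; the class and method names are hypothetical.

```java
import java.net.URI;

// Sketch only: shows how java.net.URI treats "%26" versus "&" in a path.
// EncodedAmpersandDemo and its methods are hypothetical names.
final class EncodedAmpersandDemo {

    // URI.equals compares the raw (still encoded) path, so "%26" and "&" differ.
    static boolean sameUri(final String a, final String b) {
        return URI.create(a).equals(URI.create(b));
    }

    // getPath() returns the decoded path, which hides the difference.
    static boolean sameDecodedPath(final String a, final String b) {
        return URI.create(a).getPath().equals(URI.create(b).getPath());
    }

    public static void main(final String[] args) {
        final String encoded = "http://example.org/some/path%26with%20percent/encoded/segments";
        final String decoded = "http://example.org/some/path&with%20percent/encoded/segments";

        System.out.println("equal as URIs:      " + sameUri(encoded, decoded));
        System.out.println("equal decoded path: " + sameDecodedPath(encoded, decoded));
        System.out.println("raw path (encoded): " + URI.create(encoded).getRawPath());
        System.out.println("raw path (decoded): " + URI.create(decoded).getRawPath());
    }
}
```

The two strings compare as different URIs even though their decoded paths are identical, which is why a test based on assertEquals can detect the rewrite at all.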



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)