Posted to dev@hc.apache.org by "Andreas Loth (Jira)" <ji...@apache.org> on 2023/04/05 15:00:00 UTC

[jira] [Updated] (HTTPCORE-739) org.apache.hc.core5.net.URIBuilder does not decode plus characters (`+`) in the query part

     [ https://issues.apache.org/jira/browse/HTTPCORE-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Loth updated HTTPCORE-739:
----------------------------------
    Description: 
Currently, when decoding the query part of a URL, a plus sign is kept as a plus sign in the decoded name-value pairs.
The expected behavior is that a plus sign is decoded to a space.

https://www.w3.org/Addressing/URL/uri-spec.html
> Within the query string, the plus sign is reserved as shorthand notation for a space. Therefore, real plus signs must be encoded.

I'm perfectly fine with encoding space everywhere to %20 and the plus sign everywhere to %2B (in my experience this is the most unambiguous and least error-prone way to handle these characters). See HTTPCORE-628.
However, during decoding, the position of the plus sign has to be respected: decode it to a space in the query part but leave it as a plus everywhere else.

Test case for decoding:

{noformat}
 * URL: https://example.org/abc/plus-+_enc-space-%20_enc-plus-%2B_/def?test=plus-+_enc-space-%20_enc-plus-%2B_&plus-+_enc-space-%20_enc-plus-%2B_=test
 * path: /abc/plus-+_enc-space- _enc-plus-+_/def
 * get argument 1 name: test
 * get argument 1 value: plus- _enc-space- _enc-plus-+_
 * get argument 2 name: plus- _enc-space- _enc-plus-+_
 * get argument 2 value: test
{noformat}
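
The decoding expected in the test case above can be sketched in plain Java using only the JDK. This is a minimal illustration, not the HttpCore implementation: the class `PlusDecodingSketch` and its helpers `decode` and `decodeQuery` are hypothetical names, and the `plusAsBlank` flag merely borrows the name of the argument mentioned in the potential fix.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class PlusDecodingSketch {

    // Hypothetical helper (not HttpCore API): percent-decodes one URI component as UTF-8.
    // A literal '+' becomes a space only when plusAsBlank is true.
    static String decode(String component, boolean plusAsBlank) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < component.length()) {
            char c = component.charAt(i);
            if (c == '%' && i + 2 < component.length()) {
                int hi = Character.digit(component.charAt(i + 1), 16);
                int lo = Character.digit(component.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);   // valid %XX escape
                    i += 3;
                    continue;
                }
            }
            if (c == '+' && plusAsBlank) {
                out.write(' ');                  // query rule: '+' means space
                i++;
                continue;
            }
            // Anything else is copied through unchanged (as UTF-8 bytes).
            int cp = component.codePointAt(i);
            byte[] raw = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            out.write(raw, 0, raw.length);
            i += Character.charCount(cp);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Splits a raw query on '&' and '=' first, then decodes names and values
    // with plusAsBlank = true.
    static Map<String, String> decodeQuery(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name, true), decode(value, true));
        }
        return params;
    }

    public static void main(String[] args) {
        URI uri = URI.create("https://example.org/abc/plus-+_enc-space-%20_enc-plus-%2B_/def"
                + "?test=plus-+_enc-space-%20_enc-plus-%2B_&plus-+_enc-space-%20_enc-plus-%2B_=test");
        // The path keeps its literal '+'; the query turns it into a space.
        System.out.println("path: " + decode(uri.getRawPath(), false));
        decodeQuery(uri.getRawQuery()).forEach((n, v) -> System.out.println(n + " = " + v));
    }
}
```

The query is split on `&` and `=` before decoding, so an encoded `%26` or `%3D` inside a name or value is not mistaken for a delimiter.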

Test case for encoding:
{noformat}
 * path: /abc/plus-+_space- _/def
 * get argument 1 name: test
 * get argument 1 value: plus-+_space- _
 * get argument 2 name: plus-+_space- _
 * get argument 2 value: test
 * URL: https://example.org/abc/plus-%2B_space-%20_/def?test=plus-%2B_space-%20_&plus-%2B_space-%20_=test
{noformat}
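
The encoding side of the test case above can be sketched the same way. This is a minimal illustration under the assumption that everything outside the RFC 3986 unreserved set is percent-encoded, which yields %20 for space and %2B for plus in both path and query. The class `PlusEncodingSketch` and its helpers `encode` and `encodePath` are hypothetical names, not HttpCore API.

```java
import java.nio.charset.StandardCharsets;

final class PlusEncodingSketch {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Hypothetical helper (not HttpCore API): percent-encodes every UTF-8 byte
    // that is not an RFC 3986 unreserved character (ALPHA / DIGIT / "-" / "." / "_" / "~").
    static String encode(String component) {
        StringBuilder sb = new StringBuilder();
        for (byte b : component.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean unreserved = (v >= 'A' && v <= 'Z') || (v >= 'a' && v <= 'z')
                    || (v >= '0' && v <= '9')
                    || v == '-' || v == '.' || v == '_' || v == '~';
            if (unreserved) {
                sb.append((char) v);
            } else {
                sb.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return sb.toString();
    }

    // Encodes each path segment separately so the '/' separators survive.
    static String encodePath(String path) {
        String[] segments = path.split("/", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                sb.append('/');
            }
            sb.append(encode(segments[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String weird = "plus-+_space- _";
        String url = "https://example.org" + encodePath("/abc/" + weird + "/def")
                + "?" + encode("test") + "=" + encode(weird)
                + "&" + encode(weird) + "=" + encode("test");
        System.out.println(url);
    }
}
```

Because neither a raw space nor a raw plus ever appears in the output, a decoder that treats `+` as a space in the query and one that does not will both recover the original values.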

Potential fix (untested):
https://github.com/apache/httpcomponents-core/blob/86ccd9b58ecc39ac5496af012a5decb33203ea1e/httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java#L410
Change the value of the `plusAsBlank` argument from `false` to `true`.

  was:
Currently, when decoding the query part of a URL, a plus sign is kept as a plus sign in the decoded name-value pairs.
The expected behavior is that a plus sign is decoded to a space.

https://www.w3.org/Addressing/URL/uri-spec.html
> Within the query string, the plus sign is reserved as shorthand notation for a space. Therefore, real plus signs must be encoded.

I'm perfectly fine with encoding space everywhere to %20 and the plus sign everywhere to %2B (in my experience this is the most unambiguous and least error-prone way to handle these characters). See HTTPCORE-628.
However, during decoding, the position of the plus sign has to be respected: decode it to a space in the query part but leave it as a plus everywhere else.

Test case for decoding:
 * URL: https://example.org/abc/plus-+_enc-space-%20_enc-plus-%2B_/def?test=plus-+_enc-space-%20_enc-plus-%2B_&plus-+_enc-space-%20_enc-plus-%2B_=test
 * path: /abc/plus-+_enc-space- _enc-plus-+_/def
 * get argument 1 name: test
 * get argument 1 value: plus- _enc-space- _enc-plus-+_
 * get argument 2 name: plus- _enc-space- _enc-plus-+_
 * get argument 2 value: test

Test case for encoding:
 * path: /abc/plus-+_space- _/def
 * get argument 1 name: test
 * get argument 1 value: plus-+_space- _
 * get argument 2 name: plus-+_space- _
 * get argument 2 value: test
 * URL: https://example.org/abc/plus-%2B_space-%20_/def?test=plus-%2B_space-%20_&plus-%2B_space-%20_=test

Potential fix (untested):
https://github.com/apache/httpcomponents-core/blob/86ccd9b58ecc39ac5496af012a5decb33203ea1e/httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java#L410
Change the value of the `plusAsBlank` argument from `false` to `true`.


> org.apache.hc.core5.net.URIBuilder does not decode plus characters (`+`) in the query part
> ------------------------------------------------------------------------------------------
>
>                 Key: HTTPCORE-739
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-739
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore
>    Affects Versions: 5.2.1
>            Reporter: Andreas Loth
>            Priority: Major


