Posted to dev@hc.apache.org by "Ian Springer (JIRA)" <ji...@apache.org> on 2018/08/31 01:15:00 UTC

[jira] [Commented] (HTTPCLIENT-1927) URLEncodedUtils#parse breaks at double quotes when parsing unquoted values

    [ https://issues.apache.org/jira/browse/HTTPCLIENT-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598103#comment-16598103 ] 

Ian Springer commented on HTTPCLIENT-1927:
------------------------------------------

Can you please do a 4.5.7 release to get this fix out?

 

Thanks!

> URLEncodedUtils#parse breaks at double quotes when parsing unquoted values
> --------------------------------------------------------------------------
>
>                 Key: HTTPCLIENT-1927
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1927
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient (async), HttpClient (classic)
>    Affects Versions: 4.5.5, 4.5.6
>            Reporter: Kadeem Hassam
>            Priority: Minor
>             Fix For: 4.5.7, 5.0 Beta2
>
>
> Assume a query string like {{a=b"c&d=e}}
> The mapping for that query string would reasonably be expected to be
> {code:java}
> [a=b"c, d=e]
> {code}
> Actual result using httpcore 4.4.9 is
> {code:java}
> [a=bc&d=e]
> {code}
> Example code:
> {code:java}
> import java.nio.charset.StandardCharsets;
> import org.apache.http.client.utils.URLEncodedUtils;
> class QueryParser {
>     public static void main(String[] args) {
>         System.out.println(URLEncodedUtils.parse("a=b\"c&d=e", StandardCharsets.UTF_8, '&'));
>     }
> }
> {code}
> {{URLEncodedUtils}} from {{httpclient}} uses the {{TokenParser}} in {{httpcore}}.
> After successfully parsing the name ({{a}}), the value is parsed using the {{parseValue(CharArrayBuffer, ParserCursor, BitSet)}}[[link|https://github.com/apache/httpcomponents-core/blob/4.4.x/httpcore/src/main/java/org/apache/http/message/TokenParser.java#L119-L144]] method.
> Because the first character of the value is neither a delimiter nor a double quote, {{parseValue}} calls {{copyUnquotedContent(CharArrayBuffer, ParserCursor, BitSet, StringBuilder)}}[[link|https://github.com/apache/httpcomponents-core/blob/4.4.x/httpcore/src/main/java/org/apache/http/message/TokenParser.java#L205-L221]], which returns when the double quote is reached ([[link|https://github.com/apache/httpcomponents-core/blob/4.4.x/httpcore/src/main/java/org/apache/http/message/TokenParser.java#L213-L214]]) instead of when the delimiter is reached.
> {{parseValue}} then continues parsing the value, but as quoted content this time (because the current position is now a quote character). Copying quoted content reasonably does not break on the delimiter set, so it ends up consuming the rest of the query string.
> Other URI parsers, such as Python's {{urllib.parse}}, parse the query string as expected.
> {noformat}
> Python 3.6.1 (default, Mar 23 2017, 13:04:44) [GCC] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import urllib.parse
> >>> urllib.parse.parse_qs('a=b"c&d=e')
> {'a': ['b"c'], 'd': ['e']}
> {noformat}
> Although I haven't explicitly tested with {{httpcore5}}, the code for {{TokenParser}} appears equivalent to {{4.4.9}}.
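
The control flow described in the report can be modelled in a few lines of plain Java. What follows is only a sketch under stated assumptions: it is not the httpcore TokenParser source, the class and method names (QuoteSplitSketch, copyUnquoted, copyQuoted, parse) are hypothetical, name and value are kept together as one pair string, and escape handling inside quotes is left out. The stopAtQuote flag switches between the behaviour the report describes (unquoted copying stops at a double quote, then quoted copying takes over) and the behaviour the reporter expects (unquoted copying stops only at the delimiter).

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the steps described in the report.
// NOT the httpcore TokenParser; every name here is hypothetical.
class QuoteSplitSketch {

    // Copies characters until the delimiter is reached. When stopAtQuote is
    // true it also stops at a double quote, as the report describes.
    static int copyUnquoted(String s, int pos, char delim, boolean stopAtQuote, StringBuilder out) {
        while (pos < s.length()) {
            char c = s.charAt(pos);
            if (c == delim || (stopAtQuote && c == '"')) {
                break;
            }
            out.append(c);
            pos++;
        }
        return pos;
    }

    // Copies quoted content: skips the opening quote and reads up to the
    // closing quote or end of input, ignoring delimiters on the way.
    static int copyQuoted(String s, int pos, StringBuilder out) {
        pos++; // skip the opening quote
        while (pos < s.length()) {
            char c = s.charAt(pos++);
            if (c == '"') {
                break;
            }
            out.append(c);
        }
        return pos;
    }

    // Splits the query into pair strings, dispatching to quoted or unquoted
    // copying depending on the character at the current position.
    static List<String> parse(String query, char delim, boolean stopAtQuote) {
        List<String> pairs = new ArrayList<>();
        int pos = 0;
        while (pos < query.length()) {
            StringBuilder pair = new StringBuilder();
            while (pos < query.length() && query.charAt(pos) != delim) {
                if (stopAtQuote && query.charAt(pos) == '"') {
                    pos = copyQuoted(query, pos, pair);
                } else {
                    pos = copyUnquoted(query, pos, delim, stopAtQuote, pair);
                }
            }
            pairs.add(pair.toString());
            pos++; // skip the delimiter
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse("a=b\"c&d=e", '&', true));  // prints [a=bc&d=e]
        System.out.println(parse("a=b\"c&d=e", '&', false)); // prints [a=b"c, d=e]
    }
}
```

With stopAtQuote set to true the model reproduces the single merged pair from the report, because the quoted copy never sees a closing quote and runs to the end of the input. With it set to false the delimiter is honoured and the quote is kept as an ordinary character.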



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
