Posted to dev@hc.apache.org by "Marc Guillemot (JIRA)" <ji...@apache.org> on 2008/01/03 13:17:34 UTC

[jira] Created: (HTTPCLIENT-727) Misbehaviour of URI.getEscapedPath() with uri containing double slash like //js/includes/foo.js

Misbehaviour of URI.getEscapedPath() with uri containing double slash like //js/includes/foo.js
-----------------------------------------------------------------------------------------------

                 Key: HTTPCLIENT-727
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-727
             Project: HttpComponents HttpClient
          Issue Type: Bug
    Affects Versions: 3.1 Final
            Reporter: Marc Guillemot


public void testURI_getEscapedPath() throws Exception {
	URI uri = new URI("//js/includes/foo.js", false);
	assertEquals("//js/includes/foo.js", uri.toString()); // passes
	assertEquals("//js/includes/foo.js", uri.getEscapedPath()); // fails
}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[jira] Commented: (HTTPCLIENT-727) Misbehaviour of URI.getEscapedPath() with uri containing double slash like //js/includes/foo.js

Posted by "Marc Guillemot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556528#action_12556528 ] 

Marc Guillemot commented on HTTPCLIENT-727:
-------------------------------------------

Thanks for the explanations. The HostConfiguration was misleading. I'll check whether adding an additional // at the beginning of the path reproduces the browser's behaviour. Perhaps I can implement it in HtmlUnit for the next release.

Having to report bugs to Sun will not necessarily be an improvement ;-)




[jira] Commented: (HTTPCLIENT-727) Misbehaviour of URI.getEscapedPath() with uri containing double slash like //js/includes/foo.js

Posted by "Roland Weber (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556516#action_12556516 ] 

Roland Weber commented on HTTPCLIENT-727:
-----------------------------------------

Hello Marc,

that argument to the constructor is _not_ just a path. You are calling a constructor that expects a _URI_ string as argument, which may or may not have a host section. That string gets parsed by the constructor according to the rules of RFC 2396.
The URI string you are giving to the constructor starts with //. That tells the parser that there is no scheme (protocol), but there is a host. Compare these two examples:

//hc.apache.org/index.html
//js/includes/foo.js

Do you see the problem now? If you want just a path, then make sure that the argument you are passing to the constructor starts with a _single_ slash, since that indicates the start of the path.

The example http://my.site//foo/bla.html should work with HttpClient's URI class, too. There is a scheme (protocol) "http", followed by a colon and the // that indicates the host section. The first / after the host section starts the path. Everything's peachy. Just don't omit the part before //foo.

Browsers do all kinds of things to handle broken input and guess something useful from it. HttpClient is not a browser:
http://wiki.apache.org/jakarta-httpclient/ForAbsoluteBeginners#head-a110969063be34fcd964aeba55ae23bea12ac232

HttpClient 4.0 uses the java.net.URI class, where bugs can be reported to Sun. You'll find that this class also interprets a leading // as the start of the host section, because that's what the RFC says.

hope this helps,
  Roland
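
As a rough illustration of the parsing rule described above, here is a sketch that uses java.net.URI (the class mentioned for HttpClient 4.0) instead of HttpClient 3.1's own URI class, whose getters differ. The class name LeadingDoubleSlashDemo and the helper describe() are made up for this example.

```java
import java.net.URI;

class LeadingDoubleSlashDemo {

    /** Hypothetical helper: reports the host and raw path that java.net.URI extracts. */
    static String describe(String reference) {
        URI uri = URI.create(reference);
        return "host=" + uri.getHost() + " path=" + uri.getRawPath();
    }

    public static void main(String[] args) {
        // A leading "//" starts the host section, so the first segment becomes the host.
        System.out.println(describe("//hc.apache.org/index.html")); // host=hc.apache.org path=/index.html
        System.out.println(describe("//js/includes/foo.js"));       // host=js path=/includes/foo.js

        // A single leading slash is just a path; there is no host.
        System.out.println(describe("/js/includes/foo.js"));        // host=null path=/js/includes/foo.js

        // With scheme and host present, a later "//" belongs to the path.
        System.out.println(describe("http://my.site//foo/bla.html")); // host=my.site path=//foo/bla.html
    }
}
```

The second line of output corresponds to the failing assertion in the test case: "js" is taken as the host, so the path no longer starts with //js.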





[jira] Closed: (HTTPCLIENT-727) Misbehaviour of URI.getEscapedPath() with uri containing double slash like //js/includes/foo.js

Posted by "Roland Weber (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HTTPCLIENT-727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roland Weber closed HTTPCLIENT-727.
-----------------------------------

    Resolution: Won't Fix

Hello Marc,

your test is invalid. In the absence of other delimiters, the first double slash in the URL indicates the start of the host section, not of the path. When substituting the default scheme and port for HTTP, your example URL translates to
http://js:80/includes/foo.js
and not what you seem to have expected,
http:////js/includes/foo.js
with empty host and port section.

You have to create the URL as follows:
        URI uri = new URI("////js/includes/foo.js", false);
The first // indicates the start of the host section, which is empty.
The second // is then interpreted correctly as the start of the path.

Actually, we do have this bug in other constructors, when the string is passed as a path explicitly:
        URI uri = new URI(null, null, "//js/includes/foo.js", null);
And that's surely not the only bug in there. The URI class is broken beyond repair.
Hacking in more workarounds won't make it better.

sorry,
  Roland
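
A small sketch of the empty-host workaround, using java.net.URI as a stand-in. It is not HttpClient 3.1's URI class, which may behave differently in detail. The class name EmptyAuthorityDemo and the helpers asPathOnlyReference(), hostOf() and pathOf() are hypothetical names introduced only for this illustration.

```java
import java.net.URI;

class EmptyAuthorityDemo {

    /**
     * Hypothetical helper: if a path starts with "//", prefix an empty
     * host section ("//") so the parser does not mistake the first path
     * segment for a host. Other paths are returned unchanged.
     */
    static String asPathOnlyReference(String path) {
        return path.startsWith("//") ? "//" + path : path;
    }

    /** Host as parsed by java.net.URI, or null if there is none. */
    static String hostOf(String reference) {
        return URI.create(reference).getHost();
    }

    /** Raw (still escaped) path as parsed by java.net.URI. */
    static String pathOf(String reference) {
        return URI.create(reference).getRawPath();
    }

    public static void main(String[] args) {
        String reference = asPathOnlyReference("//js/includes/foo.js");
        System.out.println(reference);         // ////js/includes/foo.js
        System.out.println(hostOf(reference)); // null
        System.out.println(pathOf(reference)); // //js/includes/foo.js
    }
}
```

The helper only guards the path-only case discussed here. A complete URL such as http://my.site//foo/bla.html already has a host section and needs no extra prefix.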





[jira] Commented: (HTTPCLIENT-727) Misbehaviour of URI.getEscapedPath() with uri containing double slash like //js/includes/foo.js

Posted by "Marc Guillemot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556506#action_12556506 ] 

Marc Guillemot commented on HTTPCLIENT-727:
-------------------------------------------

I'm not sure I understand your explanation: why does URI look for host settings? The c'tor argument is only a path; the host information is held in HostConfiguration.

Rather than trying to provide a test, I'll try to explain: for http://my.site//foo/bla.html a normal browser performs a GET for //foo/bla.html on the desired host.
When you do the same with HttpClient, the GetMethod c'tor delegates to URI's c'tor with new URI(uri, true, charset)... which is exactly what I provided in the test case.

Additional question: you seem to say that there is no chance of fixing it in HttpClient 3.1. How does it look for the future 4.0?

