Posted to dev@hc.apache.org by "Dave Clemmer (JIRA)" <ji...@apache.org> on 2008/07/16 17:22:42 UTC

[jira] Created: (HTTPCLIENT-787) Redirects with spaces in them are not handled correctly

Redirects with spaces in them are not handled correctly
-------------------------------------------------------

                 Key: HTTPCLIENT-787
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-787
             Project: HttpComponents HttpClient
          Issue Type: Bug
          Components: HttpClient
            Reporter: Dave Clemmer
            Priority: Minor


If a redirect address has spaces in it (yes, I know, the person creating that situation should be beaten, but, alas, that is not an option), the spaces are not converted to %20 before the redirect is followed, and, hence, the request fails.

Changing line 107 of DefaultRedirectHandler to
String location = locationHeader.getValue().replaceAll(" ", "%20");

seems to fix it.
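
A minimal sketch of why the unescaped redirect fails, assuming the handler ends up parsing the Location value with java.net.URI. The class name, helper names, and the example.com address are illustrative assumptions, not HttpClient code. A literal space is not a legal URI character, so parsing is rejected until the spaces are replaced:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; not part of HttpClient.
class RedirectSpaceDemo {

    // Same substitution as the proposed change. String.replace does a
    // literal replacement, so no regular expression is involved.
    static String escapeSpaces(String location) {
        return location.replace(" ", "%20");
    }

    // Returns true when java.net.URI accepts the string as written.
    static boolean parses(String location) {
        try {
            new URI(location);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed example value of a Location header containing a space.
        String location = "http://example.com/some path/page.html";
        System.out.println(parses(location));               // false
        System.out.println(parses(escapeSpaces(location))); // true
    }
}
```

This only covers the space character; any other illegal character in the header would still be rejected.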

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[jira] Commented: (HTTPCLIENT-787) Redirects with spaces in them are not handled correctly

Posted by "Eric Sword (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661696#action_12661696 ] 

Eric Sword commented on HTTPCLIENT-787:
---------------------------------------

For anyone who comes across this issue and is tied to HttpClient 3.x, it is possible to do this pretty easily by overriding GetMethod.readResponseHeaders (assuming that the main use case for doing this is with GET). The child class can just call super.readResponseHeaders and then check the response header group for any "location" headers with bad URLs. It's not the most obvious approach, but it was the only one I found that works (and it is very simple to implement), so I thought I would record it for posterity.



[jira] Resolved: (HTTPCLIENT-787) Redirects with spaces in them are not handled correctly

Posted by "Oleg Kalnichevski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HTTPCLIENT-787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleg Kalnichevski resolved HTTPCLIENT-787.
------------------------------------------

    Resolution: Won't Fix

Dave,

If you are using HttpClient 4.0, consider implementing a custom redirect handler as suggested by Odi. If you are using HttpClient 3.x or older you are out of luck. There is no point fixing those versions.

Oleg



[jira] Issue Comment Edited: (HTTPCLIENT-787) Redirects with spaces in them are not handled correctly

Posted by "Ortwin Glück (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613996#action_12613996 ] 

oglueck edited comment on HTTPCLIENT-787 at 7/16/08 8:54 AM:
------------------------------------------------------------------

This is a server bug, not a client issue. And you have heard it before: "HttpClient is not a browser". Also, as mentioned dozens of times before, URLs cannot be escaped once they are represented as a series of bytes. Yes, in this particular case it is sort of possible. But please consider:

a) we can't know the right encoding for the URL. UTF-8 is only a recommendation. So assuming an ASCII-compatible encoding is arbitrary.
b) This is correct for spaces in URI components (like path), but it is wrong for spaces in application/x-www-form-urlencoded values of HTML forms (query string): they use a plus sign + to escape spaces. 
     And we have no reason to assume that the query string uses x-www-form-urlencoded. It could just use anything.
    As specified here: http://www.w3.org/TR/html401/interact/forms.html#form-content-type
    see also: http://marc.info/?l=httpclient-commons-dev&m=116859139319469&w=2
  
   Sample: http://people.apache.org/~oglueck/composite path/servlet?composite name=composite value
   is correctly escaped like so: http://people.apache.org/~oglueck/composite%20path/servlet?composite+name=composite+value
c) you can implement your own redirect handler that can handle all sorts of malformed responses
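
Point b) above can be illustrated with a rough sketch. It assumes the query string really is application/x-www-form-urlencoded, which, as noted, cannot be known in general. The class and method names are hypothetical and not part of HttpClient; only the sample URL comes from the comment itself.

```java
// Hypothetical helper, not an HttpClient API.
class ComponentAwareEscaper {

    // Escapes literal spaces only: "%20" before the first '?', "+" after it.
    // Assumption: the query is form-urlencoded. Fragments and all other
    // illegal characters are deliberately ignored in this sketch.
    static String escapeSpaces(String url) {
        int q = url.indexOf('?');
        if (q < 0) {
            // No query string: treat everything as path-like.
            return url.replace(" ", "%20");
        }
        String beforeQuery = url.substring(0, q).replace(" ", "%20");
        String query = url.substring(q + 1).replace(" ", "+");
        return beforeQuery + "?" + query;
    }

    public static void main(String[] args) {
        String raw = "http://people.apache.org/~oglueck/composite path/servlet"
                + "?composite name=composite value";
        // Prints the "correctly escaped" form quoted in the sample above.
        System.out.println(escapeSpaces(raw));
    }
}
```

A blanket replacement of every space with %20 would instead produce composite%20name=composite%20value in the query, which a strict form decoder is not obliged to accept.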

To be fair, for *most* servers out there this shouldn't be a problem, because:
a) they expect URI encodings to be UTF-8
b) they all have "compatible" (broken) parsers that allow + and %20 to be used interchangeably

However, the really relevant point is: if the server does not even care to escape the space character, it will most likely not escape any other non-URI characters either, probably because of some careless programming. Such a server or application grossly violates the HTTP protocol and should be considered broken.

I would like to mark this issue as invalid.

Maybe a good thing to have would be a "CompatibilityRedirectHandler" that imitates the convenient behaviour of popular browsers. Consider contributing one.



[jira] Commented: (HTTPCLIENT-787) Redirects with spaces in them are not handled correctly

Posted by "Ortwin Glück (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613996#action_12613996 ] 

Ortwin Glück commented on HTTPCLIENT-787:
-----------------------------------------

This is a server bug, not a client issue. And you have heard it before: "HttpClient is not a browser". Also, as mentioned dozens of times before, URLs cannot be escaped once they are represented as a series of bytes. Yes, in this particular case it is sort of possible. But please consider:

a) we can't know the right encoding for the URL. UTF-8 is only a recommendation. So assuming an ASCII-compatible encoding is arbitrary.
b) This is correct for spaces in URI components (like path), but it is wrong for spaces in application/x-www-form-urlencoded values of HTML forms (query string): they use a plus sign + to escape spaces. 
     And we have no reason to assume that the query string uses x-www-form-urlencoded. It could just use anything.
    As specified here: http://www.w3.org/TR/html401/interact/forms.html#form-content-type
    see also: http://marc.info/?l=httpclient-commons-dev&m=116859139319469&w=2
  
   Sample: http://people.apache.org/~oglueck/composite path/servlet?composite name=composite value
   is correctly escaped like so: http://people.apache.org/~oglueck/composite%20path/servlet?composite+name=composite+value
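
Point a) matters as soon as a character outside ASCII is involved. The following is a small sketch with assumed names (the class, the method, and the choice of "é" as the test character are illustrative only): the same character produces different percent-escapes depending on which charset the client guesses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of HttpClient.
class EncodingAmbiguityDemo {

    // Percent-encodes every byte of the text in the given charset.
    // Real escaping would leave unreserved ASCII characters alone; encoding
    // everything keeps the sketch short and is still a legal escape.
    static String percentEncode(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append('%').append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character "é"
        System.out.println(percentEncode(text, StandardCharsets.UTF_8));      // %C3%A9
        System.out.println(percentEncode(text, StandardCharsets.ISO_8859_1)); // %E9
    }
}
```

The space character happens to be the same byte in both charsets, which is why escaping it is "sort of possible" while a general fix is not.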

To be fair, for *most* servers out there this shouldn't be a problem, because:
a) they expect URI encodings to be UTF-8
b) they all have "compatible" (broken) parsers that allow + and %20 to be used interchangeably
c) you can implement your own redirect handler that can handle all sorts of malformed responses

However, the really relevant point is: if the server does not even care to escape the space character, it will most likely not escape any other non-URI characters either, probably because of some careless programming. Such a server or application grossly violates the HTTP protocol and should be considered broken.

I would like to mark this issue as invalid.

Maybe a good thing to have would be a "CompatibilityRedirectHandler" that imitates the convenient behaviour of popular browsers. Consider contributing one.



[jira] Commented: (HTTPCLIENT-787) Redirects with spaces in them are not handled correctly

Posted by "Sam Berlin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614044#action_12614044 ] 

Sam Berlin commented on HTTPCLIENT-787:
---------------------------------------

Perhaps something like a BROWSER_COMPATIBLE_MODE would be useful for the whole of HttpClient (as opposed to just Cookies, which I think is where it's used now).
