Posted to httpclient-users@hc.apache.org by Vicky_Dev <vi...@yahoo.co.in> on 2009/08/12 13:26:29 UTC

Re: Invalid redirect location: http://wapp.baidu.com/f?kw=???????

I am facing a similar issue when calling Solr (search engine) with HttpClient.

The following URL works fine in a browser:

http://<server>:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/?q=Index_Type_s%3AproductIndex+AND+%28test_raman_sub%20Âpple%29&spellcheck=true&start=0&rows=10&qt=dismaxrequest

But the same URL fails when requested through HttpClient.

Error:
org.apache.commons.httpclient.URIException: Invalid query


~Vikrant



nonopo12345 wrote:
> 
> On 2009-03-30, "Oleg Kalnichevski" <ol...@apache.org> wrote:
>>On Sat, 2009-03-28 at 22:50 +0800, nonopo12345 wrote:
>>> 
>>> Hi, I'm using HttpClient to connect to a URL. The problem is that I
>>> receive an invalid redirect location, for example
>>> http://wapp.baidu.com/f?kw=???????? , when visiting the URL.
>>>  
>>> Why do characters like "????????" appear? The correct
>>> redirect location should be
>>> http://wapp.baidu.com/f?kw=%B9%C2%D1%E3%B0%A7%C3%F9.
>>
>>This is most likely because the redirect location in the HTTP response
>>is not correctly escaped. HTTP messages are expected to consist of
>>US-ASCII characters only. Non-US-ASCII characters are supposed to be
>>escaped.
>>
>>Oleg
> 
> How can HttpClient escape non-US-ASCII characters correctly?
> Could you give me an example of visiting this URL:
> http://gate.baidu.com/tc?m=2&w=0_5_%E5%93%80%E9%B8%A3&t=wap&ssid=0&from=0&bd_page_type=0&p=b43ed516d9c21fff57ee96685c52&order=2&vit=osres&uid=wap_1237916098_46&src=http%3A%2F%2Fpost%2Ebaidu%2Ecom%2Ff%3Fkw%3D%B9%C2%D1%E3%B0%A7%C3%F9
> 
>>>  
>>> The exception is:
>>> org.apache.commons.httpclient.InvalidRedirectLocationException: Invalid
>>> redirect location: http://wapp.baidu.com/f?kw=????????
>>> at
>>> org.apache.commons.httpclient.HttpMethodDirector.processRedirectResponse(HttpMethodDirector.java:619) 
>>> at
>>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:179) 
>>> at
>>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) 
>>> at
>>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) 
>>> at cn.rjb.app.test.TestClickA.getSourceCode(TestClickA.java:28) 
>>> at cn.rjb.app.test.TestClickA.main(TestClickA.java:15) 
>>> Caused by: org.apache.commons.httpclient.URIException: Invalid query 
>>> at org.apache.commons.httpclient.URI.parseUriReference(URI.java:2049) 
>>> at org.apache.commons.httpclient.URI.<init>(URI.java:147) 
>>> at
>>> org.apache.commons.httpclient.HttpMethodDirector.processRedirectResponse(HttpMethodDirector.java:601) 
>>> ... 5 more 
>>> 
>>>  
>>> The code I use is:
>>>  
>>> public static void main(String args[]) {
>>>   
>>>   // the URL to visit
>>>   String requestURL =
>>> "http://gate.baidu.com/tc?m=2&w=0_5_%E5%93%80%E9%B8%A3&t=wap&ssid=0&from=0&bd_page_type=0&p=b43ed516d9c21fff57ee96685c52&order=2&vit=osres&uid=wap_1237916098_46&src=http%3A%2F%2Fpost%2Ebaidu%2Ecom%2Ff%3Fkw%3D%B9%C2%D1%E3%B0%A7%C3%F9";
>>>   
>>>   // use HttpClient to fetch the response for the URL
>>>   String response = null ;
>>>   HttpClient httpClient = new HttpClient();
>>>   GetMethod method = new GetMethod(requestURL);
>>>  
>>>   try {
>>>    httpClient.executeMethod(method);
>>>    response = new String(method.getResponseBody(), "utf-8");
>>>   } catch (Exception e) {
>>>    e.printStackTrace();
>>>   } finally {
>>>    method.releaseConnection();
>>>   }
>>>   
>>>   // print the response (null if the request failed)
>>>   System.out.println(response);
>>>  }
>>>  
>>> 
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
>>For additional commands, e-mail: httpclient-users-help@hc.apache.org
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/Invalid-redirect-location%3A-http%3A--wapp.baidu.com-f-kw%3D--------tp22757662p24934482.html
Sent from the HttpClient-User mailing list archive at Nabble.com.




Re: Invalid redirect location: http://wapp.baidu.com/f?kw=???????

Posted by Ken Krugler <kk...@transpac.com>.
Hi Vikrant,

On Aug 17, 2009, at 7:49pm, Vicky_Dev wrote:

>
> Thanks Ken for your response
>
>
> http://<server>:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/?q=Index_Type_s%3AproductIndex+AND+%28test_raman_sub%20Âpple%29&spellcheck=true&start=0&rows=10&qt=dismaxrequest
>
>
> I have tried setting the charset:
> objGetMethod.setRequestHeader("Content-Type", "text/plain; charset=" +
> "UTF-8");
> objGetMethod.setFollowRedirects(true);
>
> But I am still getting the "org.apache.commons.httpclient.URIException:
> Invalid query" error.

If I understand your situation correctly, then it's not going to make
a difference what you specify in the request header for the
content-type. The issue is that you're getting a redirect where the server
is sending back an improperly encoded URL in the Location response header.

So as per my previous response, you'll have to disable redirect  
following via setFollowRedirects(false), then handle the HTTP moved  
response codes yourself. At this time you'll have the chance to try to  
fix up the URL you're getting back in the response header.
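
As a rough illustration of the "fix up the URL" step, here is a minimal sketch that uses only the JDK. The class and method names (RedirectLocationFixer, escapeNonAscii) are made up for this example and are not part of HttpClient. It assumes the Location value reaches your code with its non-ASCII characters still intact, and that you know (or can guess) which charset the server intended. Both are assumptions you would need to verify for your server.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of HttpClient.
final class RedirectLocationFixer {

    // Percent-escapes every non-US-ASCII character of a redirect location,
    // using the bytes that character has in the given charset.
    // US-ASCII characters (including existing %XX escapes) are left untouched.
    static String escapeNonAscii(String location, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        StringBuilder out = new StringBuilder(location.length() + 16);
        int i = 0;
        while (i < location.length()) {
            int cp = location.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                byte[] bytes = location.substring(i, i + len).getBytes(charset);
                for (byte b : bytes) {
                    out.append('%');
                    out.append(Character.toUpperCase(Character.forDigit((b >> 4) & 0xF, 16)));
                    out.append(Character.toUpperCase(Character.forDigit(b & 0xF, 16)));
                }
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // example.invalid is a placeholder host.
        System.out.println(escapeNonAscii("http://example.invalid/f?kw=\u00C2pple", "UTF-8"));
    }
}
```

With setFollowRedirects(false), you would read the Location header from the 3xx response, pass its value through a helper like this, and then issue a new GET for the result. Note that this sketch does not escape other characters that are illegal in a URL, such as spaces.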

> We cannot encode the "Â" character, since Apache Solr's accent
> normalization will not work after encoding.

Sorry, no idea what this has to do with your problem. But note that if  
you have accented characters in your URLs, you'll need to make sure  
that the Solr webapp container (e.g. Tomcat) is properly configured to  
use UTF-8 for URLs.

-- Ken


--------------------------
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-210-6378


Re: Invalid redirect location: http://wapp.baidu.com/f?kw=???????

Posted by Vicky_Dev <vi...@yahoo.co.in>.
Thanks Ken for your response


http://<server>:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/?q=Index_Type_s%3AproductIndex+AND+%28test_raman_sub%20Âpple%29&spellcheck=true&start=0&rows=10&qt=dismaxrequest 


I have tried setting the charset:
objGetMethod.setRequestHeader("Content-Type", "text/plain; charset=" +
"UTF-8"); 
objGetMethod.setFollowRedirects(true);

But I am still getting the "org.apache.commons.httpclient.URIException: Invalid
query" error.

We cannot encode the "Â" character, since Apache Solr's accent normalization
will not work after encoding.

Please advise.

~Vikrant




-- 
View this message in context: http://www.nabble.com/Invalid-redirect-location%3A-http%3A--wapp.baidu.com-f-kw%3D--------tp22757662p25017600.html
Sent from the HttpClient-User mailing list archive at Nabble.com.




Re: Invalid redirect location: http://wapp.baidu.com/f?kw=???????

Posted by Ken Krugler <kk...@transpac.com>.
Hi Vikrant,

On Aug 12, 2009, at 4:26am, Vicky_Dev wrote:

> I am facing a similar issue when calling Solr (search engine) with
> HttpClient.
>
> The following URL works fine in a browser:
>
> http://<server>:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/?q=Index_Type_s%3AproductIndex+AND+%28test_raman_sub%20Âpple%29&spellcheck=true&start=0&rows=10&qt=dismaxrequest
>
> But the same URL fails when requested through HttpClient.
>
> Error:
> org.apache.commons.httpclient.URIException: Invalid query

I'm assuming the issue for your URL is that "Âpple" has a non-escaped  
character in it, and the encoding being used to process the URL is  
something other than UTF-8.

But I'm using HttpClient 4.x currently, and don't have the 3.x source  
handy - which it looks like you're using.
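
To make the "non-escaped character" point concrete, here is a small sketch that escapes a query parameter value as UTF-8 with the JDK's URLEncoder before the URL is handed to HttpClient. The class name SolrQueryUrl, the helper encodeParam, and the base URL are invented for illustration. Whether Solr then decodes the value as UTF-8 depends on how the servlet container is configured, so treat this as a starting point rather than a confirmed fix.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for building an escaped query string.
final class SolrQueryUrl {

    // Percent-encodes one query parameter value as UTF-8.
    // URLEncoder uses form encoding, so spaces become '+'.
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder base URL; substitute your own host and core path.
        String base = "http://localhost:8080/solr/select/";
        String q = "Index_Type_s:productIndex AND (test_raman_sub \u00C2pple)";
        System.out.println(base + "?q=" + encodeParam(q) + "&rows=10");
    }
}
```

Only the parameter value is encoded here. Encoding the whole URL would also escape the ':' and '/' of the scheme and path.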

One other inline comment below, from the older email question you  
referenced

[snip]

>> On 2009-03-30, "Oleg Kalnichevski" <ol...@apache.org> wrote:
>>> On Sat, 2009-03-28 at 22:50 +0800, nonopo12345 wrote:
>>>>
>>>> Hi, I'm using HttpClient to connect to a URL. The problem is that I
>>>> receive an invalid redirect location, for example
>>>> http://wapp.baidu.com/f?kw=???????? , when visiting the URL.
>>>>
>>>> Why do characters like "????????" appear? The correct
>>>> redirect location should be
>>>> http://wapp.baidu.com/f?kw=%B9%C2%D1%E3%B0%A7%C3%F9.
>>>
>>> This is most likely because the redirect location in the HTTP  
>>> response
>>> is not correctly escaped. HTTP messages are expected to consist of
>>> US-ASCII characters only. Non-US-ASCII characters are supposed to be
>>> escaped.
>>>
>>> Oleg
>>
>> How can HttpClient escape non-US-ASCII characters correctly?

This isn't an issue with HttpClient.

The problem is that the server is sending back an invalid redirect URL
(in the response header), where it hasn't been properly encoded as
US-ASCII.

When HttpClient tries to automatically follow this redirect, it runs  
into problems.

To fix this, you'd have to disable auto-following of redirects, then  
handle the redirect response yourself. If you set things up this way,  
you could try to detect improperly encoded redirect URLs in the  
response header, and fix them up before following them.
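
One possible shape for the detection step, sketched with plain Java and no HttpClient calls. The names RedirectCheck, isRedirectStatus, and needsFixup are invented for this example. It only flags characters outside printable US-ASCII. If the header has already been decoded lossily, so that the original bytes were replaced by '?', this check cannot recover or detect that.

```java
// Hypothetical helper for manual redirect handling.
final class RedirectCheck {

    // True for the HTTP status codes that carry a Location header to follow.
    static boolean isRedirectStatus(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307;
    }

    // True if the location contains characters that must not appear
    // unescaped in a URL: anything outside printable US-ASCII, or a space.
    static boolean needsFixup(String location) {
        if (location == null) {
            return false;
        }
        for (int i = 0; i < location.length(); i++) {
            char c = location.charAt(i);
            if (c <= 0x20 || c >= 0x7F) {
                return true;
            }
        }
        return false;
    }
}
```

After executing a request with redirects disabled, you would check the status with isRedirectStatus, read the Location header, and only repair the value when needsFixup returns true.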

-- Ken

--------------------------
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-210-6378

