Posted to httpclient-users@hc.apache.org by Vicky_Dev <vi...@yahoo.co.in> on 2009/08/12 13:26:29 UTC
Re: Invalid redirect location: http://wapp.baidu.com/f?kw=???????
I am facing a similar issue when calling Solr (search engine) with HttpClient.
The following URL works fine in a browser:
http://<server>:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/?q=Index_Type_s%3AproductIndex+AND+%28test_raman_sub%20Âpple%29&spellcheck=true&start=0&rows=10&qt=dismaxrequest
But the same URL fails from HttpClient.
Error:
org.apache.commons.httpclient.URIException: Invalid query
~Vikrant
nonopo12345 wrote:
>
> On 2009-03-30, "Oleg Kalnichevski" <ol...@apache.org> wrote:
>>On Sat, 2009-03-28 at 22:50 +0800, nonopo12345 wrote:
>>>
>>> Hi, I'm using HttpClient to connect to a URL. The problem is that I
>>> receive an invalid redirect location, for example
>>> http://wapp.baidu.com/f?kw=????????, when I visit the URL.
>>>
>>> Why do characters like "????????" appear? The correct
>>> redirect location should be
>>> http://wapp.baidu.com/f?kw=%B9%C2%D1%E3%B0%A7%C3%F9.
>>
>>This is most likely because the redirect location in the HTTP response
>>is not correctly escaped. HTTP messages are expected to consist of
>>US-ASCII characters only. Non-US-ASCII characters are supposed to be
>>escaped.
>>
>>Oleg
>
> How can HttpClient escape non-US-ASCII characters correctly?
> Could you give me an example of visiting this URL:
> http://gate.baidu.com/tc?m=2&w=0_5_%E5%93%80%E9%B8%A3&t=wap&ssid=0&from=0&bd_page_type=0&p=b43ed516d9c21fff57ee96685c52&order=2&vit=osres&uid=wap_1237916098_46&src=http%3A%2F%2Fpost%2Ebaidu%2Ecom%2Ff%3Fkw%3D%B9%C2%D1%E3%B0%A7%C3%F9
>
>>>
>>> The exception is:
>>> org.apache.commons.httpclient.InvalidRedirectLocationException: Invalid redirect location: http://wapp.baidu.com/f?kw=????????
>>> at org.apache.commons.httpclient.HttpMethodDirector.processRedirectResponse(HttpMethodDirector.java:619)
>>> at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:179)
>>> at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>> at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>> at cn.rjb.app.test.TestClickA.getSourceCode(TestClickA.java:28)
>>> at cn.rjb.app.test.TestClickA.main(TestClickA.java:15)
>>> Caused by: org.apache.commons.httpclient.URIException: Invalid query
>>> at org.apache.commons.httpclient.URI.parseUriReference(URI.java:2049)
>>> at org.apache.commons.httpclient.URI.<init>(URI.java:147)
>>> at org.apache.commons.httpclient.HttpMethodDirector.processRedirectResponse(HttpMethodDirector.java:601)
>>> ... 5 more
>>>
>>>
>>> The code I use is:
>>>
>>> import org.apache.commons.httpclient.HttpClient;
>>> import org.apache.commons.httpclient.methods.GetMethod;
>>>
>>> // Class name taken from the stack trace above.
>>> public class TestClickA {
>>>
>>>     public static void main(String[] args) {
>>>
>>>         // the URL to visit
>>>         String requestURL =
>>>             "http://gate.baidu.com/tc?m=2&w=0_5_%E5%93%80%E9%B8%A3&t=wap&ssid=0&from=0&bd_page_type=0&p=b43ed516d9c21fff57ee96685c52&order=2&vit=osres&uid=wap_1237916098_46&src=http%3A%2F%2Fpost%2Ebaidu%2Ecom%2Ff%3Fkw%3D%B9%C2%D1%E3%B0%A7%C3%F9";
>>>
>>>         // use HttpClient to fetch the response for the URL
>>>         String response = null;
>>>         HttpClient httpClient = new HttpClient();
>>>         GetMethod method = new GetMethod(requestURL);
>>>
>>>         try {
>>>             httpClient.executeMethod(method);
>>>             response = new String(method.getResponseBody(), "UTF-8");
>>>         } catch (Exception e) {
>>>             e.printStackTrace();
>>>         } finally {
>>>             // always return the connection, even on failure
>>>             method.releaseConnection();
>>>         }
>>>
>>>         // print the response (null if the request failed)
>>>         System.out.println(response);
>>>     }
>>> }
>>>
>>>
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
>>For additional commands, e-mail: httpclient-users-help@hc.apache.org
>>
>
>
--
View this message in context: http://www.nabble.com/Invalid-redirect-location%3A-http%3A--wapp.baidu.com-f-kw%3D--------tp22757662p24934482.html
Sent from the HttpClient-User mailing list archive at Nabble.com.
Re: Invalid redirect location: http://wapp.baidu.com/f?kw=???????
Posted by Ken Krugler <kk...@transpac.com>.
Hi Vikrant,
On Aug 17, 2009, at 7:49pm, Vicky_Dev wrote:
>
> Thanks Ken for your response
>
>
> http://<server>:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/?
> q=Index_Type_s%3AproductIndex+AND+%28test_raman_sub%20Âpple
> %29&spellcheck=true&start=0&rows=10&qt=dismaxrequest
>
>
> I have tried setting the charset:
> objGetMethod.setRequestHeader("Content-Type", "text/plain; charset=" +
> "UTF-8");
> objGetMethod.setFollowRedirects(true);
>
> But I am still getting the "org.apache.commons.httpclient.URIException:
> Invalid query" error.
If I understand your situation correctly, then it's not going to make
a difference what you specify in the request header for the
content-type. The issue is that you're getting a redirect where the
server is sending back an improperly encoded URL in the Location
response header.
So as per my previous response, you'll have to disable redirect
following via setFollowRedirects(false), then handle the HTTP moved
response codes yourself. At that point you'll have the chance to
fix up the URL you're getting back in the response header.
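
A minimal sketch of the "fix up the URL" step, using only the JDK. It
assumes the raw Location value was read with one character per header
byte (ISO-8859-1 style decoding), so the original bytes are still
recoverable. If the header was already decoded as US-ASCII, the bytes
are gone and this cannot bring them back; HttpClient 3.x appears to
have an "http.protocol.element-charset" parameter for that, but please
verify it against your version. LocationFixer and escapeNonAscii are
made-up names for illustration, not HttpClient API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpClient.
final class LocationFixer {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Percent-encodes every byte that is not a printable US-ASCII character.
    // Assumes rawLocation holds one char per header byte (ISO-8859-1 decoding).
    // Existing %XX escapes are left alone because '%' is printable ASCII.
    static String escapeNonAscii(String rawLocation) {
        byte[] bytes = rawLocation.getBytes(StandardCharsets.ISO_8859_1);
        StringBuilder out = new StringBuilder(bytes.length * 3);
        for (byte b : bytes) {
            int v = b & 0xFF;
            if (v > 0x20 && v < 0x7F) {
                out.append((char) v); // printable ASCII, keep as-is
            } else {
                out.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The unescaped bytes from the thread's example, one char per byte.
        String raw = "http://wapp.baidu.com/f?kw=\u00B9\u00C2\u00D1\u00E3\u00B0\u00A7\u00C3\u00F9";
        System.out.println(escapeNonAscii(raw));
    }
}
```

With HttpClient 3.x, the input would presumably come from
method.getResponseHeader("Location").getValue() after calling
setFollowRedirects(false), and the escaped result would be passed to a
new GetMethod.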
> We cannot encode the "Â" character, since Apache Solr's accent
> normalization will not work after encoding.
Sorry, no idea what this has to do with your problem. But note that if
you have accented characters in your URLs, you'll need to make sure
that the Solr webapp container (e.g. Tomcat) is properly configured to
use UTF-8 for URLs.
-- Ken
--------------------------
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-210-6378
Re: Invalid redirect location: http://wapp.baidu.com/f?kw=???????
Posted by Vicky_Dev <vi...@yahoo.co.in>.
Thanks Ken for your response
http://<server>:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/?q=Index_Type_s%3AproductIndex+AND+%28test_raman_sub%20Âpple%29&spellcheck=true&start=0&rows=10&qt=dismaxrequest
I have tried setting the charset:
objGetMethod.setRequestHeader("Content-Type", "text/plain; charset=" +
"UTF-8");
objGetMethod.setFollowRedirects(true);
But I am still getting the "org.apache.commons.httpclient.URIException:
Invalid query" error.
We cannot encode the "Â" character, since Apache Solr's accent
normalization will not work after encoding.
Please advise.
~Vikrant
--
View this message in context: http://www.nabble.com/Invalid-redirect-location%3A-http%3A--wapp.baidu.com-f-kw%3D--------tp22757662p25017600.html
Sent from the HttpClient-User mailing list archive at Nabble.com.
Re: Invalid redirect location: http://wapp.baidu.com/f?kw=???????
Posted by Ken Krugler <kk...@transpac.com>.
Hi Vikrant,
On Aug 12, 2009, at 4:26am, Vicky_Dev wrote:
> I am facing a similar issue when calling Solr (search engine) with
> HttpClient.
>
> The following URL works fine in a browser:
>
> http://<server>:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/?
> q=Index_Type_s%3AproductIndex+AND+%28test_raman_sub%20Âpple
> %29&spellcheck=true&start=0&rows=10&qt=dismaxrequest
>
> But the same URL fails from HttpClient.
>
> Error:
> org.apache.commons.httpclient.URIException: Invalid query
I'm assuming the issue for your URL is that "Âpple" has a non-escaped
character in it, and the encoding being used to process the URL is
something other than UTF-8.
But I'm using HttpClient 4.x currently, and don't have the 3.x source
handy - which it looks like you're using.
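
A minimal sketch of escaping the query value before the URL reaches
HttpClient, using only the JDK. SolrQueryUrl and build are made-up
names, and "localhost" stands in for the elided <server>. It assumes
UTF-8 is the right encoding for the target, which in turn presumably
requires the servlet container to decode URLs as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration only.
final class SolrQueryUrl {

    // Builds the select URL with the q parameter percent-encoded as UTF-8.
    static String build(String baseUrl, String query) {
        try {
            // URLEncoder emits form encoding: spaces become '+', other unsafe
            // characters become %XX escapes of their UTF-8 bytes.
            String q = URLEncoder.encode(query, "UTF-8");
            return baseUrl + "?q=" + q
                + "&spellcheck=true&start=0&rows=10&qt=dismaxrequest";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "localhost" is a placeholder host, not taken from the thread.
        String url = build(
            "http://localhost:8080/apache-solr-1.3.0/CORE_WWW.PUFFIN.CO.UK/select/",
            "Index_Type_s:productIndex AND (test_raman_sub \u00C2pple)");
        System.out.println(url);
    }
}
```

The container decodes the escapes before the webapp sees the parameter,
so the application should receive the original "Â" rather than the
escaped form, provided both sides agree on UTF-8.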
One other inline comment below, from the older email question you
referenced
[snip]
>> On 2009-03-30, "Oleg Kalnichevski" <ol...@apache.org> wrote:
>>> On Sat, 2009-03-28 at 22:50 +0800, nonopo12345 wrote:
>>>>
>>>> Hi, I'm using HttpClient to connect to a URL. The problem is that I
>>>> receive an invalid redirect location, for example
>>>> http://wapp.baidu.com/f?kw=????????, when I visit the URL.
>>>>
>>>> Why do characters like "????????" appear? The correct
>>>> redirect location should be
>>>> http://wapp.baidu.com/f?kw=%B9%C2%D1%E3%B0%A7%C3%F9.
>>>
>>> This is most likely because the redirect location in the HTTP
>>> response
>>> is not correctly escaped. HTTP messages are expected to consist of
>>> US-ASCII characters only. Non-US-ASCII characters are supposed to be
>>> escaped.
>>>
>>> Oleg
>>
>> How can HttpClient escape non-US-ASCII characters correctly?
This isn't an issue with HttpClient.
The problem is that the server is sending back an invalid redirect URL
(in the response header), where it hasn't been properly encoded as
US-ASCII.
When HttpClient tries to automatically follow this redirect, it runs
into problems.
To fix this, you'd have to disable auto-following of redirects, then
handle the redirect response yourself. If you set things up this way,
you could try to detect improperly encoded redirect URLs in the
response header, and fix them up before following them.
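
A minimal sketch of the detection step, using only the JDK. It treats
301, 302, 303 and 307 as the redirect status codes of interest and
flags any Location value that contains something other than printable
US-ASCII. RedirectCheck and its methods are made-up names for
illustration, not HttpClient API.

```java
// Hypothetical helper, not part of HttpClient.
final class RedirectCheck {

    // Status codes treated as redirects in this sketch.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307;
    }

    // True if the Location value contains a space, a control character, or
    // anything outside US-ASCII, meaning the server did not escape it.
    static boolean needsEscaping(String location) {
        for (int i = 0; i < location.length(); i++) {
            char c = location.charAt(i);
            if (c <= 0x20 || c >= 0x7F) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String escaped = "http://wapp.baidu.com/f?kw=%B9%C2%D1%E3%B0%A7%C3%F9";
        String unescaped = "http://wapp.baidu.com/f?kw=\u00B9\u00C2\u00D1\u00E3";
        System.out.println(needsEscaping(escaped));
        System.out.println(needsEscaping(unescaped));
    }
}
```

Note that a value whose bytes have already been replaced by literal "?"
characters looks like valid ASCII to this check, so it only helps if the
header is read before that substitution happens.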
-- Ken
--------------------------
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-210-6378