Posted to user@any23.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2013/10/22 14:00:52 UTC

Re: user Digest 20 Oct 2013 15:17:20 -0000 Issue 48

On Sun, Oct 20, 2013 at 4:17 PM, <us...@any23.apache.org> wrote:

>
> user Digest 20 Oct 2013 15:17:20 -0000 Issue 48
>
> Topics (messages 110 through 111)
>
> Re: URL Encoding Issues in Apache Any23
>         110 by: S.L
>
> Foaf:Depiction Values in Any23 getting transformed.
>         111 by: S.L
>
> Administrivia:
>
> ---------------------------------------------------------------------
> To post to the list, e-mail: user@any23.apache.org
> To unsubscribe, e-mail: user-digest-unsubscribe@any23.apache.org
> For additional commands, e-mail: user-digest-help@any23.apache.org
>
> ----------------------------------------------------------------------
>
>
>
> Lewis,
>
> That is correct; that is the only discrepancy I have noticed so far. I
> think what is happening here is that Any23 is encoding an already encoded
> URL. I have not found a way in Java to avoid encoding an already encoded
> URL. Is there a way to do so? Does Any23 consider the possibility that the
> URL is already encoded?
>
> Thanks.
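
On the question of avoiding a second round of encoding in Java: below is a
minimal sketch, not what Any23 does internally (I have not traced that). It
assumes the input may already contain %XX escapes and copies those through
untouched, escaping only characters that cannot appear literally in a URI.
The names EncodeOnce and encodeOnce are made up for illustration. Note that
a literal '%' that happens to be followed by two hex digits cannot be told
apart from a real escape, so this is a heuristic at best.

```java
import java.nio.charset.StandardCharsets;

class EncodeOnce {
    // Characters allowed to appear literally (RFC 3986 unreserved + reserved).
    private static final String LITERAL =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
        + "-._~:/?#[]@!$&'()*+,;=";

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Hypothetical helper: escapes illegal characters, leaves existing %XX alone.
    static String encodeOnce(String url) {
        StringBuilder out = new StringBuilder(url.length());
        int i = 0;
        while (i < url.length()) {
            int cp = url.codePointAt(i);
            // An existing escape such as %23 is copied through unchanged.
            if (cp == '%' && i + 2 < url.length()
                    && isHex(url.charAt(i + 1)) && isHex(url.charAt(i + 2))) {
                out.append(url, i, i + 3);
                i += 3;
                continue;
            }
            if (cp < 128 && LITERAL.indexOf(cp) >= 0) {
                out.append((char) cp);
            } else {
                // Everything else (space, a stray '%', non-ASCII) is escaped as UTF-8 bytes.
                String s = new String(Character.toChars(cp));
                for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Already-encoded input is expected to come back unchanged.
        System.out.println(encodeOnce("http://www.xxx.com/p?qp=id%23%23-1&usc=All+Categories"));
        // Unencoded input gets its space escaped.
        System.out.println(encodeOnce("http://www.xxx.com/p?usc=All Categories"));
    }
}
```

With this approach the %23 and + in the submitted URL would stay exactly as
they were sent, which is the behaviour being asked for.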
>
>
> On Wed, Oct 2, 2013 at 8:39 PM, Lewis John Mcgibbney <
> lewis.mcgibbney@gmail.com> wrote:
>
>> Hi,
>>
>> On Sun, Sep 29, 2013 at 6:44 PM, <us...@any23.apache.org> wrote:
>>
>>>
>>> I seem to be running into an issue where the URL submitted to Any23 is
>>> encoded in a way that makes it invalid. I am not sure whether the URL
>>> being encoded had already been encoded by Any23, or whether Any23 just
>>> encoded it in the wrong format.
>>>
>>
>> From what I can see below, the hash (#) in the 2nd (encoded) URL is the
>> only difference. Is this correct? Are there any other discrepancies which
>> you've noticed?
>> I checked our Jira instance and nothing like this has been reported
>> before.
>> Thanks
>> Lewis
>>
>>
>>>
>>> Please see the example below and advise.
>>>
>>> URL (submitted to Any23, i.e. before encoding happens):
>>>
>>>
>>> http://www.xxx.com/site/searchpage.jsp?_dyncharset=ISO-8859-1&id=pcat17071&type=page&ks=960&st=Just_Dance_Disney_Party_67055&sc=Global&cp=1&sp=&qp=crootcategoryid%23%23-1%23%23-1~~q4a7573745f44616e63655f4469736e65795f50617274795f3637303535~~ncabcat0700000%23%231%23%231&list=y&usc=All+Categories&nrp=15&iht=n
>>>
>>> URL after encoding (I know this from printing the URL in
>>> DefaultHTTPClient.java):
>>>
>>>
>>> http://www.xxx.com/site/searchpage.jsp?_dyncharset=ISO-8859-1&id=pcat17071&type=page&ks=960&st=Just_Dance_Disney_Party_67055&sc=Global&cp=1&sp=&qp=crootcategoryid#%23-1%23%23-1~~q4a7573745f44616e63655f4469736e65795f50617274795f3637303535~~ncabcat0700000%23%231%23%231&list=y&usc=All%20Categories&nrp=15&iht=n
>>>
>>>
>>>
>>
>>
>> --
>> *Lewis*
>>
>
>
> I am parsing the URL below and I am interested in the foaf:depiction:
>
>
> http://www.kmart.com/canon-eos-rebel-t3i-18-55mm-is-ii/p-00339693000P?prdNo=1&blockNo=1&blockType=G1
>
> The foaf:depiction I get is the following:
>
>
> http://s.shld.net/is/image/Sears/http://c.shld.net/rpx/i/s/i/spin/image/spin_prod_ec_463043901?hei=&wid=&op_sharpen=&resMode=sharp&op_usm=0.9,0.5,0,0
>
> However, in the HTML markup the foaf:depiction is:
>
>
> http://s.shld.net/is/image/Sears/http://c.shld.net/rpx/i/s/i/spin/image/spin_prod_ec_463043901?hei=&amp;wid=&amp;op_sharpen=&amp;resMode=sharp&amp;op_usm=0.9,0.5,0,0
>
> It looks like the foaf:depiction URL is getting transformed during Any23
> parsing. I am using the latest Any23 trunk code.
>
> The &amp; in the markup is getting replaced with &.
>
> Please advise.
>
>
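
On the foaf:depiction question: in HTML source, &amp; inside an attribute
value is the escaped form of a plain &, so any parser that decodes character
references will hand back & in the extracted value. The sketch below shows
that behaviour using the JDK's built-in XML parser as a stand-in for an HTML
parser. That substitution is an assumption made to keep the example
self-contained; I have not traced which component in Any23 does the
decoding. The class name, method name and example.org URL are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class AmpDecode {
    // Hypothetical helper: parses a single element and returns one attribute value.
    static String attributeValue(String markup, String attr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(markup)));
            // The parser has already decoded character references at this point.
            return doc.getDocumentElement().getAttribute(attr);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        // Source markup carries the escaped form, as in the page's HTML.
        String markup = "<img src=\"http://example.org/image?hei=&amp;wid=&amp;resMode=sharp\"/>";
        // Prints the URL with plain '&' separators.
        System.out.println(attributeValue(markup, "src"));
    }
}
```

If that is what is happening here, the value with plain & is the decoded URL
and the one with &amp; is only its serialized form in the page source.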


-- 
*Lewis*