Posted to dev@lucene.apache.org by Furkan KAMACI <fu...@gmail.com> on 2013/11/23 01:03:54 UTC

URL Encoding of Plus Sign Character Problem at Solr

The API docs for Java's URLEncoder give this explanation:


*When encoding a String, the following rules apply:*

*The space character " " is converted into a plus sign "+".*

at: http://docs.oracle.com/javase/6/docs/api/java/net/URLEncoder.html

So although one might expect %20 for a space character, *+* is also a valid
encoding. I ran into an issue with this when connecting my custom search
API to Solr.

The Solr wiki says:

"Please note that many characters in the Solr Query Syntax (most notable
the plus sign: "+") are special characters in URLs, so when constructing
request URLs manually, you must properly URL-Encode these characters."

The example from the wiki is:

q=+popularity:[10 TO *] +section:0

http://localhost:8983/solr/select?q=%2Bpopularity:[10%20TO%20*]%20%2Bsection:0

However, converting a space character into a plus sign is a valid URL
encoding. Should the client be responsible for handling this, or should Solr's
code do it (e.g. by calling .replace("+", "%20"))? I can open a Jira issue
for it and submit a patch.

Thanks,
Furkan KAMACI

Re: URL Encoding of Plus Sign Character Problem at Solr

Posted by Shawn Heisey <so...@elyograg.org>.
On 11/22/2013 5:03 PM, Furkan KAMACI wrote:
> Java URL encoder has that explanation at its api docs:
>
> *When encoding a String, the following rules apply:*
>
> *The space character " " is converted into a plus sign "+".*
>
> at: http://docs.oracle.com/javase/6/docs/api/java/net/URLEncoder.html
>
> so its expected to have %20 for a space character but *+* is a valid
> encoding. I've faced with a similar issue when interacting my custom 
> search API and Solr.
>
> Solr wiki says that:
>
> "Please note that many characters in the Solr Query Syntax (most 
> notable the plus sign: "+") are special characters in URLs, so when 
> constructing request URLs manually, you must properly URL-Encode these 
> characters."
>
> Example from wiki is that:
>
> q=+popularity:[10 TO *] +section:0
> http://localhost:8983/solr/select?q=%2Bpopularity:[10%20TO%20*]%20%2Bsection:0
>
> However converting a space character into a plus sign is a valid 
> encoding of URL. Should client be responsible for such kind of things 
> or Solr code (i.e. calling: .replace("+", "%20"))? I can fire a Jira 
> for it and apply a patch.

If you only use the plus sign when you want an actual plus sign in the 
query, and use real spaces when constructing the query, you can URL 
encode the entire thing and it will work.  The plus signs will be 
encoded as %2B, so Solr will see a real plus sign.
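
As a minimal sketch of that approach in Java: build the query with real spaces and literal plus signs, then encode the whole parameter value once. The class and method names here (SolrQueryEncoding, encodeParam, buildSelectUrl) are made up for illustration, and the base URL is simply the one from the wiki example. Note that java.net.URLEncoder already writes spaces as "+" and also percent-encodes ":", "[" and "]", so its output looks different from the wiki URL while decoding to the same query.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SolrQueryEncoding {

    // Encode a raw parameter value exactly once.
    // URLEncoder turns ' ' into '+' and a literal '+' into "%2B".
    static String encodeParam(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: append the encoded query to a select URL.
    static String buildSelectUrl(String baseUrl, String rawQuery) {
        return baseUrl + "?q=" + encodeParam(rawQuery);
    }

    public static void main(String[] args) {
        // Raw Solr query: real spaces, and '+' only where a plus sign is meant.
        String rawQuery = "+popularity:[10 TO *] +section:0";
        System.out.println(
            buildSelectUrl("http://localhost:8983/solr/select", rawQuery));
    }
}
```

The important part is that the raw query is never hand-edited after encoding, so every "+" in the final URL stands for a space and every "%2B" stands for a real plus sign.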

Using a plus sign for a space is completely optional. After the encoding 
mentioned above, if you want to replace %20 with a plus sign, it will 
shrink the URL a little bit and make it easier to read.  It's not 
necessary, though.
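
To illustrate why the two forms are interchangeable, here is a small sketch that uses java.net.URLDecoder as a stand-in for the form decoding a servlet container performs on query parameters (an assumption for illustration; this is not Solr's own code). The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class PlusVersusPercent20 {

    // Decode a query-string value the way form decoding does:
    // '+' becomes a space, "%XX" becomes the corresponding character.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Optional shortening step: swap "%20" for '+' AFTER encoding.
    static String shorten(String encoded) {
        return encoded.replace("%20", "+");
    }

    public static void main(String[] args) {
        String withPercent20 = "%2Bpopularity:[10%20TO%20*]%20%2Bsection:0";
        String withPlus = shorten(withPercent20);

        // Both forms decode to the same Solr query.
        System.out.println(decode(withPercent20));
        System.out.println(decode(withPlus));

        // An unencoded '+' is read as a space, so the operator is lost.
        System.out.println(decode("+section:0"));
    }
}
```

The last line shows the failure mode the wiki warns about: a plus sign that was meant as a Solr operator but was not encoded as %2B arrives as a space.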

Thanks,
Shawn

