Posted to solr-user@lucene.apache.org by manju16832003 <ma...@gmail.com> on 2014/02/24 04:58:47 UTC

Issue with PHP urlencode and solr encoding

Hi,
I have come across an issue with URL encoding between PHP and Solr.
I have a field indexed with value *WBE(Honda Edix)* in Solr.

From PHP code, if I urlencode($string) and send it to Solr, I do not get
accurate results.
Here is the relevant part of the Solr query: *fq=model:WBE(Honda+Edix)*

However, if I send *fq=model:WBE\(Honda+Edix\)* directly to Solr, I get
accurate results.

I assume that '(' and ')' are treated as part of the Solr query syntax.

How do I escape '(' and ')' from the client side?




--
View this message in context: http://lucene.472066.n3.nabble.com/Issue-with-PHP-urlencode-and-solr-encoding-tp4119176.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Issue with PHP urlencode and solr encoding

Posted by manju16832003 <ma...@gmail.com>.
Hi Shawn and Rico,
Thank you for your suggestions, they are valuable :-).

If a phrase query does not always work as expected, I guess we could use
a *TermQuery* instead.

http://blog.florian-hopf.de/2013/01/make-your-filters-match-faceting-in-solr.html

This worked fine *fq={!term%20f=model%20v="WBE(Honda%20Edix)"}*.
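
A minimal PHP sketch of building that kind of filter on the client side. The helper name `termFilter` is hypothetical, and the sketch assumes the shorter `{!term f=field}value` form of the term query parser rather than the `v="..."` form above, so that a double quote inside the value cannot end the parameter early. The value is passed through untouched apart from URL encoding, because the term parser does not interpret query syntax characters.

```php
<?php
// Hypothetical helper: build a term-parser filter for a field and a raw value.
function termFilter(string $field, string $value): string
{
    // Everything after the closing brace is taken as the literal term,
    // so '(' and ')' need no backslash escaping here.
    return '{!term f=' . $field . '}' . $value;
}

// http_build_query() URL-encodes each parameter value for us.
$queryString = http_build_query([
    'q'  => '*:*',
    'fq' => termFilter('model', 'WBE(Honda Edix)'),
]);

echo $queryString, "\n";
```

This only matches when the field is indexed as a single untokenized term (a string field, for example), which seems to be the case for the model field discussed here.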

I agree with Shawn's comment that "Escaping query characters must be done
before URL encoding."

:-).
Thanks again for your replies.


Re: Issue with PHP urlencode and solr encoding

Posted by Rico P <go...@gmail.com>.
On Mon, Feb 24, 2014 at 11:52 AM, Shawn Heisey <so...@elyograg.org> wrote:
>
>
> The Solarium library for PHP also says that it does escaping, but I
> can't find the manual section that they mention about term escaping.
> Here's a section that has an example of phrase escaping (putting the
> value in double quotes):
>
> http://wiki.solarium-project.org/index.php/V3:Escaping
>
> Thanks,
> Shawn
>

They do have it:
https://github.com/basdenooijer/solarium/blob/master/library/Solarium/Core/Query/Helper.php#L104

regards,

rico

Re: Issue with PHP urlencode and solr encoding

Posted by Shawn Heisey <so...@elyograg.org>.
On 2/23/2014 8:58 PM, manju16832003 wrote:
> I come across the issue with urlencoding between PHP and Solr.
> I have a field indexed with value *WBE(Honda Edix)* in Solr.
> 
> From PHP codes, if I urlencode($string) and send to Solr, I do not get the
> accurate results.
> Here is the part of the solr query *fq=model:WBE(Honda+Edix)*
> 
> However, If I do it *fq=model:WBE\(Honda+Edix\)* this way directly from
> Solr, I would get the accurate results.
> 
> I assume that the '(' and ')' part of the solr query.
> 
> How do I escape '(' and ')' from the client side.

This reply got to be a lot longer than I intended.  Here's the novel:

URL encoding is only needed because you are constructing a URL.  Solr
decodes those values before passing them to the query parser, so URL
encoding alone does nothing to escape query syntax.

The query parser has its own set of characters that are special.  If you
intend any of these characters to be literal, they must be escaped with
a backslash.

http://lucene.apache.org/core/4_6_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html?is-external=true#Escaping_Special_Characters

Although there is some overlap between URL encoding and query escaping,
they do have different lists of characters that require changing.
Escaping query characters must be done before URL encoding.
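
A rough PHP sketch of that ordering, for illustration only. The helper name `escapeSolrTerm` is hypothetical, and its character list is an assumption based on the query parser documentation linked above, with whitespace added so that a multi-word value stays a single term. It is not the code of any particular library.

```php
<?php
// Hypothetical helper: backslash-escape characters that the classic
// query parser treats as syntax, plus whitespace.
function escapeSolrTerm(string $value): string
{
    return preg_replace('/([\\\\+\-!():^\[\]"{}~*?|&;\/\s])/', '\\\\$1', $value);
}

// Step 1: escape for the query parser.
$escaped = 'model:' . escapeSolrTerm('WBE(Honda Edix)');

// Step 2: URL-encode the already escaped value for the request.
$param = 'fq=' . urlencode($escaped);

echo $escaped, "\n"; // model:WBE\(Honda\ Edix\)
echo $param, "\n";   // fq=model%3AWBE%5C%28Honda%5C+Edix%5C%29
```

Doing it in the other order would backslash-escape characters in the encoded string instead of the parentheses themselves, which by then are already %28 and %29.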

Another way to allow special characters in your query is to make it a
phrase query - enclose it in double quotes.  This would be your query
using this method, before URL encoding:

fq=model:"WBE(Honda Edix)"

Note that the phrase query method does not always produce the expected
results, and depending on your configuration, in some cases won't work
at all.
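
The phrase form can be sketched in PHP along these lines. The helper name `solrPhrase` is hypothetical; it assumes that inside double quotes only the quote character and the backslash still need escaping.

```php
<?php
// Hypothetical helper: wrap a value in double quotes for a phrase query,
// escaping embedded backslashes and double quotes.
function solrPhrase(string $value): string
{
    // strtr() with an array replaces all pairs in a single pass,
    // so the backslashes it adds are not escaped a second time.
    return '"' . strtr($value, ['\\' => '\\\\', '"' => '\\"']) . '"';
}

$phrase = 'model:' . solrPhrase('WBE(Honda Edix)');

echo $phrase, "\n";                     // model:"WBE(Honda Edix)"
echo 'fq=' . urlencode($phrase), "\n";  // fq=model%3A%22WBE%28Honda+Edix%29%22
```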

The PECL Solr library for PHP has a query escaping method similar to
what can be found in SolrJ.  Here's their documentation reference for it:

http://www.php.net/manual/en/solrutils.escapequerychars.php

The Solarium library for PHP also says that it does escaping, but I
can't find the manual section that they mention about term escaping.
Here's a section that has an example of phrase escaping (putting the
value in double quotes):

http://wiki.solarium-project.org/index.php/V3:Escaping

There is a bug in the PECL library that makes it not work with Solr 4.x.
I created a patch for this bug, but they haven't fixed it in any
downloadable version.

https://bugs.php.net/bug.php?id=62332

Thanks,
Shawn