Posted to solr-user@lucene.apache.org by Madhav Bahuguna <ma...@gmail.com> on 2015/05/08 08:28:02 UTC

How to handle special characters in fuzzy search query

My Solr query is implemented in two parts: the first query does an exact
search, and if it returns no results, a second query does a fuzzy search.
Everything works fine except in situations like this: a user enters "burg +".
The exact search returns no records, so the second query is called to do a
fuzzy search. The problem is that my fuzzy query does not understand
special characters such as +, - and *, and it throws an error. If I don't pass
special characters, it works fine. But in the real world a user can include
such characters in a search, which will then throw an error.
I am stuck on this and don't know how to resolve the issue.
This is what my exact search query looks like:

    $query1="(business_name:$data*^100 OR city_name:$data*^1 OR
    locality_name:$data*^6 OR business_search_tag_name:$data*^8 OR
    type_name:$data*^7) AND (business_active_flag:1) AND
    (business_visible_flag:1) AND (delete_status_businessmasters:0)";

This is what my fuzzy query looks like:

    $query2='(_query_:%20"{!complexphrase%20qf=business_name^100+type_name^0.4+locality_name^6%27}%20'.$url_new.')AND(business_active_flag:1)AND(business_point:[1.5 TO 2.0])&q.op=AND&wt=json&indent=true';

I am new to Solr and don't know how to tackle this situation.

Details
SolrPhpClient
PHP
Solr 4.9


-- 
Regards
Madhav Bahuguna

Re: How to handle special characters in fuzzy search query

Posted by Tomasz Borek <to...@gmail.com>.
FWIW, you may also want to drop the boolean operators in favour of the + and -
prefixes (OR being the default).

Regards,
LAFK
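
A minimal sketch of the prefix style suggested above, applied to the exact
search query from the original post. The function name buildExactQuery is
hypothetical, and the sketch assumes the default operator is OR (so the
clauses inside the parentheses stay optional) and that $term has already been
escaped or scrubbed:

```php
<?php
// Hypothetical helper: rebuilds the exact-search query using the + prefix
// instead of AND/OR. Assumes q.op is OR and $term is already safe to embed.
function buildExactQuery(string $term): string
{
    // Field => boost, copied from the $query1 string in the original post.
    $boosts = [
        'business_name'            => 100,
        'city_name'                => 1,
        'locality_name'            => 6,
        'business_search_tag_name' => 8,
        'type_name'                => 7,
    ];

    $clauses = [];
    foreach ($boosts as $field => $boost) {
        $clauses[] = "{$field}:{$term}*^{$boost}";
    }

    // Clauses that every matching document must satisfy.
    $required = [
        'business_active_flag:1',
        'business_visible_flag:1',
        'delete_status_businessmasters:0',
    ];

    // +( a b c ) means "at least one of a, b, c"; each +x is mandatory.
    return '+(' . implode(' ', $clauses) . ') +' . implode(' +', $required);
}

echo buildExactQuery('burg'), "\n";
```

A leading - on a clause would exclude matching documents in the same way
that + requires them.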

2015-05-08 18:59 GMT+02:00 Erick Erickson <er...@gmail.com>:

> Steven:
>
> They're listed on the ref guide I posted. Not a concise list, but
> you'll see && || and other "interesting" bits.

Re: How to handle special characters in fuzzy search query

Posted by Erick Erickson <er...@gmail.com>.
Steven:

They're listed on the ref guide I posted. Not a concise list, but
you'll see && || and other "interesting" bits.
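
As an illustration of the word operators asked about in this exchange, here is
a hedged sketch. It assumes the parser in use only recognises AND, OR and NOT
as operators when they are written in upper case (some parsers can be
configured otherwise, so check yours). The helper name neutralizeOperators is
hypothetical. The symbolic forms && and || are not handled here; they need
backslash escaping or stripping like the other special characters.

```php
<?php
// Hypothetical helper: lower-case stand-alone AND / OR / NOT so that
// user-typed words are searched as ordinary terms, not treated as operators.
// Assumption: the query parser only treats the upper-case forms as operators.
function neutralizeOperators(string $input): string
{
    return preg_replace_callback(
        '/\b(AND|OR|NOT)\b/',
        function (array $match): string {
            return strtolower($match[1]);
        },
        $input
    );
}

echo neutralizeOperators('pizza AND pasta'), "\n"; // pizza and pasta
echo neutralizeOperators('ORLANDO'), "\n";         // ORLANDO (whole words only)
```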

On Fri, May 8, 2015 at 9:20 AM, Steven White <sw...@gmail.com> wrote:
> Hi Erick,
>
> Is there a documented list of all operators (AND, OR, NOT, etc.) that also
> need to be escaped?  Are there more besides the 3 I listed?
>
> Thanks
>
> Steve

Re: How to handle special characters in fuzzy search query

Posted by Steven White <sw...@gmail.com>.
Hi Erick,

Is there a documented list of all operators (AND, OR, NOT, etc.) that also
need to be escaped?  Are there more besides the 3 I listed?

Thanks

Steve

On Fri, May 8, 2015 at 11:47 AM, Erick Erickson <er...@gmail.com>
wrote:

> Each of the characters you identified has a meaning to the query
> parser: '+' marks a mandatory clause, '-' is the NOT operator, and
> '*' is a wildcard. To get through the query parser as literal text, these
> (and a bunch of others, see below) must be escaped.
>
> Personally, though, I'd pre-scrub the data. Depending on your analysis
> chain such things may be thrown away anyway.
>
> https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser
> - the "escaping special characters" bit.
>
> Best,
> Erick

Re: How to handle special characters in fuzzy search query

Posted by Erick Erickson <er...@gmail.com>.
Each of the characters you identified has a meaning to the query
parser: '+' marks a mandatory clause, '-' is the NOT operator, and
'*' is a wildcard. To get through the query parser as literal text, these
(and a bunch of others, see below) must be escaped.

Personally, though, I'd pre-scrub the data. Depending on your analysis
chain such things may be thrown away anyway.

https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser
- the "escaping special characters" bit.

Best,
Erick
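
The two options described in this reply (escaping versus pre-scrubbing) can be
sketched in PHP roughly as follows. Both function names are hypothetical and
are not part of SolrPhpClient. The character list follows the "escaping
special characters" section of the ref guide linked above, plus the backslash
itself; verify it against the guide for your Solr version.

```php
<?php
// Hypothetical helper: backslash-escape every character the standard query
// parser treats as syntax, so the user's text is searched literally.
// && and || are covered because each & and | is escaped individually.
function escapeSolrTerm(string $input): string
{
    return preg_replace('/([+\-&|!(){}\[\]^"~*?:\/\\\\])/', '\\\\$1', $input);
}

// Hypothetical helper: the "pre-scrub" alternative. Replaces the same
// characters with a space, then collapses and trims whitespace.
function scrubSolrTerm(string $input): string
{
    $stripped = preg_replace('/[+\-&|!(){}\[\]^"~*?:\/\\\\]+/', ' ', $input);
    return trim(preg_replace('/\s+/', ' ', $stripped));
}

echo escapeSolrTerm('burg +'), "\n"; // burg \+
echo scrubSolrTerm('burg +'), "\n";  // burg
```

Applying one of these to the user input before it is placed into $query1 or
$query2 should keep "burg +" from reaching the parser as raw syntax. Note that
neither helper escapes spaces, so multi-word input still needs to be grouped or
split per field. Scrubbing also changes hyphenated input ("co-op" becomes
"co op"), which may or may not matter depending on the analysis chain.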
