Posted to solr-user@lucene.apache.org by Prasi S <pr...@gmail.com> on 2013/09/13 09:56:39 UTC

Escaping *, ? in Solr

Hi,
I want to do a regex search in Solr.

E.g.: Googl*. In my query API, I have used the ClientUtils.escapeQueryChars
function to escape characters that are special to Solr.

In the above case, a search for
1. Google -> gives 677 records.
2. Googl* -> escaped as Googl\* in code -> gives 12 results.
3. q=Google* given directly in the browser -> gives 677 records.

Which is correct if I want to achieve a regex search (Googl*)? Should I
avoid escaping * and ? in the code in order to handle regex?

Please suggest.

Thanks,
Prasi.

Re: Escaping *, ? in Solr

Posted by Jack Krupansky <ja...@basetechnology.com>.
The asterisk and question mark are wildcards, not regex. A regex query is a
regular expression enclosed in slashes, such as:

q=/Googl.*/

Note that not all analyzer filters will be applied to regex terms, so you
may need to do the analysis yourself, although simple filters like the
lowercase filter should work fine.

-- Jack Krupansky


Re: Escaping *, ? in Solr

Posted by Shawn Heisey <so...@elyograg.org>.
On 9/13/2013 1:56 AM, Prasi S wrote:
> I want to do regex search in solr.
>
> E.g.: Googl*. In my query API, I have used the ClientUtils.escapeQueryChars
> function to escape characters that are special to Solr.
>
> In the above case, a search for
> 1. Google -> gives 677 records.
> 2. Googl* -> Escaped as Googl\* in code-> gives 12 results
> 3. When given q=Google* directly in the Browser -> gives 677 records.
>
> Which is correct if I want to achieve regex search ( Googl*). Should i
> restrict from escaping *, ? in the code for handling regex?

Your third example is using * as a wildcard.  That's NOT the same thing 
as regex.

If you sent q=/Google.*/ then that would be a regex that should do the 
same thing as your wildcard example.  This requires Solr 4.0 or later.
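
To make the wildcard/regex correspondence concrete, here is a minimal sketch in
plain Java (no SolrJ) that rewrites a wildcard pattern into the slash-delimited
regex form. The class and method names are hypothetical, and the list of regex
metacharacters being escaped is an assumption that should be checked against
the Lucene RegExp documentation for your version.

```java
public class WildcardToRegexSketch {
    // Assumed set of regex metacharacters (plus '/', which delimits the
    // regex in the query syntax). '*' and '?' are handled separately below.
    private static final String REGEX_META = ".+|{}[]()\"\\#@&<>~/";

    // Hypothetical helper: turns a wildcard pattern such as Google* into
    // the regex query form /Google.*/.
    static String wildcardToRegexQuery(String wildcard) {
        StringBuilder sb = new StringBuilder("/");
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                sb.append(".*");          // wildcard *: any run of characters
            } else if (c == '?') {
                sb.append('.');           // wildcard ?: exactly one character
            } else {
                if (REGEX_META.indexOf(c) >= 0) {
                    sb.append('\\');      // keep literal characters literal
                }
                sb.append(c);
            }
        }
        return sb.append('/').toString();
    }

    public static void main(String[] args) {
        System.out.println("q=" + wildcardToRegexQuery("Google*"));
        System.out.println("q=" + wildcardToRegexQuery("Goo?le"));
    }
}
```

The resulting string still has to be URL-encoded when it is sent as the q
parameter.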

You can't use the escapeQueryChars method if you want to do a regex
or wildcard search.  The point of that escape method is to search
for special characters literally rather than let them have their special
meanings, which is why your escaped Googl\* query returns different results.
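
If the rest of the user input still needs escaping, one option is to escape
everything except the wildcards. The following is only a sketch in plain Java:
the class and method are hypothetical (this is not part of SolrJ), and the
character list is an assumption modeled on what escapeQueryChars is understood
to escape, minus * and ?.

```java
public class WildcardEscapeSketch {
    // Assumed set of query-syntax characters, deliberately omitting * and ?.
    private static final String SPECIAL = "\\+-!():^[]\"{}~|&;/";

    // Hypothetical helper: backslash-escapes special characters and
    // whitespace, but leaves * and ? alone so they still act as wildcards.
    static String escapeKeepingWildcards(String input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeKeepingWildcards("Googl*"));
        System.out.println(escapeKeepingWildcards("a:b?"));
    }
}
```

With this approach Googl* is passed through unchanged, while a colon or
parenthesis in the input is still treated as a literal character.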

Thanks,
Shawn