Posted to solr-user@lucene.apache.org by deniz <de...@gmail.com> on 2014/02/27 07:36:42 UTC

Searching with special chars

Hello,

We are facing a rather odd problem. Here is the scenario:

We have a frontend and a middleware layer that processes user search
queries before posting them to Solr.

When a user enters city:Frankenthal_(Pfalz) and searches, no results come
back, although some documents have fields matching
city:Frankenthal_(Pfalz). We are aware that we can escape those chars, but
the middleware that accepts the queries runs on a Glassfish server, which
rejects URLs containing backslashes, so backslash escaping is not an option
for posting the query.

To make the setup clear, the system looks like this:

(PHP) -> Encoded JSON -> (Glassfish App - Middleware) -> Javabin -> Solr

Any other ideas on how to deal with queries containing special chars like this one?



-----
Smart, but doesn't study... Would do well if he did...
--
View this message in context: http://lucene.472066.n3.nabble.com/Searching-with-special-chars-tp4120047.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Searching with special chars

Posted by "Petersen, Robert" <ro...@mail.rakuten.com>.
I agree with Erick, but if you want the special characters to count in searches, you might consider replacing them with textual placeholders rather than just stripping them out (the same replacement would also have to be done at indexing time). For instance, I replace C# with csharp and C++ with cplusplus, both during indexing and during searching, before passing the text along to my Solr layer.
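
A minimal sketch of that placeholder idea, written in Python purely for illustration (the thread's frontend is PHP). The mapping contains only the two examples mentioned above; the names PLACEHOLDERS and normalize are hypothetical, and the same function would have to run on documents before indexing and on queries before searching.

```python
import re

# Only the two substitutions mentioned in the post; extend as needed.
PLACEHOLDERS = {"c#": "csharp", "c++": "cplusplus"}

# Longest keys first so a longer token is never shadowed by a shorter one.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(PLACEHOLDERS, key=len, reverse=True)),
    re.IGNORECASE,
)

def normalize(text: str) -> str:
    """Replace each known special-character token with its placeholder."""
    return _PATTERN.sub(lambda m: PLACEHOLDERS[m.group(0).lower()], text)

print(normalize("C# and C++ jobs"))  # prints: csharp and cplusplus jobs
```

Because both sides go through the same normalization, the special characters never reach the query parser at all.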

Hope that helps,
Robi

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Thursday, February 27, 2014 7:45 AM
To: solr-user@lucene.apache.org
Subject: Re: Searching with special chars

Good luck! You'll need it.

The problem is that this is such a sticky wicket. You can move the cleanup to the PHP layer, that is, strip out the parens there.

You could write a Solr component that gets the query _very_ early and transforms it. You'd have to get there before parsing.

Either way, though, you'll be endlessly trying to second-guess the query parsing and/or the intent of the user.

I'd recommend the PHP layer if anything: it's closer to the user, and you may have a better chance of guessing right.

Best,
Erick



Re: Searching with special chars

Posted by Erick Erickson <er...@gmail.com>.
Good luck! You'll need it.

The problem is that this is such a sticky wicket. You can
move the cleanup to the PHP layer, that is,
strip out the parens there.

You could write a Solr component that gets the
query _very_ early and transforms it. You'd
have to get there before parsing.

Either way, though, you'll be endlessly trying
to second-guess the query parsing and/or the
intent of the user.

I'd recommend the PHP layer if anything: it's
closer to the user, and you may have a better
chance of guessing right.

Best,
Erick
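
For illustration, stripping the parens before the query leaves the frontend could look roughly like the sketch below. It is in Python rather than PHP, and strip_parens is a hypothetical helper name. Whether the stripped query then matches anything depends on how the city field is analyzed at index time, which the thread does not say.

```python
# Translation table that deletes "(" and ")" and leaves everything else alone.
_PARENS = str.maketrans("", "", "()")

def strip_parens(user_input: str) -> str:
    """Remove parentheses from the raw user query."""
    return user_input.translate(_PARENS)

print(strip_parens("city:Frankenthal_(Pfalz)"))  # prints: city:Frankenthal_Pfalz
```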



Re: Searching with special chars

Posted by deniz <de...@gmail.com>.
Since there was no quick workaround for this issue, we simply changed the HTTP
method from GET to POST, which also avoids further problems that user input
could trigger. This violates RESTful conventions, but at least we have
something running properly.
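
A rough sketch of what moving the query into a POST body means, in Python for illustration only. The endpoint URL and the JSON field name "q" are assumptions, not details from the actual middleware. The point is that the escaped query travels in the request body, so no backslash ever appears in the URL.

```python
import json
from urllib.request import Request

# Hypothetical middleware endpoint; the real one is not given in the thread.
MIDDLEWARE_URL = "http://middleware.example/search"

def build_search_request(query: str) -> Request:
    """Build (but do not send) a POST request carrying the query as JSON."""
    body = json.dumps({"q": query}).encode("utf-8")
    return Request(
        MIDDLEWARE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_search_request("city:Frankenthal_\\(Pfalz\\)")
print(req.get_method(), req.full_url)
```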



View this message in context: http://lucene.472066.n3.nabble.com/Searching-with-special-chars-tp4120047p4121043.html

Re: Searching with special chars

Posted by Jack Krupansky <ja...@basetechnology.com>.
Backslashes are used to escape special characters in queries, but the 
backslash must in turn be encoded in the URL as %5C.

-- Jack Krupansky
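
A small sketch of the two steps described above: backslash-escape the query-syntax characters, then percent-encode the result so the backslash goes over the wire as %5C. It is in Python for illustration; escape_term and encoded_query are hypothetical helpers, and the character list is an assumption based on the characters the standard query parser is generally documented to treat as special. Whether Glassfish accepts %5C in a URL is not confirmed in this thread.

```python
import re
from urllib.parse import quote

# Characters assumed to be query syntax for the standard query parser.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')

def escape_term(value: str) -> str:
    """Prefix each query-syntax character in value with a backslash."""
    return _SPECIAL.sub(r"\\\1", value)

def encoded_query(field: str, value: str) -> str:
    """Build field:value with the value escaped, then percent-encode it."""
    raw = f"{field}:{escape_term(value)}"
    return quote(raw, safe="")

print(encoded_query("city", "Frankenthal_(Pfalz)"))
# prints: city%3AFrankenthal_%5C%28Pfalz%5C%29
```

Only the field value is escaped; the colon between field and value stays unescaped because it is meant as query syntax.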
