Posted to solr-user@lucene.apache.org by Doug Turnbull <dt...@opensourceconnections.com> on 2014/12/15 17:05:43 UTC

Escaping of parametrized queries {v=$q}

Hello all,

I've been working on a search project whose text contains a lot of special
characters (it's programming-language content). We use edismax as the base
query parser, but use other query parsers in boost queries (such as the
field query parser or a custom one).

For example, we might have

q=c\+\+&
bq={!field f=someField v=$q}&
defType=edismax

Notice how I escape each "+" in C++ so that edismax will not interpret it
as Lucene query syntax.
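
A minimal sketch of that client-side escaping, in Python. The function name
escape_for_edismax and the exact character set are assumptions made for
illustration (the set follows the characters the Lucene classic query syntax
treats as special); a Java client could use SolrJ's
ClientUtils.escapeQueryChars instead.

```python
# Characters treated as special by Lucene-style query syntax (assumed set).
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_for_edismax(text: str) -> str:
    """Backslash-escape every special character so it is matched literally."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


print(escape_for_edismax("c++"))  # prints c\+\+
```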

My problem is that other query parsers, which don't need Lucene syntax
escaped, see the query text with the escaping backslashes still in it. So
the field query parser attempts to search for "c\+\+". This makes sense, as
I'm setting v to the value of $q.

I'm not quite sure how to solve this problem. Ideally I could see:

(a) A way to get the edismax (or any other) query parser to communicate
that q is actually an escaped piece of text?
(b) A way to trick the receiving query parser into un-escaping it?

Or am I better off not using parameterized queries, and instead putting the
burden on the client to send:

bq={!field f=someField}c++
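
A rough sketch of that client-side alternative, in Python. The helper name
build_params is hypothetical, and the escaping is deliberately simplified to
a handful of characters; the point is only that the client sends the escaped
text in q and the raw text in bq.

```python
from urllib.parse import urlencode


def build_params(user_text: str) -> dict:
    """Send escaped text to edismax and the raw text to the field parser."""
    # Simplified escaping for illustration: only a few Lucene specials.
    escaped = "".join("\\" + ch if ch in '+-!():"\\' else ch for ch in user_text)
    return {
        "defType": "edismax",
        "q": escaped,                               # c\+\+
        "bq": "{!field f=someField}" + user_text,   # raw text, no backslashes
    }


params = build_params("c++")
query_string = urlencode(params)  # percent-encodes the values for the URL
```

The cost of this approach is that every client has to know which parsers
want escaped text and which want it raw.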

Any ideas on clean ways to solve this problem?

Thanks,
-- 
Doug Turnbull
Search & Big Data Architect
OpenSource Connections <http://o19s.com>

Re: Escaping of parametrized queries {v=$q}

Posted by Doug Turnbull <dt...@opensourceconnections.com>.
Good idea. That's cleaner than my current solution (hacking up a query
parser). I'll try it out and report back.

-Doug

On Mon, Dec 15, 2014 at 12:04 PM, Alexandre Rafalovitch <ar...@gmail.com>
wrote:
>
> How about a pre-pended search component to un-escape Q and put it into
> QClean or some such. Then, your other parsers could use the other
> variable name as needed?
>
> Would that work?
>
> Regards,
>    Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>

Re: Escaping of parametrized queries {v=$q}

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
How about a prepended search component that un-escapes q and puts it into
qClean or some such? Then your other parsers could use the other
variable name as needed.

Would that work?
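
The un-escaping step itself might look like the sketch below. It is written
in Python only to show the transformation; an actual Solr search component
would be written in Java. The function name unescape_query and the qClean
parameter name (following the "QClean or some such" suggestion above) are
assumptions, not existing Solr features.

```python
def unescape_query(escaped: str) -> str:
    """Drop each escaping backslash, keeping the character it protected."""
    out = []
    chars = iter(escaped)
    for ch in chars:
        if ch == "\\":
            # Keep the next character literally; a trailing lone backslash
            # is simply dropped.
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)


params = {
    "defType": "edismax",
    "q": "c\\+\\+",                           # escaped for edismax
    "bq": "{!field f=someField v=$qClean}",   # points at the clean copy
}
params["qClean"] = unescape_query(params["q"])  # "c++"
```

With the clean copy in its own parameter, parsers that need Lucene escaping
keep reading $q and the others read $qClean.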

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

