Posted to solr-user@lucene.apache.org by Christian Spitzlay <ch...@biologis.com> on 2018/05/24 15:23:51 UTC

Escaping in streaming expression

Hello,

I’m experimenting with streaming expressions and I wonder how to escape a double quote in a value.
I am on 7.3.0 and trying with the text area on http://localhost:8983/solr/#/collname/stream

The following expression works for me and returns results:
search(kmm, q="sds_endpoint_name:F2", fl="sds_endpoint_name", sort="sds_endpoint_name ASC", qt="/export")
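
For anyone reproducing this outside the admin UI, here is a minimal sketch of how the same expression could be passed to the collection's /stream handler through the expr request parameter. The base URL and the collection name kmm are assumptions taken from the examples in this message, and stream_url is a hypothetical helper; the sketch only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

# Assumed base URL and collection name; adjust for your installation.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "kmm"


def stream_url(expr: str) -> str:
    """Return a GET URL that passes `expr` to the collection's /stream handler."""
    # urlencode percent-encodes quotes, parentheses and other reserved characters.
    return f"{SOLR_BASE}/{COLLECTION}/stream?{urlencode({'expr': expr})}"


expr = (
    'search(kmm, q="sds_endpoint_name:F2", fl="sds_endpoint_name", '
    'sort="sds_endpoint_name ASC", qt="/export")'
)
url = stream_url(expr)
print(url)
```

URL encoding only protects the expression in transit. It does not change how the expression itself is parsed on the server, so it does not help with the quoting problem described next.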

When I try to add a double quote to the value, escaped with a backslash, like this:
search(kmm, q="sds_endpoint_name:F\"2", fl="sds_endpoint_name", sort="sds_endpoint_name ASC", qt="/export")
I get an exception with message:

org.apache.solr.search.SyntaxError:  Cannot parse 'sds_endpoint_name:F\"2': Lexical error at line 1, column 22.  Encountered:  after : \"\\\"2\"",

I tried several more levels of backslash escaping, but none has worked so far
(only the error message changed, as the expression broke in different ways).


On http://localhost:8983/solr/#/collname/query, entering
sds_endpoint_name:F\"2
as the q parameter does not throw a syntax error and an empty result is returned
(which is to be expected as there is no document with a quote in the name at the moment).


Is there a correct way to escape the double quote in a streaming expression?


Best regards
Christian


Re: Escaping in streaming expression

Posted by Joel Bernstein <jo...@gmail.com>.
I did a little experimentation:

This query is sent down to Solr and performs a proper wildcard search:

search(collection2, q="hel*", fl="id", sort="id asc")

This query properly escapes the wildcard:

search(collection2, q="hel\*", fl="id", sort="id asc")

So it appears that the main issue is with double quotes. This is because
double quotes break the syntax of streaming expressions, so they needed
special handling. That special handling breaks the ability to search for
a double quote as a literal character.

SOLR-10894 only appears to mention double quotes as well.

Joel Bernstein
http://joelsolr.blogspot.com/


Re: Escaping in streaming expression

Posted by Christian Spitzlay <ch...@biologis.com>.
Thanks for your help.
Yes, I think SOLR-10894 is exactly about the issue I have seen.

So if I understand correctly, there is currently no way to write a method
in client code (like Drupal’s search_api_solr module) that takes arbitrary user input
and escapes it so that it *always* yields a valid expression for a search on literal string values.

The streaming expression builder in that module uses the normal escaping method from
the Solarium library.  I assume that method works correctly for non-streaming queries.
But given this unresolved issue I guess the Solarium library and the Drupal module 
will inherit the problem.

Is this only about double quotes or are there other meta characters that will work 
with backslash-escaping in non-streaming queries but will not parse as part of 
streaming expressions?
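
To make the limitation concrete, here is a rough sketch of what such a client-side method could look like today. It builds a phrase query the way Joel's working example does and refuses input it cannot express. phrase_search_expr is a hypothetical helper, not part of Solarium or search_api_solr. Rejecting backslashes is a conservative assumption on my side, since the thread has not established how they behave inside streaming expressions.

```python
def phrase_search_expr(collection: str, field: str, phrase: str,
                       fl: str = "id", sort: str = "id asc") -> str:
    """Build a search() expression that matches `phrase` as a phrase query.

    Hypothetical helper: it emits the \\"...\\" form that was reported to work
    and raises for input that has no known working escape.
    """
    if '"' in phrase:
        # No working escape for a literal double quote is known (see SOLR-10894).
        raise ValueError("literal double quotes are not supported")
    if "\\" in phrase:
        # Assumption: backslash handling is unverified, so reject it as well.
        raise ValueError("backslashes are not supported")
    return f'search({collection}, q="{field}:\\"{phrase}\\"", fl="{fl}", sort="{sort}")'


print(phrase_search_expr("collection2", "test_s", "hello world", sort="id desc"))
```

This only covers phrase queries on a single field. It is meant to show where the guard would sit, not to be a complete escaping routine.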


Christian Spitzlay





Re: Escaping in streaming expression

Posted by Joel Bernstein <jo...@gmail.com>.
I just confirmed that the following query works as expected:

search(collection2, q="test_s:\"hello world\"", fl="id", sort="id desc")

In this case the double quotes are used to specify a phrase query.

But this fails:

search(collection2, q="test_s:\"hello world", fl="id", sort="id desc")

In this case the double quote is used as part of the term, which is what I
believe you had in mind.

I believe SOLR-10894 was created to address this issue, but as of yet no
patch is available.

Joel Bernstein
http://joelsolr.blogspot.com/


Re: Escaping in streaming expression

Posted by Joel Bernstein <jo...@gmail.com>.
Also, looking at your query, it looks like you are getting an error from
the Solr query parser. I believe this is the issue you are facing:

https://issues.apache.org/jira/browse/SOLR-10894

I'll confirm, but I believe this query should work:

search(collection1, q="test \"hello world\""...)

In the query above, the double quotes are escaped and sent to Solr unescaped
to form the query: test "hello world". The query parser has no problem
parsing this.

But you're using a double quote not as part of query syntax, but as part of
the query term. This is where I believe SOLR-10894 comes into play.
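
The two layers described above can be illustrated with a toy model. This is not Solr's actual code: unescape_expression_value and quotes_balanced are hypothetical stand-ins. The first assumes the expression layer simply turns \" into " before handing q to the query parser. The second only checks whether the double quotes the query parser would see come in pairs.

```python
def unescape_expression_value(raw: str) -> str:
    """Toy stand-in for the expression layer: turn \\" into " in a q value."""
    return raw.replace('\\"', '"')


def quotes_balanced(query: str) -> bool:
    """Return True if the double quotes not preceded by a backslash come in pairs."""
    count = 0
    escaped = False
    for ch in query:
        if escaped:
            escaped = False      # this character was escaped by a backslash
        elif ch == "\\":
            escaped = True       # the next character is escaped
        elif ch == '"':
            count += 1           # an unescaped quote opens or closes a phrase
    return count % 2 == 0


for raw in (r'test \"hello world\"', r'sds_endpoint_name:F\"2'):
    q = unescape_expression_value(raw)
    print(f"{raw!r} -> {q!r} balanced={quotes_balanced(q)}")
```

In this model the phrase query arrives with a matched pair of quotes, while the term with an embedded quote arrives with a single unescaped quote, which is consistent with the lexical error reported. The same term with its backslash intact counts as balanced, which matches the observation that the regular query screen accepts it.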




Joel Bernstein
http://joelsolr.blogspot.com/


Re: Escaping in streaming expression

Posted by Joel Bernstein <jo...@gmail.com>.
This ticket originally addressed the issue:

https://issues.apache.org/jira/browse/SOLR-8409

It's a confusing ticket, though, and I'm not seeing test cases that prove
that this is still working. I'll write a quick test case to see how escaping
of quotes is being handled.

This is a followup issue which has not yet been resolved:

https://issues.apache.org/jira/browse/SOLR-10894


Joel Bernstein
http://joelsolr.blogspot.com/
