Posted to solr-user@lucene.apache.org by Rahul Verma <ra...@contify.com> on 2016/07/04 13:04:41 UTC

Inconsistent parsing of pure negative queries inside brackets

Hi everyone,

While tracing a bug in one of our systems we noticed some interesting
behavior from Solr.

These two queries return different results. I fail to understand why the
second query returns empty results just by adding brackets. Can you please
help us understand this behavior?

*1. Without Brackets:*
{
  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "q": "*:*",
      "indent": "true",
      "fq": "-fl_monitoring_channel: 36 AND (title: salesforce)",
      "wt": "json",
      "_": "1467637035433"
    }
  },
  "response": {
    "numFound": 35541,
    "start": 0,
    "docs": [...

*2. With Brackets:*
{
  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "q": "*:*",
      "indent": "true",
      "fq": "(-fl_monitoring_channel: 36) AND (title: salesforce)",
      "wt": "json",
      "_": "1467637344339"
    }
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "docs": []
  }
}

Re: Inconsistent parsing of pure negative queries inside brackets

Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/4/2016 7:04 AM, Rahul Verma wrote:
> While tracing a bug in one of our systems we noticed some interesting
> behavior from Solr. These two queries return different results. I fail
> to understand why the second query returns empty results just by
> adding brackets. Can you please help us understand this behavior?

Supplementing the excellent info that Erick gave you:

I was slightly surprised that the first query even works like you
expect.  Here's why the second one DOESN'T work:

What you've got is a negative query clause:

-fl_monitoring_channel:36

At the Lucene level, a purely negative query will never work in
isolation, because you can't subtract from nothing and expect results --
you must always start with something, and THEN subtract from it.

Solr has some logic in its version of the Lucene query parser that can
detect *simple* negative query clauses and implicitly add a "*:*"
starting point before sending the query to Lucene for handling.  This
logic only works at the top level -- if the negative query is a
subordinate clause, it will not be seen by this logic, and the implicit
fix will not be added, so the query won't work.

By adding parentheses, you have turned this negative query clause into a
subordinate clause.  If you explicitly add the *:* starting point, it
will work:

(*:* -fl_monitoring_channel:36)
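
To make the "you can't subtract from nothing" point concrete, here is a
minimal sketch in Python. It is a toy model of boolean clause evaluation
over sets of document ids, not Lucene's or Solr's actual implementation.
The documents, the short field name "channel", and the helper functions
term() and boolean() are all made up for illustration.

```python
# Toy documents: id -> field values (made-up data, for illustration only).
DOCS = {
    1: {"channel": 36, "title": "salesforce"},
    2: {"channel": 12, "title": "salesforce"},
    3: {"channel": 12, "title": "oracle"},
}

# Stand-in for the match-all query *:*.
ALL = set(DOCS)


def term(field_name, value):
    """Return the ids of the documents whose field equals value."""
    return {doc_id for doc_id, doc in DOCS.items() if doc.get(field_name) == value}


def boolean(must=(), must_not=()):
    """Evaluate a simplified boolean query (MUST and MUST_NOT clauses only).

    Without a positive clause there is nothing to subtract from, so the
    result is empty, which is the behavior described above.
    """
    if not must:
        return set()
    result = set.intersection(*map(set, must))
    for excluded in must_not:
        result -= excluded
    return result


channel_36 = term("channel", 36)
salesforce = term("title", "salesforce")

# Top level: -channel:36 +title:salesforce
flat = boolean(must=[salesforce], must_not=[channel_36])

# Subordinate clause: +(-channel:36) +title:salesforce
nested = boolean(must=[boolean(must_not=[channel_36]), salesforce])

# Explicit starting point: +(*:* -channel:36) +title:salesforce
fixed = boolean(must=[boolean(must=[ALL], must_not=[channel_36]), salesforce])

print(flat)    # {2}
print(nested)  # set()
print(fixed)   # {2}
```

In this model the nested negative clause evaluates to an empty set, and
requiring an empty set empties the whole result. Giving the inner clause
an explicit match-all starting point restores the expected document.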

Thanks,
Shawn


Re: Inconsistent parsing of pure negative queries inside brackets

Posted by Erick Erickson <er...@gmail.com>.
The Lucene query parser is _not_ a boolean query
language; see Hossman's excellent explanation here:
https://lucidworks.com/blog/2011/12/28/why-not-and-or-and-not/

In this case, add &debug=query to both requests and you'll see something like:
---no parens
"-cat:electronics +name:test"

---parens
"+(-cat:electronics) +name:test"

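The first is a prohibited (MUST_NOT) clause removing all docs with
electronics in the category
and a MUST clause for name:test. The MUST clause supplies the docs to
subtract from.

The second is a mandatory (MUST) clause wrapping a nested query that
contains only the prohibited clause,
and a MUST clause for name:test. The nested query is purely negative, so
it matches nothing, and therefore the query as a whole matches nothing.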

This trips up a lot of people.
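
If you want to compare the parsed forms for the queries from this thread,
a small sketch like the following builds the request URLs with debug=query
added. The host, port, and collection name are placeholders (assumptions,
not taken from the original post), and debug_url() is a hypothetical helper.
Only the standard library is used, and nothing is sent to a server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace the host and collection with your own.
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"


def debug_url(fq):
    """Build a select URL that asks Solr to include query debug output."""
    params = {"q": "*:*", "fq": fq, "wt": "json", "debug": "query"}
    return SOLR_SELECT + "?" + urlencode(params)


no_parens = debug_url("-fl_monitoring_channel:36 AND (title:salesforce)")
parens = debug_url("(-fl_monitoring_channel:36) AND (title:salesforce)")
explicit = debug_url("(*:* -fl_monitoring_channel:36) AND (title:salesforce)")

for url in (no_parens, parens, explicit):
    print(url)
```

Fetching each URL and comparing the debug sections of the responses
should show how the parentheses change the parsed query.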

Best,
Erick

