Posted to dev@lucene.apache.org by Pawel Rog <pp...@gmail.com> on 2016/11/09 09:37:46 UTC

Query parser and default operator

Hello,
I have a query `foo AND bar OR baz`. When I use "AND" as the default operator,
this is the resulting Lucene query:

`+test:foo test:bar test:baz`

When I use "OR", this is the resulting query:

`+test:foo +test:bar test:baz`


I expected these two to produce exactly the same Lucene query because I used
the operators explicitly. I thought that the default operator is used only when
no operator is explicitly given in the query. Am I missing something,
or is this unexpected behavior (a bug)?

--
Paweł Róg

Re: Query parser and default operator

Posted by Pawel Rog <pa...@gmail.com>.
Thank you Dawid :)

--
Paweł Róg

On Thu, Nov 10, 2016 at 1:30 PM, Dawid Weiss <da...@gmail.com> wrote:

> This does look odd. I filed this issue to track it:
>
> https://issues.apache.org/jira/browse/LUCENE-7550
>
> But I can't promise you I'll have the time to look into this any time
> soon. Feel free to step down through the source and see why the
> difference is there (patches welcome!).

Re: Query parser and default operator

Posted by Dawid Weiss <da...@gmail.com>.
This does look odd. I filed this issue to track it:

https://issues.apache.org/jira/browse/LUCENE-7550

But I can't promise you I'll have the time to look into this any time
soon. Feel free to step down through the source and see why the
difference is there (patches welcome!).
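
For anyone stepping through the source, a toy model may help frame what to look for. The sketch below is not Lucene code, and every name in it is hypothetical. It assumes the classic parser builds one flat clause list from left to right and adjusts the previous clause whenever it meets an explicit conjunction. The three rules were chosen only because they reproduce the two outputs reported in this thread; the real logic may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Lucene code. All names here are hypothetical.
class DefaultOperatorModel {
    enum Op { AND, OR }

    // conj[i] is the explicit operator written between terms[i] and terms[i + 1].
    static String build(Op defaultOp, String field, String[] terms, Op[] conj) {
        List<Boolean> required = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            Op c = (i == 0) ? null : conj[i - 1];
            int prev = required.size() - 1;
            // Assumed rule 1: an explicit AND makes the previous clause required.
            if (prev >= 0 && c == Op.AND) {
                required.set(prev, true);
            }
            // Assumed rule 2: under default AND, an explicit OR makes the
            // previous clause optional again, even if an AND had required it.
            if (prev >= 0 && c == Op.OR && defaultOp == Op.AND) {
                required.set(prev, false);
            }
            // Assumed rule 3: under default OR the new clause is required only
            // when introduced by AND; under default AND it is required unless
            // introduced by OR.
            boolean req = (defaultOp == Op.OR) ? (c == Op.AND) : (c != Op.OR);
            required.add(req);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < terms.length; i++) {
            if (i > 0) sb.append(' ');
            if (required.get(i)) sb.append('+');
            sb.append(field).append(':').append(terms[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] terms = {"foo", "bar", "baz"};
        Op[] conj = {Op.AND, Op.OR};
        System.out.println(build(Op.AND, "test", terms, conj)); // +test:foo test:bar test:baz
        System.out.println(build(Op.OR, "test", terms, conj));  // +test:foo +test:bar test:baz
    }
}
```

In this model the asymmetry comes from rule 2: with default AND, the explicit OR before `baz` downgrades `bar`, which the preceding AND had just made required. Whether the classic parser does the same is something to confirm in the source.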


On Wed, Nov 9, 2016 at 11:26 PM, Pawel Rog <pa...@gmail.com> wrote:
> Hi Dawid,
> Thanks for your email. It seems StandardQueryParser is free from
> this unexpected behavior.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Query parser and default operator

Posted by Pawel Rog <pa...@gmail.com>.
Hi Dawid,
Thanks for your email. It seems StandardQueryParser is free from
this unexpected behavior.

I used the code below with Lucene 6.2.1
(org.apache.lucene.queryparser.classic.QueryParser)

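    // Imports assumed for Lucene 6.2.1:
    import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.Query;

    QueryParser parser = new QueryParser("test", new WhitespaceAnalyzer());

    // Default operator AND: prints +test:foo test:bar test:baz
    parser.setDefaultOperator(QueryParser.Operator.AND);
    Query query = parser.parse("foo AND bar OR baz ");
    System.out.println(query.toString());

    // Default operator OR: prints +test:foo +test:bar test:baz
    parser.setDefaultOperator(QueryParser.Operator.OR);
    query = parser.parse("foo AND bar OR baz ");
    System.out.println(query.toString());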


I can also reproduce it on Elasticsearch 2.2, which uses Lucene 5.4.0:

$  curl -s 'localhost:9200/test/_search?pretty' -d '{ "query": {
"query_string": { "query": "foo AND bar OR baz" , "default_operator": "and"
} } , "profile" : true}' | grep luce
          "lucene" : "+_all:foo _all:bar _all:baz",
...

$ curl -s 'localhost:9200/test/_search?pretty' -d '{ "query": {
"query_string": { "query": "foo AND bar OR baz" , "default_operator": "or"
} } , "profile" : true}' | grep luce
          "lucene" : "+_all:foo +_all:bar _all:baz",
...

Elasticsearch uses a class called MapperQueryParser, which extends
org.apache.lucene.queryparser.classic.QueryParser.

--
Paweł Róg

On Wed, Nov 9, 2016 at 6:10 PM, Dawid Weiss <da...@gmail.com> wrote:

> Which Lucene version and which query parser is this? Can you provide a
> test case/code sample?

Re: Query parser and default operator

Posted by Dawid Weiss <da...@gmail.com>.
Which Lucene version and which query parser is this? Can you provide a
test case/code sample?
I just tried with StandardQueryParser and for:

        sqp.setDefaultOperator(StandardQueryConfigHandler.Operator.AND);
        dump(sqp.parse("foo AND bar OR baz", "field_a"));
        sqp.setDefaultOperator(StandardQueryConfigHandler.Operator.OR);
        dump(sqp.parse("foo AND bar OR baz", "field_a"));

I get the same result:

BooleanQuery: +field_a:foo +field_a:bar field_a:baz

Dawid

On Wed, Nov 9, 2016 at 6:04 PM, Pawel Rog <pp...@gmail.com> wrote:
> Hi Erick,
> Thank you for your email.
> I understand that Lucene queries are not boolean logic. My point is only
> that I would expect identical Lucene queries to be built from the same input
> string. My intuition says that the default operator should not matter in the two
> examples I presented in my previous email.


Re: Query parser and default operator

Posted by Pawel Rog <pp...@gmail.com>.
Hi Erick,
Thank you for your email.
I understand that Lucene queries are not boolean logic. My point is only
that I would expect identical Lucene queries to be built from the same input
string. My intuition says that the default operator should not matter in the two
examples I presented in my previous email.

--
Paweł Róg

On Wed, Nov 9, 2016 at 4:32 PM, Erick Erickson <er...@gmail.com>
wrote:

> Lucene queries aren't boolean logic. You can simulate boolean logic by
> explicitly parenthesizing; here's an excellent blog post on this:
>
> https://lucidworks.com/blog/why-not-and-or-and-not/

Re: Query parser and default operator

Posted by Erick Erickson <er...@gmail.com>.
Lucene queries aren't boolean logic. You can simulate boolean logic by
explicitly parenthesizing; here's an excellent blog post on this:

https://lucidworks.com/blog/why-not-and-or-and-not/

Best,
Erick
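
To make the "not boolean logic" point concrete, here is a toy matcher, not Lucene code, with hypothetical class and method names. It assumes the usual reading of a flat clause list: every `+term` must be present, and bare terms are optional unless there are no required clauses at all, in which case at least one must match. It compares the two flat queries from this thread with the strict boolean reading `(foo AND bar) OR baz`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy matcher only: NOT Lucene code. All names here are hypothetical.
class OccurSemantics {
    // flatQuery uses "+term" for a required clause and "term" for an optional one.
    static boolean matches(String flatQuery, Set<String> docTerms) {
        boolean anyRequired = false;
        boolean anyOptionalHit = false;
        for (String clause : flatQuery.trim().split("\\s+")) {
            if (clause.startsWith("+")) {
                anyRequired = true;
                // A missing required term rules the document out.
                if (!docTerms.contains(clause.substring(1))) return false;
            } else if (docTerms.contains(clause)) {
                anyOptionalHit = true;
            }
        }
        // With required clauses present, optional ones do not affect matching;
        // with none, at least one optional clause has to match.
        return anyRequired || anyOptionalHit;
    }

    // Strict boolean reading of the same input: (foo AND bar) OR baz.
    static boolean booleanReading(Set<String> d) {
        return (d.contains("foo") && d.contains("bar")) || d.contains("baz");
    }

    public static void main(String[] args) {
        Set<String> onlyFoo = new HashSet<>(Arrays.asList("foo"));
        Set<String> onlyBaz = new HashSet<>(Arrays.asList("baz"));
        System.out.println(matches("+foo bar baz", onlyFoo));   // true
        System.out.println(matches("+foo +bar baz", onlyFoo));  // false
        System.out.println(matches("+foo +bar baz", onlyBaz));  // false
        System.out.println(booleanReading(onlyBaz));            // true
    }
}
```

Under these assumptions, a document containing only `baz` satisfies the boolean reading but neither flat query, which is why explicit parentheses are the safer way to express grouping.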
