Posted to dev@lucene.apache.org by YouPeng Yang <yy...@gmail.com> on 2023/09/19 13:27:10 UTC

Can the BooleanQuery execution be optimized with same term queries

Hi All

 Sorry to bother you. Studying the Lucene source code is one of my greatest
pleasures; thank you all for the great work.


  I have a question about the execution of BooleanQuery. Although
BooleanQuery#rewrite already does some work to remove duplicate FILTER and
SHOULD clauses, the same term query can still be executed several times.

  I copied test code from TestBooleanQuery to confirm my assumption.

  The unit test code is as follows:



// Fragment of a test method; `searcher` and assertSameScoresWithoutFilters
// come from the surrounding test class.
BooleanQuery.Builder qBuilder = new BooleanQuery.Builder();
qBuilder.add(new TermQuery(new Term("field", "b")), Occur.FILTER);
qBuilder.add(new TermQuery(new Term("field", "a")), Occur.SHOULD);
qBuilder.add(new TermQuery(new Term("field", "d")), Occur.SHOULD);

// Nested query that repeats the same three term queries.
BooleanQuery.Builder nestQuery = new BooleanQuery.Builder();
nestQuery.add(new TermQuery(new Term("field", "b")), Occur.FILTER);
nestQuery.add(new TermQuery(new Term("field", "a")), Occur.SHOULD);
nestQuery.add(new TermQuery(new Term("field", "d")), Occur.SHOULD);

// Add the nested query as a fourth clause of the top-level query.
qBuilder.add(nestQuery.build(), Occur.SHOULD);
qBuilder.setMinimumNumberShouldMatch(1);

BooleanQuery q = qBuilder.build();
assertSameScoresWithoutFilters(searcher, q);


In this test, the top-level boolean query (qBuilder) contains 4 clauses: 3
simple term queries and 1 nested boolean query that contains the same 3 term
queries.

In the underlying execution, all 6 term queries are executed (see
TermQuery.TermWeight#getTermsEnum()).
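
The count above can be reproduced with a small standalone sketch. It does not use Lucene: QueryTreeSketch, Node and their methods are hypothetical stand-ins that model only the shape of the query tree (the Occur values are ignored), assuming each leaf term query triggers its own terms lookup as described.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of a query tree; none of these types are Lucene classes. */
final class QueryTreeSketch {

  /** Either a leaf term query (term != null) or a boolean node with child clauses. */
  static final class Node {
    final String term;         // "field:text" for leaves, null for boolean nodes
    final List<Node> children; // empty for leaves

    private Node(String term, List<Node> children) {
      this.term = term;
      this.children = children;
    }

    static Node leaf(String field, String text) {
      return new Node(field + ":" + text, Collections.<Node>emptyList());
    }

    static Node bool(Node... clauses) {
      return new Node(null, Arrays.asList(clauses));
    }
  }

  /** Returns the term of every leaf in the tree, duplicates included. */
  static List<String> leafTerms(Node root) {
    List<String> out = new ArrayList<>();
    collect(root, out);
    return out;
  }

  private static void collect(Node node, List<String> out) {
    if (node.term != null) {
      out.add(node.term);
      return;
    }
    for (Node child : node.children) {
      collect(child, out);
    }
  }

  /** Same shape as the test: three term clauses plus a nested copy of them. */
  static Node exampleFromThread() {
    Node nested = Node.bool(
        Node.leaf("field", "b"), Node.leaf("field", "a"), Node.leaf("field", "d"));
    return Node.bool(
        Node.leaf("field", "b"), Node.leaf("field", "a"), Node.leaf("field", "d"), nested);
  }

  public static void main(String[] args) {
    List<String> leaves = leafTerms(exampleFromThread());
    Set<String> distinct = new HashSet<>(leaves);
    System.out.println("leaf term queries: " + leaves.size());
    System.out.println("distinct terms: " + distinct.size());
  }
}
```

Running main reports 6 leaf term queries but only 3 distinct terms, which is the gap the question is about.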

In theory, these executions could be merged to reduce the execution time,
right?


So, is it possible or necessary for Lucene to merge these executions to
optimize query performance? I understand that the optimization may be
difficult.
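
One way to picture merging the executions is to memoize the per-term lookup so that a repeated term reuses the first result. The sketch below is only an illustration under that assumption: SharedTermLookupSketch and MemoizingLookup are hypothetical names, and the lambda standing in for the expensive lookup is a placeholder, not a Lucene call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of sharing one lookup per distinct term. */
final class SharedTermLookupSketch {

  /** Memoizes a per-term lookup so that repeated terms trigger it only once. */
  static final class MemoizingLookup<V> {
    private final Function<String, V> expensiveLookup;
    private final Map<String, V> cache = new HashMap<>();
    private int performed = 0; // how many times the expensive lookup really ran

    MemoizingLookup(Function<String, V> expensiveLookup) {
      this.expensiveLookup = expensiveLookup;
    }

    V get(String term) {
      // containsKey keeps the cache correct even if a lookup returns null.
      if (!cache.containsKey(term)) {
        performed++;
        cache.put(term, expensiveLookup.apply(term));
      }
      return cache.get(term);
    }

    int performed() {
      return performed;
    }
  }

  public static void main(String[] args) {
    // Placeholder for the real per-term work (for example, a term dictionary seek).
    MemoizingLookup<Integer> lookup = new MemoizingLookup<>(term -> term.length());

    // The six leaf terms of the test query, duplicates included.
    String[] leafTerms = {
      "field:b", "field:a", "field:d", "field:b", "field:a", "field:d"
    };
    for (String term : leafTerms) {
      lookup.get(term);
    }
    System.out.println("lookups requested: " + leafTerms.length);
    System.out.println("lookups performed: " + lookup.performed());
  }
}
```

With the six leaf terms from the test, the placeholder lookup runs three times instead of six. A real implementation would have more to handle, since postings iterators are stateful and a FILTER clause does not contribute to the score while a SHOULD clause does, which is part of why the optimization may be difficult.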

Re: Can the BooleanQuery execution be optimized with same term queries

Posted by Adrien Grand <jp...@gmail.com>.
Thanks for letting me know, I'm glad you like them!


On Fri, Sep 22, 2023 at 16:36, YouPeng Yang <yy...@gmail.com> wrote:

> Hi Adrien,
>    Glad to have your opinion. I am reading your excellent articles on the
> Elastic blog.
>
> Best regards

Re: Can the BooleanQuery execution be optimized with same term queries

Posted by YouPeng Yang <yy...@gmail.com>.
Hi Adrien,
   Glad to have your opinion. I am reading your excellent articles on the
Elastic blog.

Best regards


On Tue, Sep 19, 2023 at 21:32, Adrien Grand <jp...@gmail.com> wrote:

> Hi Yang,
>
> It would be legal for Lucene to perform such optimizations indeed.

Re: Can the BooleanQuery execution be optimized with same term queries

Posted by Adrien Grand <jp...@gmail.com>.
Hi Yang,

It would be legal for Lucene to perform such optimizations indeed.

-- 
Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org