Posted to java-user@lucene.apache.org by Andres de la Peña <ad...@stratio.com> on 2016/07/22 13:32:47 UTC

Filter strategy in Lucene 6.0

Hi all,

Suppose that we have a boolean query composed of two filtering queries,
where one of them is fast and the other is slow:

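// FILTER clauses must match but do not contribute to the score.
BooleanQuery.Builder builder = new BooleanQuery.Builder();
builder.add(fastQuery, BooleanClause.Occur.FILTER);
builder.add(slowQuery, BooleanClause.Occur.FILTER);
Query query = builder.build();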


How is the intersection between the two sub-queries calculated? Is the
order in which they are added to the boolean query relevant? Is there
anything we can do to take advantage of our knowledge of the expected
performance of the sub-queries?

Prior to Lucene 6.0 there was a FilteredQuery.FilterStrategy providing some
control over this. Is there something analogous in Lucene 6.0?

Thanks in advance,

-- 
Andrés de la Peña

Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*

Re: Filter strategy in Lucene 6.0

Posted by Adrien Grand <jp...@gmail.com>.
You can read about the inception of the feature at
https://issues.apache.org/jira/browse/LUCENE-6198. Since two-phase
iteration is mostly useful for conjunctions, you could also look at
ConjunctionDISI, the class that takes care of intersecting multiple
iterators. I am afraid there is not much documentation besides that.
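
As a rough illustration of what intersecting multiple iterators involves,
here is a minimal, self-contained sketch of a "leapfrog" intersection over
sorted doc ids. It is not Lucene code: LeapfrogSketch, SortedDocs and
intersect are hypothetical names, and the array-backed iterator is only a
simplified stand-in for a real doc-id iterator. The one idea taken from the
discussion is that the cheapest iterator (lowest cost) leads the
intersection, whatever order the clauses were added in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: a simplified model of intersecting sorted doc-id iterators. */
final class LeapfrogSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical array-backed iterator over sorted doc ids. */
    static final class SortedDocs {
        private final int[] docs;
        private int pos = -1;

        SortedDocs(int... docs) { this.docs = docs; }

        /** Estimated number of matches; used to choose the lead iterator. */
        long cost() { return docs.length; }

        /** Current doc id, -1 before the first advance, NO_MORE_DOCS when exhausted. */
        int docID() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        /** Moves past the current doc to the first doc id >= target and returns it. */
        int advance(int target) {
            do { pos++; } while (pos < docs.length && docs[pos] < target);
            return docID();
        }
    }

    /** Returns the doc ids present in every iterator. */
    static List<Integer> intersect(List<SortedDocs> iterators) {
        // The cheapest iterator leads, regardless of the order the caller used.
        List<SortedDocs> sorted = new ArrayList<>(iterators);
        sorted.sort(Comparator.comparingLong(SortedDocs::cost));
        SortedDocs lead = sorted.get(0);

        List<Integer> hits = new ArrayList<>();
        int doc = lead.advance(0);
        while (doc != NO_MORE_DOCS) {
            boolean agreed = true;
            for (SortedDocs other : sorted.subList(1, sorted.size())) {
                if (other.docID() < doc) {
                    int next = other.advance(doc);
                    if (next > doc) {
                        // This iterator skipped past doc: let the lead catch up and retry.
                        doc = lead.advance(next);
                        agreed = false;
                        break;
                    }
                }
            }
            if (agreed) {
                hits.add(doc);
                doc = lead.advance(doc + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> hits = intersect(Arrays.asList(
                new SortedDocs(1, 3, 5, 7, 9, 11),
                new SortedDocs(3, 4, 9),
                new SortedDocs(2, 3, 9, 10, 11)));
        System.out.println(hits); // prints [3, 9]
    }
}
```

In this sketch the three-element iterator leads, so the larger iterators
are only ever advanced to candidates it proposes.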

On Wed, 3 Aug 2016 at 09:42, Parit Bansal <Pa...@sib.swiss> wrote:

> Hi,
>
> Could you point to some resource where I can read about two-phase
> iterators in slightly more depth? I am still confused about how
> exactly it works.
>
> - Best
> Parit

Re: Filter strategy in Lucene 6.0

Posted by Parit Bansal <Pa...@sib.swiss>.
Hi,

Could you point to some resource where I can read about two-phase
iterators in slightly more depth? I am still confused about how
exactly it works.

- Best
Parit

On 08/02/2016 07:07 PM, Andres de la Peña wrote:
> Thanks Adrien, this is very helpful.
>
> I have just read your blog post about this
> <https://www.elastic.co/blog/better-query-execution-coming-elasticsearch-2-0>.
> Two-phase iteration is really cool!


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Filter strategy in Lucene 6.0

Posted by Andres de la Peña <ad...@stratio.com>.
Thanks Adrien, this is very helpful.

I have just read your blog post about this
<https://www.elastic.co/blog/better-query-execution-coming-elasticsearch-2-0>.
Two-phase iteration is really cool!

2016-07-31 20:17 GMT+01:00 Adrien Grand <jp...@gmail.com>:

> Lucene 5.0 introduced two-phase iteration
> (see Scorer.twoPhaseIterator()) as a way to tackle slow queries. In short,
> queries can be split into a fast approximation and a slower confirmation
> and Lucene makes sure to reach agreement between the different
> approximations of a BooleanQuery before checking the confirmations.
>
> Both approximations and confirmations have an API that allows them to
> expose a "cost", and the least costly components are always evaluated before
> the most costly ones. So the filter strategy is not useful anymore: every
> decision is made based on these cost APIs.

Re: Filter strategy in Lucene 6.0

Posted by Adrien Grand <jp...@gmail.com>.
Lucene 5.0 introduced two-phase iteration
(see Scorer.twoPhaseIterator()) as a way to tackle slow queries. In short,
queries can be split into a fast approximation and a slower confirmation
and Lucene makes sure to reach agreement between the different
approximations of a BooleanQuery before checking the confirmations.

Both approximations and confirmations have an API that allows them to
expose a "cost", and the least costly components are always evaluated before
the most costly ones. So the filter strategy is not useful anymore: every
decision is made based on these cost APIs.
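
To make the approximation/confirmation split concrete, here is a minimal,
self-contained sketch of the idea described above. It only models the
concept and is not Lucene's implementation: TwoPhaseSketch, Clause, search
and the cost values are all hypothetical, chosen for illustration. Each
clause has a cheap sorted approximation and a per-document confirmation
with an assumed matchCost. The approximations are intersected first, and
confirmations run only on agreed candidates, cheapest first, stopping at
the first rejection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

/** Illustrative model of two-phase iteration; not Lucene code. */
final class TwoPhaseSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** One clause: a cheap sorted approximation plus a costlier confirmation. */
    static final class Clause {
        final int[] approximation;  // sorted candidate doc ids, may hold false positives
        final IntPredicate matches; // confirmation: is this candidate a real match?
        final float matchCost;      // assumed relative cost of one confirmation
        int pos = -1;
        int confirmations = 0;      // how many times the confirmation was invoked

        Clause(int[] approximation, IntPredicate matches, float matchCost) {
            this.approximation = approximation;
            this.matches = matches;
            this.matchCost = matchCost;
        }

        long cost() { return approximation.length; }

        int docID() {
            if (pos < 0) return -1;
            return pos < approximation.length ? approximation[pos] : NO_MORE_DOCS;
        }

        /** Moves past the current doc to the first candidate >= target. */
        int advance(int target) {
            do { pos++; } while (pos < approximation.length && approximation[pos] < target);
            return docID();
        }

        boolean confirm(int doc) { confirmations++; return matches.test(doc); }
    }

    static List<Integer> search(List<Clause> clauses) {
        // Phase 1 order: the cheapest approximation leads.
        List<Clause> byCost = new ArrayList<>(clauses);
        byCost.sort(Comparator.comparingLong(Clause::cost));
        // Phase 2 order: the cheapest confirmation runs first.
        List<Clause> byMatchCost = new ArrayList<>(clauses);
        byMatchCost.sort((a, b) -> Float.compare(a.matchCost, b.matchCost));

        Clause lead = byCost.get(0);
        List<Integer> hits = new ArrayList<>();
        int doc = lead.advance(0);
        while (doc != NO_MORE_DOCS) {
            // Phase 1: make every approximation agree on the candidate.
            int candidate = doc;
            for (Clause other : byCost.subList(1, byCost.size())) {
                if (other.docID() < candidate) {
                    int next = other.advance(candidate);
                    if (next > candidate) { candidate = next; break; }
                }
            }
            if (candidate != doc) { doc = lead.advance(candidate); continue; }
            // Phase 2: confirm the agreed candidate, stopping at the first failure.
            boolean confirmed = true;
            for (Clause c : byMatchCost) {
                if (!c.confirm(doc)) { confirmed = false; break; }
            }
            if (confirmed) hits.add(doc);
            doc = lead.advance(doc + 1);
        }
        return hits;
    }

    public static void main(String[] args) {
        Clause fast = new Clause(new int[] {2, 4, 6, 8, 10, 12}, d -> d != 6, 1f);
        Clause slow = new Clause(new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12},
                d -> d % 4 == 0, 100f);
        // The slow clause is listed first on purpose: insertion order does not matter here.
        System.out.println(search(Arrays.asList(slow, fast))); // prints [4, 8, 12]
        System.out.println(slow.confirmations);                // prints 5
    }
}
```

In this example the slow confirmation runs 5 times rather than once per
each of its 12 candidates: only the 6 doc ids on which both approximations
agree are checked, and doc 6 is rejected by the cheaper confirmation first.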

