Posted to java-user@lucene.apache.org by Arun Kumar K <ar...@gmail.com> on 2013/06/24 12:54:57 UTC

DocIDBitSets & Grouping

Hi Guys,

I am using Lucene 4.2.

1> For my use case I run a search, say name:xyz*, and then need to do
grouping (using the same query, name:xyz*, plus a Filter and a group Sort),
possibly in the same thread or a different one.

From my understanding the second internal search will be faster, but I have
a good number of threads doing the same with different queries, which may
affect the IO cache.

Still, I don't want to run the same search internally again for grouping.

Reusing the previous search results by keeping a bit set and wrapping it
with BitsFilteredDocIdSet may cover the filtering part, but is there any
way to wrap these result DocIdSets as input for grouping?
Or is there any other smart way?
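
To make the idea in (1) concrete, here is a minimal sketch of replaying doc IDs kept from a first search into a grouping pass without searching again. It deliberately uses only java.util.BitSet and a plain array as stand-ins; the class name, method name, and the groupKeyByDoc array are hypothetical and are not Lucene APIs. Lucene 4.x also ships org.apache.lucene.search.CachingCollector, which records hits during one search and can replay them into a second collector; whether it fits this use case is not settled in this thread.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch: plain Java collections, not Lucene classes.
class GroupFromCachedHitsSketch {

    // cachedHits: doc IDs recorded by the first search.
    // filter: doc IDs accepted by the extra filter.
    // groupKeyByDoc[doc]: hypothetical group value of document `doc`.
    static Map<String, Integer> countPerGroup(BitSet cachedHits, BitSet filter,
                                              String[] groupKeyByDoc) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Walk only the cached hits, in ascending doc ID order.
        for (int doc = cachedHits.nextSetBit(0); doc >= 0;
                 doc = cachedHits.nextSetBit(doc + 1)) {
            if (!filter.get(doc)) {
                continue; // rejected by the filter
            }
            counts.merge(groupKeyByDoc[doc], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] groupKeyByDoc = {"a", "b", "a", "c", "b"};
        BitSet cachedHits = new BitSet();
        for (int doc : new int[] {0, 1, 2, 4}) cachedHits.set(doc);
        BitSet filter = new BitSet();
        for (int doc : new int[] {0, 2, 3, 4}) filter.set(doc);
        System.out.println(countPerGroup(cachedHits, filter, groupKeyByDoc)); // {a=2, b=1}
    }
}
```

The sketch only counts documents per group; a real grouping pass would also need scores or sort values, which a bare bit set does not keep.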


2> For an AND query I have tried
a) BooleanQuery, Query
b) FieldCacheTermsFilter
c) DocIdBitSet + BitsFilteredDocIdSet
with a 1 GB index, 4 lakh (400,000) documents matching the first query and
2 lakh (200,000) documents matching the second query, but
retrieving/collecting only 10000 documents.
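
For readers unfamiliar with approach (c), the following is a minimal sketch of what a bit-set AND with a capped collect amounts to. It assumes java.util.BitSet as a stand-in for the per-query doc ID sets; the class and method names are hypothetical, and this is not the code that was benchmarked here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Stand-in sketch: java.util.BitSet plays the role of a per-query doc ID set.
class BitSetAndSketch {

    // Intersects two doc ID sets and returns at most `limit` matching IDs,
    // in ascending order.
    static List<Integer> andCollect(BitSet first, BitSet second, int limit) {
        BitSet both = (BitSet) first.clone(); // leave the cached inputs unmodified
        both.and(second);
        List<Integer> hits = new ArrayList<>();
        for (int doc = both.nextSetBit(0); doc >= 0 && hits.size() < limit;
                 doc = both.nextSetBit(doc + 1)) {
            hits.add(doc);
        }
        return hits;
    }

    public static void main(String[] args) {
        BitSet first = new BitSet();
        BitSet second = new BitSet();
        for (int doc = 0; doc < 100; doc += 2) first.set(doc);  // even doc IDs
        for (int doc = 0; doc < 100; doc += 3) second.set(doc); // multiples of 3
        System.out.println(andCollect(first, second, 5)); // [0, 6, 12, 18, 24]
    }
}
```

The intersection cost here depends on the size of the bit sets, not on how many documents are finally collected, which is one reason timings for (c) can differ from (a) and (b).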

With prior warming I find that (a) and (b) take almost the same time. I
understand that a Filter only pays off when it is reused.
(c) takes around 30-40 ms less time.

Can we conclude from this that method (c) is better?
Is my choice of bit set implementation appropriate?

Did I get something wrong, and are there any smart ways to do these?

Thanks,
Arun

Re: DocIDBitSets & Grouping

Posted by Arun Kumar K <ar...@gmail.com>.
Thanks Uwe!
For part (1) of my query, are there any smart ways?

Arun


On Mon, Jun 24, 2013 at 4:29 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Hi,
>
>
> > With prior warming I find that (a) and (b) take almost the same time. I
> > understand that a Filter only pays off when it is reused.
> > (c) takes around 30-40 ms less time.
> >
> > Can we conclude from this that method (c) is better?
> > Is my choice of bit set implementation appropriate?
>
> Use FixedBitSet from the org.apache.lucene.util package to implement your
> filter. This might bring further improvements.
>
> > Did I get something wrong, and are there any smart ways to do these?
> >
> > Thanks,
> > Arun
>
> Uwe
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

RE: DocIDBitSets & Grouping

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,


> With prior warming I find that (a) and (b) take almost the same time. I
> understand that a Filter only pays off when it is reused.
> (c) takes around 30-40 ms less time.
>
> Can we conclude from this that method (c) is better?
> Is my choice of bit set implementation appropriate?

Use FixedBitSet from the org.apache.lucene.util package to implement your filter. This might bring further improvements.

> Did I get something wrong, and are there any smart ways to do these?
> 
> Thanks,
> Arun

Uwe

