Posted to java-user@lucene.apache.org by Michael Stoppelman <st...@gmail.com> on 2008/04/16 20:04:53 UTC

QueryWrapperFilter question...

Hi all,
I've been doing some performance testing and found that using
QueryWrapperFilter for a location field restriction I need brings my
search times down to 5-10ms. This was surprising.
Before, search times were between 50ms and 100ms.

The queries from before the optimization look like the following:
+(+(text:cats) +(loc:1 loc:2 loc:3 ...))

The QueryWrapperFilter runs a search itself. Why would performance be so
drastically different when the QueryWrapperFilter also needs to do a search?
Does Lucene just not have the statistics to optimize this query so that it
can decide which terms to filter by first?

M

Re: QueryWrapperFilter question...

Posted by Paul Elschot <pa...@xs4all.nl>.
On Thursday 17 April 2008 06:37:18, Michael Stoppelman wrote:
> Actually, I screwed up the timing info. I wasn't including the time
> for the QueryWrapperFilter#bits(IndexReader) call. Sadly, with that
> call included, the filtered search actually takes longer than the
> original query that had both terms included. Bummer. I had really
> convinced myself until the thought came to me at lunch :).

For a single query, adding a filter of course has a cost.
But when the location part can be reused in later queries,
give CachingWrapperFilter a try.

Regards,
Paul Elschot
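
To illustrate the reuse argument above, here is a minimal sketch in plain
Java. It is not Lucene code and not the CachingWrapperFilter implementation:
the class CachedBitsSketch, the String index key and the docLocations array
are hypothetical stand-ins, and the only assumption taken from Lucene is that
the filter bits are computed once per index reader and then reused. The first
call still pays the full cost of building the bits; only later calls against
the same index get them for free.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy model of caching filter bits per index; not the Lucene implementation.
class CachedBitsSketch {
    private final int[] wantedLocations;                 // models loc:1 loc:2 loc:3 ...
    private final Map<Object, BitSet> cache = new HashMap<>();
    private int builds = 0;                              // how often the expensive step ran

    CachedBitsSketch(int... wantedLocations) {
        this.wantedLocations = wantedLocations;
    }

    // Stand-in for the expensive step: scanning the index to produce
    // one bit per document that matches the location restriction.
    private BitSet computeBits(int[] docLocations) {
        builds++;
        BitSet bits = new BitSet(docLocations.length);
        for (int doc = 0; doc < docLocations.length; doc++) {
            for (int wanted : wantedLocations) {
                if (docLocations[doc] == wanted) {
                    bits.set(doc);
                    break;
                }
            }
        }
        return bits;
    }

    // Returns the cached bits for this index if present; otherwise builds them once.
    BitSet bits(Object indexKey, int[] docLocations) {
        BitSet cached = cache.get(indexKey);
        if (cached == null) {
            cached = computeBits(docLocations);
            cache.put(indexKey, cached);
        }
        return cached;
    }

    int buildCount() {
        return builds;
    }

    public static void main(String[] args) {
        int[] docLocations = {1, 5, 2, 9, 3, 7};         // docLocations[doc] = location of doc
        CachedBitsSketch filter = new CachedBitsSketch(1, 2, 3);
        filter.bits("index-A", docLocations);            // builds the bits
        filter.bits("index-A", docLocations);            // reuses them
        System.out.println("builds: " + filter.buildCount());
    }
}
```

Whether caching pays off depends on how often the same location restriction
recurs against an unchanged index; for a restriction used only once, the
sketch does the same work as an uncached filter.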





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: QueryWrapperFilter question...

Posted by Michael Stoppelman <st...@gmail.com>.
Actually, I screwed up the timing info. I wasn't including the time for the
QueryWrapperFilter#bits(IndexReader) call. Sadly, with that call included,
the filtered search actually takes longer than the original query that had
both terms included. Bummer. I had really convinced myself until the
thought came to me at lunch :).

-M



Re: QueryWrapperFilter question...

Posted by Karl Wettin <ka...@gmail.com>.
Michael Stoppelman wrote:
> Hi all,
> I've been doing some performance testing and found that using
> QueryWrapperFilter for a location field restriction I need brings my
> search times down to 5-10ms. This was surprising.
> Before, search times were between 50ms and 100ms.
>
> The queries from before the optimization look like the following:
> +(+(text:cats) +(loc:1 loc:2 loc:3 ...))
>
> The QueryWrapperFilter runs a search itself. Why would performance be so
> drastically different when the QueryWrapperFilter also needs to do a search?
> Does Lucene just not have the statistics to optimize this query so that it
> can decide which terms to filter by first?

Are you wondering why a QueryWrapperFilter is faster than a Query? The
answer is that the filter uses a bitset to know whether a document matches
the filter query or not. For each document that matches text:cats, it checks
the flag in the bitset for that document number instead of seeking in the
index to find out whether it also matches loc:1, loc:2 or loc:3.


      karl
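
The bitset check described above can be sketched in plain Java. This is a toy
model, not Lucene code: the class BitSetFilterSketch, the docLocations array
and the hard-coded list of documents matching text:cats are all hypothetical,
chosen only to show that once the bits exist, testing a document against the
location restriction is a single bit lookup. Note that building the bits is
itself a pass over the index, which is the cost of the
QueryWrapperFilter#bits(IndexReader) call mentioned elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model of filtering by bitset; not Lucene code.
class BitSetFilterSketch {

    // Builds the filter bits once: one bit per document number that has
    // any of the wanted location values (the OR of loc:1 loc:2 loc:3 ...).
    static BitSet buildLocationBits(int[] docLocations, int... wantedLocations) {
        BitSet bits = new BitSet(docLocations.length);
        for (int doc = 0; doc < docLocations.length; doc++) {
            for (int wanted : wantedLocations) {
                if (docLocations[doc] == wanted) {
                    bits.set(doc);
                    break;
                }
            }
        }
        return bits;
    }

    // For each document matching the text clause, a single bit lookup
    // decides whether it also passes the location restriction.
    static List<Integer> applyFilter(int[] textMatches, BitSet locationBits) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : textMatches) {
            if (locationBits.get(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] docLocations = {1, 5, 2, 9, 3, 7}; // docLocations[doc] = location of doc
        int[] textMatches = {0, 1, 3, 4};        // docs assumed to match text:cats
        BitSet bits = buildLocationBits(docLocations, 1, 2, 3);
        System.out.println(applyFilter(textMatches, bits)); // prints [0, 4]
    }
}
```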
