Posted to java-user@lucene.apache.org by Rajesh Munavalli <fi...@gmail.com> on 2006/03/03 16:18:17 UTC

Query filter and span query

I am not sure whether a Query Filter would reduce the retrieval time in the
following scenario. The goal is to retrieve the top N documents in which the
query terms appear near one another.

Let

A = the set of all query terms

B = a subset of A, where

Number of query terms(B) << Number of query terms(A)

C = pairs of terms drawn from B

I select the pairs so that Number of term pairs(C) = Number of query
terms(B).
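
The post does not say how the pairs are chosen. One selection that happens to satisfy |C| = |B| is to pair each term with its successor and wrap around at the end. The sketch below assumes that scheme and assumes the terms in B are distinct; the class and method names are hypothetical and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RingPairs {
    // Hypothetical helper: pairs each term with the next one, wrapping at the end.
    // For three or more distinct terms this yields exactly b.size() distinct pairs.
    public static List<String[]> ringPairs(List<String> b) {
        List<String[]> pairs = new ArrayList<>();
        if (b.size() < 3) {
            // With fewer than three terms there are not |B| distinct unordered pairs,
            // so fall back to the single possible pair (or none).
            if (b.size() == 2) {
                pairs.add(new String[] {b.get(0), b.get(1)});
            }
            return pairs;
        }
        for (int i = 0; i < b.size(); i++) {
            pairs.add(new String[] {b.get(i), b.get((i + 1) % b.size())});
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] p : ringPairs(Arrays.asList("query", "filter", "span"))) {
            System.out.println(p[0] + " ~ " + p[1]);
        }
    }
}
```

Each resulting pair would then become one SpanNearQuery clause.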

The query I need is, in effect,

BooleanQuery(TermQuery(A) OR SpanNearQuery(C))

I see two ways to run it.

Option 1: Run it as a single query, BooleanQuery(TermQuery(A) OR SpanNearQuery(C))

Option 2:

Step1: BooleanQuery(TermQuery(A))

Step2: Use a query filter to keep only the top N documents from Step 1. N is
much smaller than the number of documents in the entire collection.

Step3: Run SpanNearQuery(C) on those top N documents only

Span queries are slower than plain term queries. Since N is far smaller than
the total number of documents in the index, is it better to run the span
queries on only those N documents, as in Option 2?
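
To make the three steps of Option 2 concrete, here is a toy in-memory model in plain Java. It is only a sketch under stated assumptions and does not use the Lucene API: documents are token arrays, the term score is the number of distinct query terms present (a stand-in for the BooleanQuery of TermQuery clauses), and "near" means within maxDistance token positions, which is not the same as Lucene's slop. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TwoStageRetrieval {
    // Step 1 scoring: number of distinct query terms found in the document.
    public static int termScore(String[] doc, Set<String> terms) {
        Set<String> seen = new HashSet<>();
        for (String tok : doc) {
            if (terms.contains(tok)) seen.add(tok);
        }
        return seen.size();
    }

    // Steps 1 and 2: ids of the n best-scoring documents (ties keep id order).
    public static List<Integer> topN(List<String[]> docs, Set<String> terms, int n) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (termScore(docs.get(i), terms) > 0) ids.add(i);
        }
        ids.sort((x, y) -> termScore(docs.get(y), terms) - termScore(docs.get(x), terms));
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }

    // Proximity test: true if a and b occur within maxDistance positions of each other.
    public static boolean near(String[] doc, String a, String b, int maxDistance) {
        int lastA = -1, lastB = -1;
        for (int i = 0; i < doc.length; i++) {
            if (doc[i].equals(a)) lastA = i;
            if (doc[i].equals(b)) lastB = i;
            if (lastA >= 0 && lastB >= 0 && Math.abs(lastA - lastB) <= maxDistance) return true;
        }
        return false;
    }

    // Step 3: reorder only the candidates by how many term pairs occur close together.
    public static List<Integer> rerank(List<String[]> docs, List<Integer> candidates,
                                       List<String[]> pairs, int maxDistance) {
        Map<Integer, Integer> hits = new HashMap<>();
        for (int id : candidates) {
            int h = 0;
            for (String[] p : pairs) {
                if (near(docs.get(id), p[0], p[1], maxDistance)) h++;
            }
            hits.put(id, h);
        }
        List<Integer> out = new ArrayList<>(candidates);
        out.sort((x, y) -> hits.get(y) - hits.get(x));
        return out;
    }

    public static void main(String[] args) {
        List<String[]> docs = Arrays.asList(
            "lucene is a fast engine for search".split(" "),
            "apache lucene search engine".split(" "),
            "cooking recipes".split(" "),
            "search lucene".split(" "));
        Set<String> terms = new HashSet<>(Arrays.asList("lucene", "search", "engine"));
        List<String[]> pairs = Collections.singletonList(new String[] {"lucene", "search"});

        List<Integer> top = topN(docs, terms, 2);
        System.out.println(top);                         // [0, 1]
        System.out.println(rerank(docs, top, pairs, 1)); // [1, 0]
    }
}
```

The expensive proximity check runs on N documents instead of the whole collection, which is where the saving would come from. The trade-off is that a document outside the top N by term score never gets the proximity check, so the result can differ from the single combined query of Option 1.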

Thanks,

Rajesh Munavalli