Posted to java-user@lucene.apache.org by Rajesh Munavalli <fi...@gmail.com> on 2006/02/10 17:31:29 UTC

query formulation

Does anyone have a good way to formulate the following query, both for
performance and for the ordering of the retrieved documents?

Query: "field1:t1 t2 t3 t4 AND field2:t5 t6 t7"

I want to achieve the following:
* The document that matches the query exactly in both fields gets rank 1.
* Documents that contain the query terms in a different order get the
subsequent ranks, according to their edit distance.

This can be achieved by ANDing two phrase queries together.
Modified Query: BooleanQuery(PhraseQuery("field1", "t1 t2 t3 t4", SLOP:m)
AND PhraseQuery("field2", "t5 t6 t7", SLOP:n))

However, I also want to retrieve (in order) those documents where one or more
of the terms are missing from either of the fields, i.e.:

Rank 1: All terms exist in both fields, within a certain slop factor

Rank 2: One term missing from one of the fields
            field1:t1 t2 t3 t4 AND field2:t5 t6
            field1:t1 t2 t3 t4 AND field2:t6 t7
            field1:t1 t2 t3 t4 AND field2:t5 t7
           ...
           ...

Rank 3: Two terms missing from either of the fields

...


Rank n: Only one term exists in each of field1 and field2
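
The rank tiers above can be sketched in plain Java, independent of Lucene. This is only a model of the desired ordering, not a query formulation: the class RankTiers and its methods are hypothetical names, field text is assumed to be whitespace-tokenized, and term order (the slop part of the requirement) is ignored. It also does not enforce that each field contains at least one query term.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: models the requested tiers, it does not use Lucene.
final class RankTiers {

    /** Counts the query terms that do not occur in the field text. */
    static int missing(List<String> queryTerms, String fieldText) {
        // Assumption: terms are separated by whitespace.
        Set<String> present = new HashSet<>(Arrays.asList(fieldText.trim().split("\\s+")));
        int count = 0;
        for (String term : queryTerms) {
            if (!present.contains(term)) {
                count++;
            }
        }
        return count;
    }

    /** Tier 1 means no term is missing; each missing term moves the document one tier down. */
    static int tier(List<String> q1, String field1, List<String> q2, String field2) {
        return 1 + missing(q1, field1) + missing(q2, field2);
    }
}
```

With query terms t1 t2 t3 t4 for field1 and t5 t6 t7 for field2, a document holding every term falls in tier 1, and a document whose field2 holds only "t5 t6" falls in tier 2. Sorting documents by ascending tier gives the ordering described above.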

Thanks,

Rajesh Munavalli

Re: query formulation

Posted by Yonik Seeley <ys...@gmail.com>.
On 2/10/06, Rajesh Munavalli <fi...@gmail.com> wrote:
> However I also want to retrieve those documents (in order) where one or more
> of the terms is missing from either of the fields. i.e,

BooleanQuery.setMinimumNumberShouldMatch() in the development version
(1.9) of Lucene may help out in that respect.

-Yonik
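
As a rough illustration of what a minimum-should-match threshold does, the following plain-Java sketch models the idea: a document is accepted only if it contains at least a given number of the optional terms. It is not Lucene code and does not call the Lucene API; the class MinShouldMatchModel and its methods are hypothetical names, and a document is assumed to be represented simply as the set of terms in one field.

```java
import java.util.List;
import java.util.Set;

// Hypothetical model of a "minimum number should match" rule; not the Lucene implementation.
final class MinShouldMatchModel {

    /** Counts how many of the optional terms occur in the document. */
    static int matchedCount(Set<String> docTerms, List<String> optionalTerms) {
        int matched = 0;
        for (String term : optionalTerms) {
            if (docTerms.contains(term)) {
                matched++;
            }
        }
        return matched;
    }

    /** Accepts the document only if at least minShouldMatch optional terms occur in it. */
    static boolean accepts(Set<String> docTerms, List<String> optionalTerms, int minShouldMatch) {
        return matchedCount(docTerms, optionalTerms) >= minShouldMatch;
    }
}
```

Under this model, a document containing only t1 and t2 is accepted for the optional terms t1 t2 t3 t4 when the threshold is 2 and rejected when it is 3. Lowering the threshold therefore admits documents with more missing terms, which is the behaviour asked about in the original question.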
