You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Walt Stoneburner <wa...@gmail.com> on 2009/01/05 20:33:15 UTC

ORs and Ranks

Got an interesting question about Lucene's behavior, as recently I was
handed something that look like this:
  ( +MEDICAL CAT^2 )  OR  ( +ANIMAL CAT^-2 )

The intention of the query is to say "if medical is found, then rank cat
[scans] high, but if animal is found then rank cat [a feline] low."

Problem is my understanding of Lucene tells me that there really is no AND /
OR set operations, and that instead Lucene has a REQUIRED, NOT_REQUIRED,
SHOULD setting for each term. As such, one might be able to "simulate"
certain kinds of AND/OR expressions, but that subexpression statements are
independently isn't happening.

Is my understanding correct, or will Lucene do this kind of magic?  The
actual query I was given to construct programatically was significantly more
complex.

-wls

Re: ORs and Ranks

Posted by Erick Erickson <er...@gmail.com>.
As you say, your "real" queries are more complex, but your
example seems like a simple boost to me joined by an OR clause.

MEDICAL:CAT^10 OR ANIMAL:CAT

which you can construct in a BooleanQuery as two clauses
and "SHOULD".

The sense of this is that a hit must contain "CAT" in either
the MEDICAL or the ANIMAL fields, but occurrences in the
MEDICAL field will tend to sort to the top....

Remember too that Lucene query logic isn't strictly Boolean...

Best
Erick

On Mon, Jan 5, 2009 at 2:33 PM, Walt Stoneburner <walt.stoneburner@gmail.com
> wrote:

> Got an interesting question about Lucene's behavior, as recently I was
> handed something that look like this:
>  ( +MEDICAL CAT^2 )  OR  ( +ANIMAL CAT^-2 )
>
> The intention of the query is to say "if medical is found, then rank cat
> [scans] high, but if animal is found then rank cat [a feline] low."
>
> Problem is my understanding of Lucene tells me that there really is no AND
> /
> OR set operations, and that instead Lucene has a REQUIRED, NOT_REQUIRED,
> SHOULD setting for each term. As such, one might be able to "simulate"
> certain kinds of AND/OR expressions, but that subexpression statements are
> independently isn't happening.
>
> Is my understanding correct, or will Lucene do this kind of magic?  The
> actual query I was given to construct programatically was significantly
> more
> complex.
>
> -wls
>