Posted to java-user@lucene.apache.org by Maxim Patramanskij <ma...@osua.de> on 2005/10/24 19:13:55 UTC

Cross-field multi-word and query

I have the following problem:

I need to programmatically construct a Boolean query against n fields
for a query containing m words.

All possible unique combinations (sub-queries) are combined disjunctively
(OR) with each other, while the boolean clauses within each combination
are combined with the AND operator.

The reason for this complexity is that I have to evaluate an AND query
against several fields, where parts of my query could appear in
different fields, and I can't create just one single field because each
field has its own boost level.

Does anyone have experience writing such a query builder?

Best regards,
 Maxim


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re[3]: Cross-field multi-word and query

Posted by Maxim Patramanskij <ma...@osua.de>.
Hello Chris,

Thanks a lot for the helping hand.

I plugged in MaxDisjunctionQuery and it is working so far, but I still
need to check its accuracy. The next problem I ran into is the
highlighter, which must be adapted to understand MaxDisjunctionQuery
(right now it stops highlighting anything because the new Query type is
unknown to it), but that is another story :).

Greetings,
Max



Wednesday, October 26, 2005, 12:23:39 AM, you wrote:


CH> : I have n fields, for simplicity let's say 3: f1, f2, f3.
CH> : I have an AND query with m words in it, let's also simplify: w1, w2, w3.
CH> :
CH> : To cover all possible cases I should finally have the following
CH> : BooleanQuery:

CH> It really depends on what you want.  If I understand what you mean in the
CH> query below (I'm assuming you want all of those boolean queries to
CH> themselves be optional clauses in one big wrapping BooleanQuery) then I
CH> think you can accomplish roughly the same thing using one BooleanQuery
CH> wrapping a MaxDisjunctionQuery for each word (where each unique clause of a
CH> MaxDisjunctionQuery is for a different field).

CH> Expressed in the syntax used by Chuck Williams (the author of
CH> MaxDisjunctionQuery), what I mean is...

CH> ( +(f1:w1 | f2:w1 | f3:w1) +(f1:w2 | f2:w2 | f3:w2) +(f1:w3 | f2:w3 | f3:w3))

CH> ...this will guarantee that every matching document contains all three
CH> words, each in at least one of the three fields.  It will also result in the
CH> score contribution for each word being dominated by whichever field results
CH> in a Term that generates the highest score for that word.

CH> Please look at the "albino elephant" example provided by Chuck in his
CH> initial issue report...

CH>         http://issues.apache.org/jira/browse/LUCENE-323

CH> :
CH> : (+(f1:w1) +(f1:w2) +(f1:w3))
CH> : (+(f2:w1) +(f2:w2) +(f2:w3))
CH> : (+(f3:w1) +(f3:w2) +(f3:w3))
CH> :
CH> : (+(f1:w1) +(f2:w2) +(f3:w3))
CH> : (+(f1:w2) +(f2:w1) +(f3:w3))
CH> : (+(f1:w3) +(f2:w1) +(f3:w2))
CH> : (+(f1:w1) +(f2:w3) +(f3:w2))
CH> : (+(f1:w2) +(f2:w3) +(f3:w1))
CH> : (+(f1:w3) +(f2:w2) +(f3:w1))
CH> :
CH> : (+(f1:w1) +(f1:w2) +(f2:w3))
CH> : (+(f1:w1) +(f1:w2) +(f3:w3))
CH> :
CH> : (+(f1:w1) +(f1:w3) +(f2:w2))
CH> : (+(f1:w1) +(f1:w3) +(f3:w2))
CH> :
CH> : (+(f1:w2) +(f1:w3) +(f2:w1))
CH> : (+(f1:w2) +(f1:w3) +(f3:w1))
CH> :
CH> : (+(f2:w1) +(f2:w2) +(f1:w3))
CH> : (+(f2:w1) +(f2:w2) +(f3:w3))
CH> :
CH> : (+(f2:w1) +(f2:w3) +(f1:w2))
CH> : (+(f2:w1) +(f2:w3) +(f3:w2))
CH> :
CH> : (+(f2:w2) +(f2:w3) +(f1:w1))
CH> : (+(f2:w2) +(f2:w3) +(f3:w1))
CH> :
CH> : (+(f3:w1) +(f3:w2) +(f1:w3))
CH> : (+(f3:w1) +(f3:w2) +(f2:w3))
CH> :
CH> : (+(f3:w1) +(f3:w3) +(f1:w2))
CH> : (+(f3:w1) +(f3:w3) +(f2:w2))
CH> :
CH> : (+(f3:w2) +(f3:w3) +(f1:w1))
CH> : (+(f3:w2) +(f3:w3) +(f2:w1))


CH> -Hoss




Re[2]: Cross-field multi-word and query

Posted by Chris Hostetter <ho...@fucit.org>.
: I have n fields, for simplicity let's say 3: f1, f2, f3.
: I have an AND query with m words in it, let's also simplify: w1, w2, w3.
:
: To cover all possible cases I should finally have the following
: BooleanQuery:

It really depends on what you want.  If I understand what you mean in the
query below (I'm assuming you want all of those boolean queries to
themselves be optional clauses in one big wrapping BooleanQuery) then I
think you can accomplish roughly the same thing using one BooleanQuery
wrapping a MaxDisjunctionQuery for each word (where each unique clause of a
MaxDisjunctionQuery is for a different field).

Expressed in the syntax used by Chuck Williams (the author of
MaxDisjunctionQuery), what I mean is...

( +(f1:w1 | f2:w1 | f3:w1) +(f1:w2 | f2:w2 | f3:w2) +(f1:w3 | f2:w3 | f3:w3))

...this will guarantee that every matching document contains all three
words, each in at least one of the three fields.  It will also result in the
score contribution for each word being dominated by whichever field results
in a Term that
generates the highest score for that word.
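
A minimal sketch of how the query text above could be generated for any number of fields and words. This is plain Java that only assembles the string in the `|` notation shown above; it does not call Lucene, and the class and method names (`CrossFieldQueryText`, `build`) are hypothetical. Per-field boosts are left out and would have to be attached to each `field:word` clause separately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds the "AND of per-word max-disjunctions" query text. */
class CrossFieldQueryText {

    /** One required clause per word; each clause lists that word once per field. */
    static String build(List<String> fields, List<String> words) {
        List<String> clauses = new ArrayList<>();
        for (String word : words) {
            List<String> perField = new ArrayList<>();
            for (String field : fields) {
                perField.add(field + ":" + word);
            }
            // '+' marks the clause as required; '|' separates the per-field alternatives.
            clauses.add("+(" + String.join(" | ", perField) + ")");
        }
        return "( " + String.join(" ", clauses) + ")";
    }

    public static void main(String[] args) {
        // Prints the three-field, three-word query from the message above.
        System.out.println(build(Arrays.asList("f1", "f2", "f3"),
                                 Arrays.asList("w1", "w2", "w3")));
    }
}
```

The number of clauses grows as n * m (fields times words), compared with n^m sub-queries for the fully expanded list.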

Please look at the "albino elephant" example provided by Chuck in his
initial issue report...

	http://issues.apache.org/jira/browse/LUCENE-323
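
A toy illustration of why taking the maximum per word matters for ranking. It is a sketch under assumptions, not Lucene's scoring formula: the scores are made-up numbers, the clauses are treated as optional, and the names (`MaxVersusSumScoring`, `sumScore`, `maxScore`) are hypothetical. Document A has "albino" in both fields and no "elephant"; document B has "albino" in one field and "elephant" in the other.

```java
import java.util.Arrays;
import java.util.List;

/** Toy comparison of sum-based and max-based combination of per-field scores. */
class MaxVersusSumScoring {

    /** Plain disjunction: every matching field clause adds its score. */
    static double sumScore(List<double[]> perWordFieldScores) {
        double total = 0.0;
        for (double[] fieldScores : perWordFieldScores) {
            for (double s : fieldScores) {
                total += s;
            }
        }
        return total;
    }

    /** Max-disjunction per word: only the best field counts, then words are summed. */
    static double maxScore(List<double[]> perWordFieldScores) {
        double total = 0.0;
        for (double[] fieldScores : perWordFieldScores) {
            double best = 0.0;
            for (double s : fieldScores) {
                best = Math.max(best, s);
            }
            total += best;
        }
        return total;
    }

    public static void main(String[] args) {
        // Outer list: one entry per word (albino, elephant); inner array: score per field.
        List<double[]> docA = Arrays.asList(new double[] {1.0, 1.0}, new double[] {0.0, 0.0});
        List<double[]> docB = Arrays.asList(new double[] {1.0, 0.0}, new double[] {0.0, 1.0});

        System.out.println(sumScore(docA) + " vs " + sumScore(docB)); // 2.0 vs 2.0
        System.out.println(maxScore(docA) + " vs " + maxScore(docB)); // 1.0 vs 2.0
    }
}
```

With summing, the document that repeats one word in two fields ties with the document that contains both words; with the per-word maximum, the document containing both words ranks higher. In the query suggested above each per-word clause is also required, so document A would not match at all.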

:
: (+(f1:w1) +(f1:w2) +(f1:w3))
: (+(f2:w1) +(f2:w2) +(f2:w3))
: (+(f3:w1) +(f3:w2) +(f3:w3))
:
: (+(f1:w1) +(f2:w2) +(f3:w3))
: (+(f1:w2) +(f2:w1) +(f3:w3))
: (+(f1:w3) +(f2:w1) +(f3:w2))
: (+(f1:w1) +(f2:w3) +(f3:w2))
: (+(f1:w2) +(f2:w3) +(f3:w1))
: (+(f1:w3) +(f2:w2) +(f3:w1))
:
: (+(f1:w1) +(f1:w2) +(f2:w3))
: (+(f1:w1) +(f1:w2) +(f3:w3))
:
: (+(f1:w1) +(f1:w3) +(f2:w2))
: (+(f1:w1) +(f1:w3) +(f3:w2))
:
: (+(f1:w2) +(f1:w3) +(f2:w1))
: (+(f1:w2) +(f1:w3) +(f3:w1))
:
: (+(f2:w1) +(f2:w2) +(f1:w3))
: (+(f2:w1) +(f2:w2) +(f3:w3))
:
: (+(f2:w1) +(f2:w3) +(f1:w2))
: (+(f2:w1) +(f2:w3) +(f3:w2))
:
: (+(f2:w2) +(f2:w3) +(f1:w1))
: (+(f2:w2) +(f2:w3) +(f3:w1))
:
: (+(f3:w1) +(f3:w2) +(f1:w3))
: (+(f3:w1) +(f3:w2) +(f2:w3))
:
: (+(f3:w1) +(f3:w3) +(f1:w2))
: (+(f3:w1) +(f3:w3) +(f2:w2))
:
: (+(f3:w2) +(f3:w3) +(f1:w1))
: (+(f3:w2) +(f3:w3) +(f2:w1))


-Hoss




Re[2]: Cross-field multi-word and query

Posted by Maxim Patramanskij <ma...@osua.de>.
Hello Chris,

Thanks for the tip.

However, I'm not sure how I can implement the following with
MaxDisjunctionQuery:

I have n fields, for simplicity let's say 3: f1, f2, f3.
I have an AND query with m words in it, let's also simplify: w1, w2, w3.

To cover all possible cases I should finally have the following
BooleanQuery:

(+(f1:w1) +(f1:w2) +(f1:w3))
(+(f2:w1) +(f2:w2) +(f2:w3))
(+(f3:w1) +(f3:w2) +(f3:w3))

(+(f1:w1) +(f2:w2) +(f3:w3))
(+(f1:w2) +(f2:w1) +(f3:w3))
(+(f1:w3) +(f2:w1) +(f3:w2))
(+(f1:w1) +(f2:w3) +(f3:w2))
(+(f1:w2) +(f2:w3) +(f3:w1))
(+(f1:w3) +(f2:w2) +(f3:w1))

(+(f1:w1) +(f1:w2) +(f2:w3))
(+(f1:w1) +(f1:w2) +(f3:w3))

(+(f1:w1) +(f1:w3) +(f2:w2))
(+(f1:w1) +(f1:w3) +(f3:w2))

(+(f1:w2) +(f1:w3) +(f2:w1))
(+(f1:w2) +(f1:w3) +(f3:w1))

(+(f2:w1) +(f2:w2) +(f1:w3))
(+(f2:w1) +(f2:w2) +(f3:w3))

(+(f2:w1) +(f2:w3) +(f1:w2))
(+(f2:w1) +(f2:w3) +(f3:w2))

(+(f2:w2) +(f2:w3) +(f1:w1))
(+(f2:w2) +(f2:w3) +(f3:w1))

(+(f3:w1) +(f3:w2) +(f1:w3))
(+(f3:w1) +(f3:w2) +(f2:w3))

(+(f3:w1) +(f3:w3) +(f1:w2))
(+(f3:w1) +(f3:w3) +(f2:w2))

(+(f3:w2) +(f3:w3) +(f1:w1))
(+(f3:w2) +(f3:w3) +(f2:w1))


If I'm wrong and this query can be simplified into one containing several
MaxDisjunctionQuery, it would be great.
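
On whether the list can be simplified: the 27 sub-queries are every way of assigning each of the 3 words to one of the 3 fields (3^3 = 27), and an OR over all such AND combinations matches the same documents as an AND over words of an OR over fields. The sketch below checks this by brute force in plain Java. It is not Lucene code, it only compares matching (scores will differ between the two forms), and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Brute-force check that the expanded and the compact query forms match the same documents. */
class CombinationEquivalence {

    /** Every way to assign each word to one field: fieldCount^wordCount assignments. */
    static List<int[]> assignments(int fieldCount, int wordCount) {
        int total = 1;
        for (int w = 0; w < wordCount; w++) total *= fieldCount;
        List<int[]> out = new ArrayList<>();
        for (int code = 0; code < total; code++) {
            int[] fieldOfWord = new int[wordCount];
            int rest = code;
            for (int w = 0; w < wordCount; w++) {
                fieldOfWord[w] = rest % fieldCount; // read 'code' as a base-fieldCount number
                rest /= fieldCount;
            }
            out.add(fieldOfWord);
        }
        return out;
    }

    /** Expanded form: OR over assignments of (AND over words: word is in its assigned field). */
    static boolean matchesExpanded(List<Set<String>> doc, List<String> words) {
        for (int[] fieldOfWord : assignments(doc.size(), words.size())) {
            boolean all = true;
            for (int w = 0; w < words.size() && all; w++) {
                all = doc.get(fieldOfWord[w]).contains(words.get(w));
            }
            if (all) return true;
        }
        return false;
    }

    /** Compact form: AND over words of (OR over fields: word is in that field). */
    static boolean matchesCompact(List<Set<String>> doc, List<String> words) {
        for (String word : words) {
            boolean found = false;
            for (Set<String> field : doc) found |= field.contains(word);
            if (!found) return false;
        }
        return true;
    }

    /** Tries every possible document (each word present or absent in each field). */
    static boolean formsAgree(int fieldCount, List<String> words) {
        int bits = fieldCount * words.size();
        for (int mask = 0; mask < (1 << bits); mask++) {
            List<Set<String>> doc = new ArrayList<>();
            for (int f = 0; f < fieldCount; f++) {
                Set<String> field = new HashSet<>();
                for (int w = 0; w < words.size(); w++) {
                    if ((mask & (1 << (f * words.size() + w))) != 0) field.add(words.get(w));
                }
                doc.add(field);
            }
            if (matchesExpanded(doc, words) != matchesCompact(doc, words)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(assignments(3, 3).size());                        // 27
        System.out.println(formsAgree(3, Arrays.asList("w1", "w2", "w3"))); // true
    }
}
```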

Greetings,
Max



Tuesday, October 25, 2005, 9:41:13 AM, you wrote:


CH> I may be wrong, but I think what you are talking about is a BooleanQuery
CH> containing several MaxDisjunctionQuery clauses.  Take a look at the code
CH> in this patch...

CH>         http://issues.apache.org/jira/browse/LUCENE-323





CH> -Hoss




Re: Cross-field multi-word and query

Posted by Chris Hostetter <ho...@fucit.org>.
I may be wrong, but I think what you are talking about is a BooleanQuery
containing several MaxDisjunctionQuery clauses.  Take a look at the code
in this patch...

	http://issues.apache.org/jira/browse/LUCENE-323


: Date: Mon, 24 Oct 2005 20:13:55 +0300
: From: Maxim Patramanskij <ma...@osua.de>
: Reply-To: java-user@lucene.apache.org, Maxim Patramanskij <ma...@osua.de>
: To: java-user@lucene.apache.org
: Subject: Cross-field multi-word and query
:
:
: I have the following problem:
:
: I need to programmatically construct a Boolean query against n fields
: for a query containing m words.
:
: All possible unique combinations (sub-queries) are combined disjunctively
: (OR) with each other, while the boolean clauses within each combination
: are combined with the AND operator.
:
: The reason for this complexity is that I have to evaluate an AND query
: against several fields, where parts of my query could appear in
: different fields, and I can't create just one single field because each
: field has its own boost level.
:
: Does anyone have experience writing such a query builder?
:
: Best regards,
:  Maxim



-Hoss

