You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Muralidharan V <mu...@gmail.com> on 2009/01/26 07:42:43 UTC

cross-field AND queries with field boosting

Hi,

    We have documents with multiple fields conceptually, and a document is
considered a match if each of the terms in the query is in any one of the
fields(i.e a 'cross-field' AND). A simple way to do this would be to dump
all of these conceptual fields into one lucene field and do the query with a
default AND_OPERATOR. However another requirement is that some fields are
more important than others and need to be boosted with different weights.
One option that I can think of is a MultiFieldQuery that essentially looks
like (field1:term1 OR field2:term1 OR field3:term1) AND (field1:term2 OR
field2:term2 OR field3:term2) etc with appropriate field boosts. However I'm
concerned about the performance of this query for a large number of terms(We
might need to deal with 4-5 fields and 4-5 terms per query). Is there a
better solution?

Thanks,
Murali

Re: cross-field AND queries with field boosting

Posted by Muralidharan V <mu...@gmail.com>.
Karsten,

         Thanks for the suggestion. After some research, payloads and
BoostingTermQuery is what we ended up using.

Thanks,
Murali

On Wed, Jan 28, 2009 at 2:13 AM, Karsten F.
<ka...@fiz-technik.de>wrote:

>
> Hi Murali,
>
> I think a search with 4 * 5 = 20 Boolean Clauses will not be a performance
> problem
> (at least if you have only one optimized index-folder).
>
> You also could use one Field which contains content of all other fields
> with
> a boost factor for each term (different boost for content from different
> fields).
> You can do this with payloads and the BoostingTermQuery.
> See e.g.
>
> http://www.nabble.com/Newbie-Question-on---Multi-valued-Keyword-field-search-td20987615.html
>
> But this has the drawback that you loss the original frequencies of the
> fields, so possible the scoring will not show want you want.
>
> Best regards
>  Karsten
>
>
> Murali-7 wrote:
> >
> > Hi,
> >
> >     We have documents with multiple fields conceptually, and a document
> is
> > considered a match if each of the terms in the query is in any one of the
> > fields(i.e a 'cross-field' AND). A simple way to do this would be to dump
> > all of these conceptual fields into one lucene field and do the query
> with
> > a
> > default AND_OPERATOR. However another requirement is that some fields are
> > more important than others and need to be boosted with different weights.
> > One option that I can think of is a MultiFieldQuery that essentially
> looks
> > like (field1:term1 OR field2:term1 OR field3:term1) AND (field1:term2 OR
> > field2:term2 OR field3:term2) etc with appropriate field boosts. However
> > I'm
> > concerned about the performance of this query for a large number of
> > terms(We
> > might need to deal with 4-5 fields and 4-5 terms per query). Is there a
> > better solution?
> >
> > Thanks,
> > Murali
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/cross-field-AND-queries-with-field-boosting-tp21661099p21703051.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: cross-field AND queries with field boosting

Posted by "Karsten F." <ka...@fiz-technik.de>.
Hi Murali,

I think a search with 4 * 5 = 20 Boolean Clauses will not be a performance
problem 
(at least if you have only one optimized index-folder).

You also could use one Field which contains content of all other fields with
a boost factor for each term (different boost for content from different
fields).
You can do this with payloads and the BoostingTermQuery.
See e.g.
http://www.nabble.com/Newbie-Question-on---Multi-valued-Keyword-field-search-td20987615.html

But this has the drawback that you loss the original frequencies of the
fields, so possible the scoring will not show want you want.

Best regards
  Karsten


Murali-7 wrote:
> 
> Hi,
> 
>     We have documents with multiple fields conceptually, and a document is
> considered a match if each of the terms in the query is in any one of the
> fields(i.e a 'cross-field' AND). A simple way to do this would be to dump
> all of these conceptual fields into one lucene field and do the query with
> a
> default AND_OPERATOR. However another requirement is that some fields are
> more important than others and need to be boosted with different weights.
> One option that I can think of is a MultiFieldQuery that essentially looks
> like (field1:term1 OR field2:term1 OR field3:term1) AND (field1:term2 OR
> field2:term2 OR field3:term2) etc with appropriate field boosts. However
> I'm
> concerned about the performance of this query for a large number of
> terms(We
> might need to deal with 4-5 fields and 4-5 terms per query). Is there a
> better solution?
> 
> Thanks,
> Murali
> 
> 

-- 
View this message in context: http://www.nabble.com/cross-field-AND-queries-with-field-boosting-tp21661099p21703051.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org