Posted to solr-user@lucene.apache.org by santamaria2 <ar...@contify.com> on 2012/06/05 14:50:40 UTC

Is it faster to search over many different fields or one field that combines the values of all those other fields?

Say I have various categories of 'tags'. I want a keyword search to search
through my index of articles. So I search over:
1) the title.
2) the body
3) about 10 of these tag-categories. Each tag category is multivalued with a
few words per value.

Without considering the effect on 'relevance', and using the standard Lucene
query parser, would it be faster to specify each of these 10 fields in q (q
= cat1:keyword OR cat2:keyword OR ... ), or to copyField the contents of those
10 fields into one combined field?
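
For concreteness, the two alternatives can be written out as query strings.
This is only a sketch: the field names cat1..cat10 and the combined field
name all_tags are assumptions for illustration (the post only says "about
10" tag categories), and the keyword is assumed to be a single term with no
special characters that would need escaping.

```python
# Hypothetical field names, assumed for illustration only.
TAG_FIELDS = [f"cat{i}" for i in range(1, 11)]

# Assumed copyField destination. The schema would need one line per tag
# field, along the lines of: <copyField source="cat1" dest="all_tags"/>
COMBINED_FIELD = "all_tags"


def multi_field_query(keyword, fields=TAG_FIELDS):
    """Option 1: one clause per field, joined with OR."""
    return " OR ".join(f"{field}:{keyword}" for field in fields)


def combined_field_query(keyword, field=COMBINED_FIELD):
    """Option 2: a single clause against the combined field."""
    return f"{field}:{keyword}"


if __name__ == "__main__":
    print(multi_field_query("solr"))
    print(combined_field_query("solr"))
```

Either string would go in the q parameter. Title and body would be added to
the field list (or copied into the combined field) in the same way.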

Or is it such that I should be slapped in the face for even thinking about
performance in this scenario?

--
View this message in context: http://lucene.472066.n3.nabble.com/Is-it-faster-to-search-over-many-different-fields-or-one-field-that-combines-the-values-of-all-those-tp3987766.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Is it faster to search over many different fields or one field that combines the values of all those other fields?

Posted by Gora Mohanty <go...@mimirtech.com>.
On 5 June 2012 22:05, Mikhail Khludnev <mk...@griddynamics.com> wrote:
> IIRC, the Lucene in Action book comes back to this point in almost every
> chapter: a multi-field query is faster.
[...]

Surely this depends on the type and volume of one's
data? As with many issues, isn't the answer that "it depends",
i.e., one should prototype and take objective measurements on
one's own data sets?
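
A rough sketch of what such a measurement could look like is below. The
function names median_latency and compare are made up for illustration, and
each candidate is assumed to be a zero-argument callable that issues one
query against your own index. It is not a substitute for a proper load test.

```python
import statistics
import time


def median_latency(run_query, repeats=50, warmup=5):
    """Return the median wall-clock seconds of run_query() over `repeats` calls."""
    for _ in range(warmup):  # warm caches before measuring
        run_query()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(candidates, **kwargs):
    """Map each candidate name to its median latency in seconds."""
    return {name: median_latency(fn, **kwargs) for name, fn in candidates.items()}
```

One would pass something like {"multi_field": f, "combined_field": g}, where
f and g each send the corresponding query to the same server, and compare
the two medians over a realistic mix of keywords.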

Would love to be educated otherwise.

Regards,
Gora

P.S. Have to get that book.

Re: Is it faster to search over many different fields or one field that combines the values of all those other fields?

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
IIRC, the Lucene in Action book comes back to this point in almost every
chapter: a multi-field query is faster.

On Tue, Jun 5, 2012 at 7:04 PM, Jack Krupansky <ja...@basetechnology.com>wrote:

> There may be a raw performance advantage to having all values in a single
> combined field, but then you lose the opportunity to boost title and tag
> field hits.
>
> With the extended dismax query parser you have the ability to specify the
> field list in the "qf" request parameter so that the query can simply be
> the keywords and operators without all of the extra "OR" operators. qf also
> lets you specify the boost for each field.
>
> -- Jack Krupansky



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
 <mk...@griddynamics.com>

Re: Is it faster to search over many different fields or one field that combines the values of all those other fields?

Posted by Jack Krupansky <ja...@basetechnology.com>.
There may be a raw performance advantage to having all values in a single 
combined field, but then you lose the opportunity to boost title and tag 
field hits.

With the extended dismax query parser you have the ability to specify the 
field list in the "qf" request parameter so that the query can simply be the 
keywords and operators without all of the extra "OR" operators. qf also lets 
you specify the boost for each field.
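
As a rough illustration, the request parameters for such a query could be
assembled like this. The field names and boost values are assumptions picked
for the example. defType, q and qf are the actual Solr parameter names, and
field^boost is the qf syntax.

```python
from urllib.parse import urlencode

# Hypothetical fields and boosts, chosen only for illustration.
FIELD_BOOSTS = {"title": 3.0, "body": 1.0, "cat1": 2.0, "cat2": 2.0}


def edismax_params(keywords, field_boosts=FIELD_BOOSTS):
    """Build request parameters for an edismax query with per-field boosts."""
    # qf is a space-separated list of field^boost entries.
    qf = " ".join(f"{field}^{boost:g}" for field, boost in field_boosts.items())
    return {"defType": "edismax", "q": keywords, "qf": qf}


if __name__ == "__main__":
    params = edismax_params("lucene performance")
    print(params["qf"])
    print(urlencode(params))  # query string to append to the select URL
```

The q value stays a plain list of keywords, and the per-field weighting
lives entirely in qf.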

-- Jack Krupansky


Re: Is it faster to search over many different fields or one field that combines the values of all those other fields?

Posted by Michael Della Bitta <mi...@appinions.com>.
I don't have the answer to your question, but I certainly don't think
anybody should be slapped in the face for asking a question!

Michael Della Bitta

------------------------------------------------
Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com

