Posted to solr-user@lucene.apache.org by Steven White <sw...@gmail.com> on 2019/07/31 12:51:57 UTC

Single field in "qf" vs multiple

Hi everyone,

I'm indexing my data into multiple Solr fields, such as A, B, C and I'm
also copying all the data of those fields into a master field such as X.

By default, my "qf" is set to X, so any time a user searches they are
searching across the data that also exists in fields A, B and C.

In some use cases, I need to narrow down a user's search to just A and C, or
A only, etc.  When that happens, I dynamically set "qf" at run time to "A C"
or just "A".

My question is this: will the search quality and ranking be different if I
simply set "qf" to "A B C" and avoid the copy operation to "X" (it will
save me disk space)?  Will there be a performance impact if I do this?  Is
there a limit on the number of fields I should list in "qf"?

Thanks,

Steven

Re: Single field in "qf" vs multiple

Posted by Erick Erickson <er...@gmail.com>.
The short answer is “yes, ranking will be different”. This is inevitable because
the statistics of your X field differ from those of A, B and C: X holds more
terms, the frequency of any given term is different, etc.
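
To make the point about field statistics concrete, here is a minimal sketch
using a made-up four-document corpus. It assumes BM25-style scoring and uses
the usual BM25 IDF formula; the documents, field contents and helper names are
all hypothetical, and real scores also depend on field length and term
frequency, which are not modelled here.

```python
import math

def bm25_idf(doc_count, doc_freq):
    """BM25-style inverse document frequency for one term in one field."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

# Toy corpus (made-up data): every document has fields A, B and C.
docs = [
    {"A": "solr search", "B": "lucene index", "C": "query parser"},
    {"A": "ranking tips", "B": "solr config", "C": "disk space"},
    {"A": "field boosts", "B": "copy field", "C": "solr schema"},
    {"A": "query time", "B": "term stats", "C": "index size"},
]
fields = ("A", "B", "C")

def doc_freq(texts, term):
    """Number of texts that contain the term at least once."""
    return sum(1 for text in texts if term in text.split())

term = "solr"
n = len(docs)

# Document frequency of the term in each separate field.
per_field = {f: doc_freq([d[f] for d in docs], term) for f in fields}

# Document frequency in a catch-all field X built by concatenating A, B and C.
combined = doc_freq([" ".join(d[f] for f in fields) for d in docs], term)

idf_per_field = {f: bm25_idf(n, df) for f, df in per_field.items()}
idf_combined = bm25_idf(n, combined)

print(per_field, combined)
print(idf_per_field, idf_combined)
```

In this toy data the term is rare in each individual field but common in the
combined field, so its IDF in X is lower than in A, B or C.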

I’d argue, though, that qf with a list of fields can be tweaked to give
you better results. For instance, you can boost the fields individually with
different weights. The canonical example is fields like title, summary
and body, where you can assume that matches in title are more important
than matches in summary, which are in turn more important than matches in
body, and do something like:
qf=title^5 summary^2 body
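
As a rough sketch of how a client might assemble such a qf value at run time,
the snippet below builds the request parameters and URL-encodes them. The
helper names are hypothetical, and it assumes the edismax query parser is in
use (the thread does not say which parser is configured); nothing is sent to
a server.

```python
from urllib.parse import urlencode

def build_qf(boosts):
    """Turn {"title": 5, "body": 1} into the qf string 'title^5 body'."""
    parts = []
    for field, boost in boosts.items():
        # A boost of 1 is the default, so the field name alone is enough.
        parts.append(field if boost == 1 else f"{field}^{boost}")
    return " ".join(parts)

def build_select_params(user_query, boosts):
    """Assemble request parameters; defType=edismax is an assumption."""
    return {"q": user_query, "defType": "edismax", "qf": build_qf(boosts)}

# Boosted search across all three fields.
params = build_select_params("disk space", {"title": 5, "summary": 2, "body": 1})
query_string = urlencode(params)

# Narrowed search: only some fields, no boosts.
narrow_qf = build_qf({"A": 1, "C": 1})

print(params["qf"])
print(query_string)
print(narrow_qf)
```

The same helper covers the narrowing case from the original question: passing
only the fields of interest yields a qf such as "A C".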

Best,
Erick
