Posted to solr-user@lucene.apache.org by Steven White <sw...@gmail.com> on 2015/11/20 00:46:19 UTC

Number of fields in qf & fq

Hi everyone

What is considered too many fields for qf and fq?  On average I will have
1500 fields in qf and 100 in fq (all of which are OR'ed).  For qf, assuming
I can cut it down to 1 field (I have to check with the design), will I see
a noticeable performance improvement?  It will take a lot of effort to test
this, which is why I'm asking first.

As is, I'm seeing 2-5 sec response times for searches on an index of 1
million records with a total index size (on disk) of 4 GB.  I gave Solr 2 GB
of RAM (also tested at 4 GB); in both cases Solr didn't use more than 1 GB.

Thanks in advance

Steve

Re: Number of fields in qf & fq

Posted by Scott Stults <ss...@opensourceconnections.com>.
Steve,

Another thing debugQuery will give you is a breakdown of how much each
field contributed to the final score of each hit. That's going to give you
a nice shopping list of qf fields to weed out.
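
A minimal Python sketch of that shopping list, assuming the JSON response carries debug.explain as a map from document id to plain explain text. The field names and explain strings below are made up for illustration, and the exact explain wording varies with Solr version and similarity, so treat the regular expression as an assumption to adjust. It only sees the hits actually returned, so it is worth running over a representative set of queries.

```python
import re
from collections import Counter

# Picks the field name out of explain fragments like "weight(title:solr in 12)".
# Assumes field names consist of word characters only.
FIELD_IN_WEIGHT = re.compile(r"weight\((\w+):")


def fields_in_explain(explain_by_doc):
    """Count, per field, how many hits that field contributed a score to."""
    counts = Counter()
    for text in explain_by_doc.values():
        counts.update(set(FIELD_IN_WEIGHT.findall(text)))
    return counts


def unused_fields(qf_fields, explain_by_doc):
    """Return the qf fields that never appear in any hit's explanation."""
    seen = fields_in_explain(explain_by_doc)
    return [field for field in qf_fields if field not in seen]


# Hypothetical explain output for two hits.
sample_explain = {
    "doc1": (
        "1.2 = max of:\n"
        "  1.2 = weight(title:solr in 0) [DefaultSimilarity], result of:\n"
        "  0.3 = weight(body:solr in 0) [DefaultSimilarity], result of:"
    ),
    "doc2": "0.4 = weight(body:solr in 7) [DefaultSimilarity], result of:",
}

print(unused_fields(["title", "body", "notes"], sample_explain))  # ['notes']
```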


k/r,
Scott

On Fri, Nov 20, 2015 at 9:26 AM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:

> Hello Steve,
>
> debugQuery=true shows whether the time goes to facets or to the query, and
> whether it is spent in query parsing or in searching (prepare vs. process).
> Cache statistics tell you how efficient the caches are, and sometimes a
> problem is obvious from the request parameters. Simple sampling with
> jconsole, or even jstack, can point to a smoking gun.
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> <mk...@griddynamics.com>
>



-- 
Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
| 434.409.2780
http://www.opensourceconnections.com

Re: Number of fields in qf & fq

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello Steve,

debugQuery=true shows whether the time goes to facets or to the query, and
whether it is spent in query parsing or in searching (prepare vs. process).
Cache statistics tell you how efficient the caches are, and sometimes a
problem is obvious from the request parameters. Simple sampling with
jconsole, or even jstack, can point to a smoking gun.
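
A rough Python sketch of reading the prepare vs. process breakdown, assuming the JSON response format and the usual debug.timing layout (a "prepare" and a "process" section, each with per-component "time" values in milliseconds). The core URL, the edismax parser choice, and the sample numbers are hypothetical and only there to make the example self-contained.

```python
from urllib.parse import urlencode


def build_debug_url(base_url, query, qf_fields):
    """Build a select URL that asks Solr for debug output."""
    params = {
        "q": query,
        "defType": "edismax",  # assumption: an edismax-style parser is in use
        "qf": " ".join(qf_fields),
        "debugQuery": "true",
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


def component_times(response):
    """Flatten debug.timing into {(phase, component): milliseconds}."""
    timing = response.get("debug", {}).get("timing", {})
    times = {}
    for phase in ("prepare", "process"):
        for name, value in timing.get(phase, {}).items():
            # Skip the phase total, which is a plain number rather than a dict.
            if isinstance(value, dict) and "time" in value:
                times[(phase, name)] = value["time"]
    return times


def slowest(response):
    """Return the (phase, component) pair with the largest reported time."""
    times = component_times(response)
    return max(times, key=times.get) if times else None


# Hypothetical response fragment, not measured data.
sample = {
    "debug": {
        "timing": {
            "time": 2300.0,
            "prepare": {"time": 40.0, "query": {"time": 40.0}, "facet": {"time": 0.0}},
            "process": {"time": 2260.0, "query": {"time": 2100.0}, "facet": {"time": 160.0}},
        }
    }
}

print(build_debug_url("http://localhost:8983/solr/mycore", "solr", ["title", "body"]))
print(slowest(sample))  # ('process', 'query')
```

If the slowest entry is ("process", "query"), the time is going into searching rather than parsing or faceting, which is where the qf field count would show up.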

On Fri, Nov 20, 2015 at 4:08 PM, Steven White <sw...@gmail.com> wrote:

> Thanks Erick.
>
> The 1500 fields is a design that I inherited.  I'm trying to figure out why
> it was done as such and what it will take to fix it.
>
> What about my other question: how does one go about debugging performance
> issues in Solr to find out where most of the time is spent?  How do I know
> whether my Solr parameters, such as the cache settings, are set correctly?
> From what I see, we are using the defaults from solrconfig.xml.
>
> I'm on Solr 5.2
>
> Steve
>
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>

Re: Number of fields in qf & fq

Posted by Steven White <sw...@gmail.com>.
Thanks Erick.

The 1500 fields is a design that I inherited.  I'm trying to figure out why
it was done as such and what it will take to fix it.

What about my other question: how does one go about debugging performance
issues in Solr to find out where most of the time is spent?  How do I know
whether my Solr parameters, such as the cache settings, are set correctly?
From what I see, we are using the defaults from solrconfig.xml.

I'm on Solr 5.2

Steve


On Thu, Nov 19, 2015 at 11:36 PM, Erick Erickson <er...@gmail.com>
wrote:

> An fq is still a single entry in your filterCache so from that
> perspective it's the same.
>
> And to create that entry, you're still using all the underlying fields
> to search, so they have to be loaded just like they would be in a q
> clause.
>
> But really, the fundamental question here is why your design even has
> 1,500 fields and, more specifically, why you would want to search them
> all at once. From a 10,000 ft. view, that's a very suspect design.
>
> Best,
> Erick
>

Re: Number of fields in qf & fq

Posted by Erick Erickson <er...@gmail.com>.
An fq is still a single entry in your filterCache so from that
perspective it's the same.

And to create that entry, you're still using all the underlying fields
to search, so they have to be loaded just like they would be in a q
clause.
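
As a toy illustration in Python (this is not Solr's filterCache implementation, just a model of the idea, with made-up field names): the cache is keyed by the whole fq, so a filter that ORs 100 fields together still occupies one entry, but the first request has to pay for evaluating all of those fields.

```python
class ToyFilterCache:
    """Toy model: one cached set of document ids per distinct fq string."""

    def __init__(self):
        self.entries = {}
        self.computations = 0

    def get(self, fq, compute):
        # Only a cache miss triggers the (expensive) evaluation of the filter.
        if fq not in self.entries:
            self.computations += 1
            self.entries[fq] = frozenset(compute(fq))
        return self.entries[fq]


def evaluate_filter(fq):
    # Stand-in for searching every field named in the filter.
    return {1, 2, 3}


# Hypothetical filter that ORs 100 fields together.
fq = " OR ".join(f"field_{i}:x" for i in range(100))

cache = ToyFilterCache()
cache.get(fq, evaluate_filter)
cache.get(fq, evaluate_filter)

print(len(cache.entries))   # 1
print(cache.computations)   # 1
```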

But really, the fundamental question here is why your design even has
1,500 fields and, more specifically, why you would want to search them
all at once. From a 10,000 ft. view, that's a very suspect design.

Best,
Erick

On Thu, Nov 19, 2015 at 4:06 PM, Walter Underwood <wu...@wunderwood.org> wrote:
> The implementation for fq has changed from 4.x to 5.x, so I’ll let someone else answer that in detail.
>
> In 4.x, the result of each filter query can be cached. After that, they are quite fast.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>

Re: Number of fields in qf & fq

Posted by Walter Underwood <wu...@wunderwood.org>.
The implementation for fq has changed from 4.x to 5.x, so I’ll let someone else answer that in detail.

In 4.x, the result of each filter query can be cached. After that, they are quite fast.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Nov 19, 2015, at 3:59 PM, Steven White <sw...@gmail.com> wrote:
> 
> Thanks Walter.  I see your point.  Does this apply to fq as well?
> 
> Also, how does one go about debugging performance issues in Solr to find
> out where time is mostly spent?
> 
> Steve
> 


Re: Number of fields in qf & fq

Posted by Steven White <sw...@gmail.com>.
Thanks Walter.  I see your point.  Does this apply to fq as well?

Also, how does one go about debugging performance issues in Solr to find
out where time is mostly spent?

Steve

On Thu, Nov 19, 2015 at 6:54 PM, Walter Underwood <wu...@wunderwood.org>
wrote:

> With one field in qf for a single-term query, Solr is fetching one posting
> list. With 1500 fields, it is fetching 1500 posting lists. It could easily
> be 1500 times slower.
>
> It might be even slower than that, because we can’t guarantee that: a)
> every algorithm in Solr is linear, b) that all those lists will fit in
> memory.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>

Re: Number of fields in qf & fq

Posted by Walter Underwood <wu...@wunderwood.org>.
With one field in qf for a single-term query, Solr is fetching one posting list. With 1500 fields, it is fetching 1500 posting lists. It could easily be 1500 times slower.

It might be even slower than that, because we can’t guarantee that: a) every algorithm in Solr is linear, b) that all those lists will fit in memory.
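
A simplified Python model of the arithmetic above, assuming a dismax-style parser that looks each query term up in every qf field, so each (field, term) pair implies one posting-list lookup. The field names are hypothetical, and the sketch counts lookups only; it says nothing about how long each one takes.

```python
from itertools import product


def expand_clauses(terms, fields):
    """Return one "field:term" clause per field/term pair."""
    return [f"{field}:{term}" for term, field in product(terms, fields)]


# Hypothetical field names, for illustration only.
one_field = ["text"]
many_fields = [f"field_{i}" for i in range(1500)]

print(len(expand_clauses(["solr"], one_field)))             # 1
print(len(expand_clauses(["solr"], many_fields)))           # 1500
print(len(expand_clauses(["solr", "cache"], many_fields)))  # 3000
```

The count grows with fields times terms, so multi-term queries make the gap between 1 field and 1500 fields wider still.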

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Nov 19, 2015, at 3:46 PM, Steven White <sw...@gmail.com> wrote:
> 
> Hi everyone
> 
> What is considered too many fields for qf and fq?  On average I will have
> 1500 fields in qf and 100 in fq (all of which are OR'ed).  For qf, assuming
> I can cut it down to 1 field (I have to check with the design), will I see
> a noticeable performance improvement?  It will take a lot of effort to test
> this, which is why I'm asking first.
> 
> As is, I'm seeing 2-5 sec response times for searches on an index of 1
> million records with a total index size (on disk) of 4 GB.  I gave Solr 2 GB
> of RAM (also tested at 4 GB); in both cases Solr didn't use more than 1 GB.
> 
> Thanks in advance
> 
> Steve