Posted to solr-user@lucene.apache.org by Dennis Schafroth <de...@indexdata.com> on 2012/03/29 13:30:45 UTC

Slow first searcher with facet on bibliographic data in Master - Slave

Hi,

I am running indexing and faceted searching on bibliographic data, which is known not to perform too well because of the high number of unique facet values. Actually, it is just the first search that is horribly slow: 200+ seconds. After that I get okay times (about 1 second), at least with the few users we have now.

The current index holds 54 million records with approx. 10 million unique authors. The facet fields (… _exact) use the string type.

I had hoped that splitting into a master (indexing) and a slave (searching) would solve the issue, but I still see it on the slave, so I guess I must have misunderstood (or perhaps misconfigured) something.

I had thought that the slave would not switch to the new index until auto-warming had completed. Is such behavior possible?

I guess an alternative solution would be to run multiple slaves and take a slave offline while it replicates, but if it can be done more simply (and using 1/3 less space) that would be great. Then again, we might need multiple slaves anyway once the request load grows.

Attached are the configuration files.

Let me know if there is missing information.

cheers,
:-Dennis Schafroth

Re: Slow first searcher with facet on bibliographic data in Master - Slave

Posted by Dennis Schafroth <de...@indexdata.com>.
I was wrong! It does seem to work! 

Thanks a bunch! 

cheers,
:-Dennis

On Mar 29, 2012, at 15:52 , fbrisbart wrote:

> I had the same issue months ago.
> 'newSearcher' fixed the problem for me.
> I also remember that I had to upgrade solr (3.1) because it didn't work
> with release 1.4 
> But, I suppose you already have a solr 3.x or more.
> So I'm afraid I can't help you more :o(
> 
> Franck


Re: Slow first searcher with facet on bibliographic data in Master - Slave

Posted by fbrisbart <fb...@bestofmedia.com>.
I had the same issue months ago.
Adding the query to 'newSearcher' fixed the problem for me.
I also remember that I had to upgrade Solr (to 3.1) because it didn't work
with release 1.4.
But I suppose you are already on Solr 3.x or later,
so I'm afraid I can't help you more :o(

Franck


On Thursday, 29 March 2012 at 15:41 +0200, Dennis Schafroth wrote:
> On Mar 29, 2012, at 14:49 , fbrisbart wrote:
> > You may also choose to use the facet.method=enum  The performance is
> > globally worse
> 
> You say. This means that every search with facets is now 20 seconds instead of 2. Then I prefer the field cache with one bad first search. 
> 
> > than the 'fc' method, but you will avoid the very slow
> > first request. Btw, it's far better to use the default 'enum' facet
> > method.
I meant "the default 'fc' method" of course :o)




Re: Slow first searcher with facet on bibliographic data in Master - Slave

Posted by Dennis Schafroth <de...@indexdata.com>.
On Mar 29, 2012, at 14:49 , fbrisbart wrote:

> Arf, I didn't see your attached tgz.
> 
> In your slave solrconfig.xml, only the 'firstSearcher' contains the
> query. Add it also in the 'newSearcher', so that the new search
> instances will wait also after a new index is replicated.

I did that now, but I believe my case is mostly a first searcher issue. Anyway, it didn't seem to change anything.

> 
> The first request is long because the default faceting method uses the
> FieldCache for your facet fields.

Yup, I know.

> You may also choose to use the facet.method=enum  The performance is
> globally worse

As you say: with facet.method=enum, every search with facets now takes 20 seconds instead of 2. So I prefer the field cache with one bad first search.

> than the 'fc' method, but you will avoid the very slow
> first request. Btw, it's far better to use the default 'enum' facet
> method.

Thanks for the input so far. 



Re: Slow first searcher with facet on bibliographic data in Master - Slave

Posted by fbrisbart <fb...@bestofmedia.com>.
Arf, I didn't see your attached tgz.

In your slave solrconfig.xml, only the 'firstSearcher' listener contains the
query. Add it to the 'newSearcher' listener as well, so that the searchers
opened after each replication also run it before they start serving queries.
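
A minimal sketch of what that addition might look like, assuming the listener sits next to the existing 'firstSearcher' one in the <query> section of the slave's solrconfig.xml. 'your_facet_field' is only the placeholder used elsewhere in this thread, not a real field name from the attached configuration:

```xml
<!-- Sketch only: warming query for searchers opened after replication.
     'your_facet_field' is a placeholder; substitute the real facet field(s). -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="facet">true</str>
      <str name="facet.field">your_facet_field</str>
    </lst>
  </arr>
</listener>
```

The query list is the same as for 'firstSearcher'; only the event name differs.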

The first request is slow because the default faceting method has to populate the
FieldCache for your facet fields.
You may also choose to use facet.method=enum. Its performance is
globally worse than the 'fc' method, but you will avoid the very slow
first request. Btw, it's far better to use the default 'fc' facet
method.
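
For reference, a sketch of how the faceting method could be switched for one field only, assuming a search handler named "/select" (the handler name and 'your_facet_field' are placeholders, not taken from the attached configuration). The same parameter can also be passed on a single request as facet.method=enum:

```xml
<!-- Sketch only: handler name and field name are assumptions. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- per-field override; a plain "facet.method" entry would apply to all facet fields -->
    <str name="f.your_facet_field.facet.method">enum</str>
  </lst>
</requestHandler>
```

Whether this helps depends heavily on the number of unique values in the field, so it is worth timing both methods before settling on one.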

Hope this helps,
Franck









Re: Slow first searcher with facet on bibliographic data in Master - Slave

Posted by Chris Hostetter <ho...@fucit.org>.
: I do have a firstSearcher, but currently coldSearcher is set to true. 
: But doesn't this just mean that that any searches will block while the 
: first searcher is running? This is how the comment describes first 
: searcher. It would almost give the same effect; that some searches take 
: a long time.
: 
: What I am looking for is after receiving replicated data, do first 
: searcher and then switch to new index.

"firstSearcher" is literally the very first searcher used when the 
SolrCore is loaded -- it is *NOT* the first searcher after replication, 
those are "newSearcher" instances.


-Hoss

Re: Slow first searcher with facet on bibliographic data in Master - Slave

Posted by Dennis Schafroth <de...@indexdata.com>.
I do have a firstSearcher, but currently coldSearcher is set to true. But doesn't this just mean that any searches will block while the first searcher is running? That is how the comment in the config describes the first searcher. It would give almost the same effect: some searches take a long time.

What I am looking for is this: after receiving replicated data, run the first searcher and only then switch to the new index.

I will try with coldSearcher set to false, but I actually think I have already tried this.
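
For reference, a sketch of the setting in question, assuming 'coldSearcher' above refers to the useColdSearcher element in the <query> section of solrconfig.xml. The value 2 for maxWarmingSearchers is only an illustrative choice:

```xml
<query>
  <!-- false: requests arriving while no searcher is registered block
       until the first searcher has finished warming.
       true: they are served immediately by the still-cold searcher. -->
  <useColdSearcher>false</useColdSearcher>

  <!-- Illustrative value: upper bound on searchers warming in the background at once. -->
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
```

This setting only matters while there is no registered searcher yet, that is, at core load.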

cheers, 
:-Dennis



Re: Slow first searcher with facet on bibliographic data in Master - Slave

Posted by fbrisbart <fb...@bestofmedia.com>.
If you add your query to the firstSearcher and/or newSearcher event
listeners in the slave's
'solrconfig.xml' ( http://wiki.apache.org/solr/SolrCaching#newSearcher_and_firstSearcher_Event_Listeners ),
each new searcher instance will wait for it to finish before accepting queries.

Example to load the FieldCache for the 'your_facet_field' field:
...
    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">*:*</str>
          <str name="facet">true</str>
          <str name="facet.field">your_facet_field</str>
        </lst>
      </arr>
    </listener>
...


Franck
