Posted to solr-user@lucene.apache.org by Jay Hill <ja...@gmail.com> on 2009/11/03 00:50:30 UTC

Re: CPU utilization and query time high on Solr slave when snapshot install

So assuming you set up a few sample sort queries to run in the firstSearcher
config, and query volume was low enough during those ten minutes that there
were no evictions before a new Searcher was loaded, would the queries run by
the firstSearcher be passed along to the cache for the next Searcher as part
of the autowarm? If so, it seems like you might want to load a few sort
queries for the firstSearcher, but might not need any included in the
newSearcher?
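
For concreteness, a minimal sketch of the kind of firstSearcher listener I
mean, in solrconfig.xml (the "price" field is just a placeholder for whatever
fields you actually sort on):

  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <!-- warm the field cache used for sorting -->
      <lst>
        <str name="q">*:*</str>
        <str name="sort">price asc</str>
        <str name="rows">10</str>
      </lst>
    </arr>
  </listener>

with the newSearcher listener left empty (or omitted) if the autowarm really
does carry those entries over.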

-Jay


On Mon, Nov 2, 2009 at 4:26 PM, Mark Miller <ma...@gmail.com> wrote:

> Hmm... I think you have to set up warming queries yourself, and that
> autowarm just copies entries from the old cache to the new cache rather than
> issuing queries - the value is how many entries it will copy. Though that's
> still going to take CPU and time.
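>
> As an illustration, something like this in solrconfig.xml (the sizes are
> made up) - autowarmCount is the number of entries carried over into the
> new cache:
>
>   <queryResultCache class="solr.LRUCache" size="512"
>                     initialSize="512" autowarmCount="128"/>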
>
> - Mark
>
> http://www.lucidimagination.com (mobile)
>
>
> On Nov 2, 2009, at 12:47 PM, Walter Underwood <wu...@wunderwood.org>
> wrote:
>
>> If you are going to pull a new index every 10 minutes, try turning off
>> cache autowarming.
>>
>> Your caches are never more than 10 minutes old, so spending a minute
>> warming each new cache is a waste of CPU. Autowarm submits queries to the
>> new Searcher before putting it in service. This will create a burst of query
>> load on the new Searcher, often keeping one CPU pretty busy for several
>> seconds.
>>
>> In solrconfig.xml, set autowarmCount to 0.
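>>
>> For example (the cache classes and sizes here are only illustrative; the
>> point is the autowarmCount="0"):
>>
>>   <filterCache class="solr.LRUCache" size="512"
>>                initialSize="512" autowarmCount="0"/>
>>   <queryResultCache class="solr.LRUCache" size="512"
>>                     initialSize="512" autowarmCount="0"/>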
>>
>> Also, if you want the slaves to always have an optimized index, create the
>> snapshot only in post-optimize. If you create snapshots in both post-commit
>> and post-optimize, you are creating a non-optimized index (post-commit),
>> then replacing it with an optimized one a few minutes later. A slave might
>> get a non-optimized index one time, then an optimized one the next.
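>>
>> With the 1.3 collection-distribution scripts, that usually means keeping
>> only the postOptimize listener and dropping the postCommit one - roughly
>> like this (the dir value is whatever your install actually uses):
>>
>>   <listener event="postOptimize" class="solr.RunExecutableListener">
>>     <str name="exe">snapshooter</str>
>>     <str name="dir">solr/bin</str>
>>     <bool name="wait">true</bool>
>>   </listener>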
>>
>> wunder
>>
>> On Nov 2, 2009, at 1:45 AM, bikumar@sapient.com wrote:
>>
>>> Hi Solr Gurus,
>>>
>>> We have Solr in a 1 master, 2 slave configuration. A snapshot is created
>>> post-commit and post-optimize. We have autocommit after 50 documents or 5
>>> minutes, and the snapshot puller runs as a cron job every 10 minutes. What
>>> we have observed is that whenever a snapshot is installed on a slave, the
>>> solrj client used to query that slave times out and there is high CPU
>>> usage/load average on the slave server. If we stop the snapshot puller, the
>>> slaves work with no issues. The system has been running for two months, and
>>> this issue has started to occur only now, when load on the website is
>>> increasing.
>>>
>>> Following are some details:
>>>
>>> Solr Details:
>>> apache-solr Version: 1.3.0
>>> Lucene - 2.4-dev
>>>
>>> Master/Slave configurations:
>>>
>>> Master:
>>> - for indexing, data is sent to the Solr server via HTTP requests
>>> - the autocommit feature is enabled for 50 docs and 5 minutes (see the
>>> config sketch after this list)
>>> - caching params are disabled for this server
>>> - a mergeFactor of 10 is set
>>> - we were running the optimize script every 2 hours, but have now reduced
>>> that to twice a day; the issue still persists
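>>>
>>> Roughly, the relevant parts of the master solrconfig.xml look like this
>>> (only the settings mentioned above are shown):
>>>
>>>   <updateHandler class="solr.DirectUpdateHandler2">
>>>     <autoCommit>
>>>       <maxDocs>50</maxDocs>      <!-- commit after 50 documents ... -->
>>>       <maxTime>300000</maxTime>  <!-- ... or after 5 minutes (in ms) -->
>>>     </autoCommit>
>>>   </updateHandler>
>>>
>>>   <mainIndex>
>>>     <mergeFactor>10</mergeFactor>
>>>   </mainIndex>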
>>>
>>> Slave1/Slave2:
>>> - standard requestHandler is being used
>>> - default values of caching are set
>>>
>>> Machine Specifications:
>>>
>>> Master:
>>> - 4GB RAM
>>> - 1GB JVM Heap memory is allocated to Solr
>>>
>>> Slave1/Slave2:
>>> - 4GB RAM
>>> - 2GB JVM Heap memory is allocated to Solr
>>>
>>> Master and Slave1 (solr1) are on a single box and Slave2 (solr2) is on a
>>> different box. We use HAProxy to load balance query requests between the 2
>>> slaves. The master is only used for indexing.
>>>
>>> Please let us know if somebody has ever faced a similar kind of issue or
>>> has some insight into it, as we are literally stuck at the moment with a
>>> very unstable production environment.
>>>
>>> As a workaround, we have started running optimize on the master every 7
>>> minutes. This seems to have reduced the severity of the problem, but the
>>> issue still occurs every 2 days now. Please suggest what could be the root
>>> cause of this.
>>>
>>> Thanks,
>>> Bipul
>>>
>>