Posted to solr-user@lucene.apache.org by sunnyfr <jo...@gmail.com> on 2008/10/06 12:07:30 UTC

Re: Availability Issues

Hi Matthew,

What do you mean by "post your updates"?
Does that mean you just copy the data directory (scp, run from a cron job)
without using automatic replication?
I ask because ever since I turned on autoCommit with snapshooter, everything
has slowed down and gotten a bit messed up.

Have you had the same problem?
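
To make the question concrete: one reading of "post your updates to every
machine" is that the indexing client itself sends each update request to
every Solr server, instead of copying index files between them. Below is a
minimal sketch of that reading in Python. The host names, the port, and the
helper function names are assumptions made up for illustration; nothing in
this thread specifies them. The /solr/update path and the <add> XML message
are the standard Solr update interface.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Hypothetical host list: the thread does not name any machines or ports.
SOLR_HOSTS = ["solr1.example.com:8080", "solr2.example.com:8080"]


def build_add_xml(doc):
    """Render one document (a dict of field name -> value) as a Solr <add> message."""
    fields = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
        for name, value in doc.items()
    )
    return "<add><doc>%s</doc></add>" % fields


def build_requests(xml_body, hosts=SOLR_HOSTS):
    """Build one POST request per host, all carrying the same update body."""
    return [
        urllib.request.Request(
            "http://%s/solr/update" % host,
            data=xml_body.encode("utf-8"),
            headers={"Content-Type": "text/xml; charset=utf-8"},
            method="POST",
        )
        for host in hosts
    ]


def post_to_all(xml_body, hosts=SOLR_HOSTS, timeout=10):
    """Send the update to every host; return {host: HTTP status or error text}."""
    results = {}
    for host, req in zip(hosts, build_requests(xml_body, hosts)):
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[host] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[host] = str(exc)
    return results
```

With this approach each machine would also need its own <commit/> message
before the new documents become searchable, and a failed post to one host
leaves that host's index behind the others, so the caller would have to
check the returned statuses and retry.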
Thanks a lot,


Matthew Runo wrote:
> 
> The way I'd do it would be to buy more servers, set up Tomcat on  
> each, and get SOLR replicating from your current machine to the  
> others. Then, throw them all behind a load balancer, and there you go.
> 
> You could also post your updates to every machine. Then you don't  
> need to worry about getting replication running.
> 
> +--------------------------------------------------------+
>   | Matthew Runo
>   | Zappos Development
>   | mruno@zappos.com
>   | 702-943-7833
> +--------------------------------------------------------+
> 
> 
> On Oct 9, 2007, at 7:12 AM, David Whalen wrote:
> 
>> All:
>>
>> How can I break up my install onto more than one box?  We've
>> hit a learning curve here and we don't understand how best to
>> proceed.  Right now we have everything crammed onto one box
>> because we don't know any better.
>>
>> So, how would you build it if you could?  Here are the specs:
>>
>> a) the index needs to hold at least 25 million articles
>> b) the index is constantly updated at a rate of 10,000 articles
>> per minute
>> c) we need to have faceted queries
>>
>> Again, real-world experience is preferred here over book knowledge.
>> We've tried to read the docs and it's only made us more confused.
>>
>> TIA
>>
>> Dave W
>>
>>
>>> -----Original Message-----
>>> From: Yonik Seeley [mailto:yonik@apache.org]
>>> Sent: Monday, October 08, 2007 3:42 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Availability Issues
>>>
>>> On 10/8/07, David Whalen <dw...@enr-corp.com> wrote:
>>>>> Do you see any requests that took a really long time to finish?
>>>>
>>>> The requests that take a long time to finish are just simple queries.
>>>> And the same queries run at a later time come back much faster.
>>>>
>>>> Our logs contain 99% inserts and 1% queries.  We are constantly adding
>>>> documents to the index at a rate of 10,000 per minute, so the logs
>>>> show mostly that.
>>>
>>> Oh, so you are using the same boxes for updating and querying?
>>> When you insert, are you using multiple threads?  If so, how many?
>>>
>>> What is the full URL of those slow query requests?
>>> Do the slow requests start after a commit?
>>>
>>>>> Start with the thread dump.
>>>>> I bet it's multiple queries piling up around some synchronization
>>>>> points in lucene (sometimes caused by multiple threads generating
>>>>> the same big filter that isn't yet cached).
>>>>
>>>> What would be my next steps after that?  I'm not sure I'd understand
>>>> enough from the dump to make heads-or-tails of it.  Can I share that
>>>> here?
>>>
>>> Yes, post it here.  Most likely a majority of the threads
>>> will be blocked somewhere deep in lucene code, and you will
>>> probably need help from people here to figure it out.
>>>
>>> -Yonik
>>>
>>>
>>
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Availability-Issues-tp13102075p19835109.html
Sent from the Solr - User mailing list archive at Nabble.com.