Posted to solr-user@lucene.apache.org by Sai Sreenivas K <sa...@myntra.com> on 2015/05/05 14:42:23 UTC

Solr cloud clusterstate.json update query?

Could you clarify the following questions?
1. Is there a way to avoid all the nodes simultaneously going into
recovery when bulk indexing happens? Is there an API to disable
replication on one node for a while?

2. We recently changed the host name on our nodes in solr.xml, but the old
host entries still exist in clusterstate.json, marked as active, even
though live_nodes has the correct information. Who updates clusterstate.json
when a node goes down ungracefully, without publishing its down state?

Thanks,
Sai Sreenivas K

Re: Solr cloud clusterstate.json update query?

Posted by Erick Erickson <er...@gmail.com>.
Gopal:

Did you see my previous answer?

Best,
Erick

Re: Solr cloud clusterstate.json update query?

Posted by Gopal Jee <zg...@gmail.com>.
About <2>: the entries under live_nodes in ZooKeeper are ephemeral nodes
(see the ZooKeeper docs on ephemeral nodes), so once the connection from
the Solr zkClient to ZooKeeper is lost, they disappear automatically.
AFAIK, clusterstate.json is updated by the Overseer based on messages that
the Solr zkClients publish to a queue in ZooKeeper. In case a Solr node
dies ungracefully, I am not sure how that event gets reflected in
clusterstate.json.
Can someone shed some light on ungraceful Solr shutdown and the consequent
status update in the cluster state? I guess there must be some way, because
all nodes in a cluster decide the cluster state based on the watched
clusterstate.json node; they will not be watching live_nodes to update
their state.

Gopal
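
To make the ephemeral-node behavior above concrete, here is a minimal
sketch using the plain ZooKeeper Java client. The connect string and the
znode path are placeholder values, and this only demonstrates the
mechanism; it is not how Solr itself registers its live_nodes entries:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class EphemeralDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connect string; point it at your own ensemble.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> {});

        // An ephemeral znode lives exactly as long as the session that
        // created it; this is the same mechanism behind a Solr node's
        // /live_nodes entry.
        String path = zk.create("/demo-live-node", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        System.out.println("Created ephemeral node: " + path);

        // Closing the session (or losing it, e.g. to a GC pause longer
        // than the session timeout) makes ZooKeeper delete the node.
        zk.close();
        // Another client calling exists("/demo-live-node", false) would
        // now get null.
    }
}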

Re: Solr cloud clusterstate.json update query?

Posted by Erick Erickson <er...@gmail.com>.
About <1>: this shouldn't be happening in the first place, so I'd look for
the root cause before reaching for a way to disable replication. The most
common reason is a short ZooKeeper client timeout combined with
stop-the-world garbage collection pauses on the replicas that exceed that
timeout. So the first thing to do is to see whether that's what's
happening. Here are a couple of good places to start:

http://lucidworks.com/blog/garbage-collection-bootcamp-1-0/
http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr
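
As a quick first check, you can also watch accumulated stop-the-world
collection time from inside the JVM with the standard GarbageCollectorMXBean
API. A hedged sketch follows; the 5-second polling interval and 2-second
warning threshold are arbitrary choices, and in practice you would run
something like this inside the Solr JVM, or simply enable GC logging as the
links above describe:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcPauseWatcher {
    // Sum accumulated collection time across all collectors
    // (young and old generation) in this JVM, in milliseconds.
    private static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc :
                ManagementFactory.getGarbageCollectorMXBeans()) {
            total += gc.getCollectionTime();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long last = totalGcMillis();
        while (true) {
            Thread.sleep(5000);
            long now = totalGcMillis();
            long deltaMs = now - last;
            // More than 2 s of GC inside a 5 s window is the kind of
            // pause that can blow through a short zkClientTimeout.
            if (deltaMs > 2000) {
                System.err.println("WARN: " + deltaMs
                        + " ms of GC in the last 5 s");
            }
            last = now;
        }
    }
}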

About <2>: a partial answer is that ZooKeeper does a keep-alive type thing,
and if a Solr node it knows about stops replying, that node gets marked as
down.

Best,
Erick
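
One practical consequence of the above: a replica marked "active" in
clusterstate.json only counts as usable while its node also appears under
live_nodes, so readers have to intersect the two. Here is a hedged sketch
with SolrJ from the Solr 5.x line; the ZooKeeper address and the collection
name "mycollection" are placeholders. It flags exactly the stale-host
entries described in the original question:

import java.util.Set;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.cloud.ClusterState;
import org.apache.solr.common.cloud.Replica;
import org.apache.solr.common.cloud.Slice;

public class StaleReplicaCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper address and collection name.
        try (CloudSolrClient client = new CloudSolrClient("localhost:2181")) {
            client.connect();
            ClusterState state = client.getZkStateReader().getClusterState();
            Set<String> liveNodes = state.getLiveNodes();

            for (Slice slice : state.getCollection("mycollection").getSlices()) {
                for (Replica replica : slice.getReplicas()) {
                    boolean markedActive =
                            "active".equals(replica.getStr("state"));
                    boolean nodeLive =
                            liveNodes.contains(replica.getNodeName());
                    // An entry marked active whose node is absent from
                    // live_nodes is a stale record: the node went away
                    // without its state ever being updated.
                    if (markedActive && !nodeLive) {
                        System.out.println("Stale entry: "
                                + replica.getName() + " on "
                                + replica.getNodeName());
                    }
                }
            }
        }
    }
}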
