Posted to solr-user@lucene.apache.org by adfel70 <ad...@gmail.com> on 2014/05/08 17:00:06 UTC

Replica as a "leader"

Solr & Collection Info:
Solr 4.8, 4 shards, 3 replicas per shard, 30-40 million docs per shard.

Process:
1. Indexing 100-200 docs per second.
2. Running pkill -9 java on 2 replicas (not the leader) in shard 3 (while
indexing).
3. Indexing for 10-20 minutes and doing a hard commit.
4. Running pkill -9 java on the leader and then starting one replica in
shard 3 (while indexing).
5. After 20 minutes, starting another replica in shard 3 while indexing (not
the original leader).

Results:
2. Only the leader is active in shard 3.
3. Thousands of docs were added to the leader in shard 3.
4. After starting the replica, its state was down, and after 10 minutes it
became the leader in cluster state (while still down). Index and search
requests returned: no servers hosting shards.
5. After starting another replica, its state was recovering for 2-3 minutes
and then it became active (not leader in cluster state).
6. Index, commit and search requests are handled by the other replica
(*active status, not leader!!!*).
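
The mismatch in results 4 and 6 (a replica flagged as leader while down, next to an active non-leader) can be checked for mechanically. What follows is only a minimal sketch in Python. It assumes the cluster state has already been fetched and parsed into a dict shaped like the clusterstate.json that Solr 4.x keeps in ZooKeeper (collection -> "shards" -> shard -> "replicas" -> core node -> "state" / "leader"). The function name, the collection name and the core node names are made up for illustration, and fetching the JSON from ZooKeeper is left out.

```python
def shard_leader_problems(cluster_state, collection):
    """Return warnings for shards whose leader flag and replica state disagree.

    `cluster_state` is assumed to be an already-parsed clusterstate.json dict.
    """
    problems = []
    shards = cluster_state[collection]["shards"]
    for shard_name, shard in sorted(shards.items()):
        replicas = shard.get("replicas", {})
        # The leader flag is stored as the string "true" on at most one replica.
        leaders = [n for n, r in replicas.items() if r.get("leader") == "true"]
        active = [n for n, r in replicas.items() if r.get("state") == "active"]
        if not leaders:
            problems.append(f"{shard_name}: no replica is flagged as leader")
        for name in leaders:
            state = replicas[name].get("state")
            if state != "active":
                msg = f"{shard_name}: leader {name} is {state}"
                others = sorted(a for a in active if a != name)
                if others:
                    msg += "; active non-leader replicas: " + ", ".join(others)
                problems.append(msg)
    return problems


# Hypothetical sample mirroring the state described for shard 3.
SAMPLE = {
    "collection1": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"state": "active", "leader": "true"},
                    "core_node2": {"state": "active"},
                }
            },
            "shard3": {
                "replicas": {
                    "core_node7": {"state": "down", "leader": "true"},
                    "core_node8": {"state": "active"},
                    "core_node9": {"state": "down"},
                }
            },
        }
    }
}

if __name__ == "__main__":
    for line in shard_leader_problems(SAMPLE, "collection1"):
        print(line)
```

Run against the sample, this flags shard3 only; a healthy shard such as shard1 produces no warning.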


Expected:
5. To stay in down status.
*6. Not to handle index, commit and search requests - no servers hosting
shards!*

Thanks!




--
View this message in context: http://lucene.472066.n3.nabble.com/Replica-as-a-leader-tp4135077.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Replica as a "leader"

Posted by Erick Erickson <er...@gmail.com>.
bq: Is there a way that solr can recover without losing docs in this scenario?

Not that I know of currently. SolrCloud is designed to _not_ lose documents
as long as all leaders are present. And when a leader goes down, assuming
there's a replica handy, docs shouldn't be lost either. But taking down the
leader, then starting an out-of-date replica up and hoping that Solr has
somehow magically cached all the intervening updates is not a supported
scenario. Perhaps SOLR-5468 will help here, I'm not entirely sure. This
scenario seems out-of-band though.

Best,
Erick

On Sun, May 18, 2014 at 3:12 AM, Anshum Gupta <an...@anshumgupta.net> wrote:
> SOLR-5468 <https://issues.apache.org/jira/browse/SOLR-5468> might be useful
> for you.

Re: Replica as a "leader"

Posted by Anshum Gupta <an...@anshumgupta.net>.
SOLR-5468 <https://issues.apache.org/jira/browse/SOLR-5468> might be useful
for you.


On Sun, May 18, 2014 at 1:54 AM, adfel70 <ad...@gmail.com> wrote:

> *One of the most important requirements in my system is not to lose docs and
> not to retrieve only part of the data at query time.*
>
> I expect the replica to wait until the real leader starts, or at least,
> once the real leader is back, to sync it with the docs indexed on the
> replica and to sync the replica with the docs that were indexed on the
> leader.
>
> Is there a way that Solr can recover without losing docs in this scenario?
>
> Thanks.



-- 

Anshum Gupta
http://www.anshumgupta.net

Re: Replica as a "leader"

Posted by adfel70 <ad...@gmail.com>.
*One of the most important requirements in my system is not to lose docs and
not to retrieve only part of the data at query time.*

I expect the replica to wait until the real leader starts, or at least,
once the real leader is back, to sync it with the docs indexed on the
replica and to sync the replica with the docs that were indexed on the
leader.

Is there a way that Solr can recover without losing docs in this scenario?

Thanks.



--
View this message in context: http://lucene.472066.n3.nabble.com/Replica-as-a-leader-tp4135614p4136729.html

Re: Replica as a "leader"

Posted by Erick Erickson <er...@gmail.com>.
> 1. Indexing 100-200 docs per second.
> 2. Doing Pkill -9 java to 2 replicas (not the leader) in shard 3 (while
> indexing).
> 3. Indexing for 10-20 minutes and doing hard commit.
> 4. Doing Pkill -9 java to the leader and then starting one replica in shard
> 3 (while indexing).

I think you're in uncharted territory. By only having the leader
running, indexing docs to it, then killing it, there's no way for one
of the restarted followers to know what docs were indexed. Eventually
the follower will become the leader and the docs are just lost.
Updates are NOT stored on ZK for instance.
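
To make the divergence concrete, here is a toy model of the four steps quoted above. It is not Solr's recovery logic, just sets of doc ids under the assumption that an update only reaches the nodes that are up when it is sent. The Replica class, the helper names and the doc counts are all invented for the illustration.

```python
class Replica:
    """Toy stand-in for one replica of a shard (hypothetical, not a Solr class)."""

    def __init__(self, name):
        self.name = name
        self.docs = set()   # ids of the docs this node has indexed
        self.up = True


def index(shard, doc_ids):
    """Deliver each doc only to the replicas that are up right now."""
    for doc_id in doc_ids:
        for replica in shard:
            if replica.up:
                replica.docs.add(doc_id)


def simulate():
    old_leader = Replica("old_leader")
    follower_a = Replica("follower_a")
    follower_b = Replica("follower_b")
    shard = [old_leader, follower_a, follower_b]

    index(shard, range(0, 100))        # step 1: everything is up
    follower_a.up = False              # step 2: kill both followers
    follower_b.up = False
    index(shard, range(100, 200))      # step 3: only the leader sees these
    old_leader.up = False              # step 4: kill the leader ...
    follower_a.up = True               # ... and start a stale follower
    index(shard, range(200, 250))      # it ends up taking updates as leader

    return {
        "only_on_old_leader": len(old_leader.docs - follower_a.docs),
        "only_on_new_leader": len(follower_a.docs - old_leader.docs),
    }


if __name__ == "__main__":
    print(simulate())
```

In this model neither node holds the full set once the stale follower starts taking updates, so whichever one is treated as the source of truth, the other one's docs have nowhere to come from.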

Why do you expect the machines to "stay in down status"? SolrCloud is
doing the best it can. How do you expect this scenario to recover?

FWIW,
Erick
