Posted to dev@lucene.apache.org by "Erick Erickson (JIRA)" <ji...@apache.org> on 2017/01/25 20:06:26 UTC

[jira] [Comment Edited] (SOLR-10006) Cannot do a full sync (fetchindex) if the replica can't open a searcher

    [ https://issues.apache.org/jira/browse/SOLR-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15838482#comment-15838482 ] 

Erick Erickson edited comment on SOLR-10006 at 1/25/17 8:05 PM:
----------------------------------------------------------------

Mike:

First of all, thanks for looking. This is the full log file after startup, from a fresh trunk pull this morning. Since it's pretty short, I decided to upload the whole thing.

Here's what I did to make this happen:
1> set up a 2x2 collection
2> indexed a bunch of docs. Stupid-simple indexing; I just wanted to get more than one segment. I'm not sure having more than one segment is actually relevant.
3> shut down a follower
4> removed a few of the segment files: not an entire segment, just 3 files at random from a single segment.
5> removed all the logs from the log directory.
6> tried to start the replica.
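
Step 4 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact procedure used here: the helper name, the `count` and `seed` parameters, and the index path are all hypothetical. It relies only on the Lucene naming convention that per-segment files start with an underscore followed by the segment name.

```python
import random
import re
from collections import defaultdict
from pathlib import Path

# Lucene per-segment files look like _0.si, _0.cfs, _1_Lucene50_0.doc:
# an underscore, the segment name, then either "." or "_".
_SEGMENT_FILE = re.compile(r"^_([0-9a-z]+)[._]")


def remove_random_segment_files(index_dir, count=3, seed=None):
    """Delete `count` random files from one segment (hypothetical helper).

    Returns the sorted names of the deleted files. segments_N and
    write.lock are never touched because they do not match the
    per-segment file pattern.
    """
    rng = random.Random(seed)
    by_segment = defaultdict(list)
    for path in sorted(Path(index_dir).iterdir()):
        match = _SEGMENT_FILE.match(path.name)
        if path.is_file() and match:
            by_segment[match.group(1)].append(path)

    # Only segments with more than `count` files qualify, so part of the
    # segment always survives ("not an entire segment").
    candidates = [name for name, files in sorted(by_segment.items())
                  if len(files) > count]
    if not candidates:
        raise ValueError("no segment has more than %d files" % count)

    segment = rng.choice(candidates)
    victims = rng.sample(by_segment[segment], count)
    for path in victims:
        path.unlink()
    return sorted(p.name for p in victims)
```

Run it only against the index directory of a replica that has already been shut down (step 3), for example `remove_random_segment_files("/path/to/core/data/index")`, where the path is a placeholder.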


> Cannot do a full sync (fetchindex) if the replica can't open a searcher
> -----------------------------------------------------------------------
>
>                 Key: SOLR-10006
>                 URL: https://issues.apache.org/jira/browse/SOLR-10006
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 5.3.1, 6.4
>            Reporter: Erick Erickson
>         Attachments: SOLR-10006.patch, solr.log
>
>
> Doing a full sync or fetchindex requires an open searcher, and if you can't open the searcher, those operations fail.
> For discussion. I've seen a situation in the field where a replica's index became corrupt. When the node was restarted, the replica tried to do a full sync but failed because the core couldn't open a searcher. The replica went into an endless sync/fail/sync cycle.
> I couldn't reproduce that exact scenario, but it's easy enough to get into a similar situation. Create a 2x2 collection and index some docs. Then stop one of the instances, remove a couple of segment files, and restart.
> The replica stays in the "down" state, fine so far.
> Manually issue a fetchindex. That fails because the replica can't open a searcher. Sure, issuing a fetchindex is abusive.... but I think it's the same underlying issue: why should we care about the state of a replica's current index when we're going to completely replace it anyway?
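
For reference, a manual fetchindex like the one described above is a plain HTTP request to the core's replication handler (`command=fetchindex`, optionally with a `masterUrl` parameter for a one-time fetch from a specific source). Below is a minimal sketch that only builds such a URL. The helper name, host names, ports, and core names are hypothetical, and appending `/replication` to the leader's core URL is an assumption based on how `masterUrl` is normally configured.

```python
from urllib.parse import urlencode


def fetchindex_url(replica_core_url, leader_core_url=None):
    """Build a replication-handler URL that asks a core to fetch an index.

    Hypothetical helper: `replica_core_url` is the core that should pull
    the index; `leader_core_url`, if given, is passed as masterUrl.
    """
    query = {"command": "fetchindex"}
    if leader_core_url is not None:
        # Assumption: masterUrl points at the source core's /replication handler.
        query["masterUrl"] = leader_core_url.rstrip("/") + "/replication"
    return replica_core_url.rstrip("/") + "/replication?" + urlencode(query)


if __name__ == "__main__":
    # Core names and ports are placeholders, not taken from the issue.
    print(fetchindex_url("http://localhost:8983/solr/test_shard1_replica2",
                         "http://localhost:7574/solr/test_shard1_replica1"))
```

Issuing a GET against the resulting URL (with curl or a browser) is what triggers the fetch; in the scenario above, that request is the one that fails when the replica cannot open a searcher.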


