Posted to dev@lucene.apache.org by "James Strassburg (JIRA)" <ji...@apache.org> on 2016/11/04 17:30:58 UTC

[jira] [Commented] (SOLR-8641) Core Deleted After Failed Index Fetch When Replication Disabled

    [ https://issues.apache.org/jira/browse/SOLR-8641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15637085#comment-15637085 ] 

James Strassburg commented on SOLR-8641:
----------------------------------------

I agree that this is a bug. We recently hit this issue too. We disable replication on our replication master while we reindex, then enable it afterwards, so that a partial index doesn't replicate. When communication with the master blipped, we got the 'master is not available. Index fetch failed' message, and then the index was deleted. From looking at the mentioned code, the issue seems to stem from the dual use of indexversion: a value of zero represents both an actual index version of zero and disabled replication. A fix might be to update the replication handler to return an enabled boolean status in the indexversion response, or to add a separate command that reports whether replication is enabled.
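
To make the ambiguity concrete, here is a minimal sketch of the slave-side decision, under stated assumptions. It is not Solr's actual code: the class ReplicationStatusSketch, the VersionResponse type, the Action enum and both decide methods are hypothetical names invented for illustration, and the legacy branch is only a simplified reading of the behaviour described in this issue (a forced replication against a master reporting version zero clears the local index).

```java
public class ReplicationStatusSketch {

    /** What the slave might do after polling the master. */
    public enum Action { SKIP, CLEAR_LOCAL_INDEX, FETCH }

    /** Hypothetical indexversion response with an explicit flag (an assumption, not Solr's API). */
    public static final class VersionResponse {
        public final long indexVersion;
        public final boolean replicationEnabled;

        public VersionResponse(long indexVersion, boolean replicationEnabled) {
            this.indexVersion = indexVersion;
            this.replicationEnabled = replicationEnabled;
        }
    }

    /** Simplified model of the current behaviour: only the version number is available. */
    public static Action decideLegacy(long masterVersion, long localVersion, boolean forceReplication) {
        if (masterVersion == 0L) {
            // Zero is ambiguous: the master may be empty, or replication may be disabled.
            // The slave cannot tell the two apart, so a forced replication clears its index.
            return (forceReplication && localVersion != 0L) ? Action.CLEAR_LOCAL_INDEX : Action.SKIP;
        }
        return (masterVersion == localVersion && !forceReplication) ? Action.SKIP : Action.FETCH;
    }

    /** Proposed behaviour: check the explicit flag before interpreting the version. */
    public static Action decideWithFlag(VersionResponse response, long localVersion, boolean forceReplication) {
        if (!response.replicationEnabled) {
            // Replication is disabled on the master: leave the local index untouched.
            return Action.SKIP;
        }
        return decideLegacy(response.indexVersion, localVersion, forceReplication);
    }

    public static void main(String[] args) {
        long localVersion = 1478280000000L; // arbitrary non-zero local index version
        boolean forceReplication = true;    // set after a failed fetch, per the issue description

        // Version only: disabled replication looks like an empty master.
        System.out.println(decideLegacy(0L, localVersion, forceReplication));
        // With the flag: disabled replication is recognised and the index is kept.
        System.out.println(decideWithFlag(new VersionResponse(0L, false), localVersion, forceReplication));
        // With the flag: a genuinely empty master still behaves as before.
        System.out.println(decideWithFlag(new VersionResponse(0L, true), localVersion, forceReplication));
    }
}
```

The point of the sketch is only that the flag must be checked before the version is interpreted; a separate "is replication enabled" command would let the slave make the same check with an extra round trip.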

> Core Deleted After Failed Index Fetch When Replication Disabled
> ---------------------------------------------------------------
>
>                 Key: SOLR-8641
>                 URL: https://issues.apache.org/jira/browse/SOLR-8641
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java)
>    Affects Versions: 5.1
>         Environment: Windows Server 2008 R2
>            Reporter: phil watson
>
> I am getting occasional index fetch failures (due to server overloading, I suspect). They appear in my log file as:
> Master at: http://MOTOSOLR01:9000/solr/ShowcaseData is not available. Index fetch failed. Exception: Server refused connection at: http://MOTOSOLR01:9000/solr/ShowcaseData
> At the point of the failure, the master version of the core has replication disabled (but still contains data), and it appears that on the next replication cycle the slave version of the core is emptied. Once replication is enabled, everything works as expected.
> Having looked at the source code, I suspect that lines 311-327 in indexfetcher.java are at fault. What I think is happening is that the failed IndexFetch sets forcereplication to true, and this causes a forced delete of the core before the core is reloaded (which then doesn't happen, as replication is disabled).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
