Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2012/05/21 23:31:41 UTC

[jira] [Updated] (SOLR-3469) recovery can incorrectly succeed

     [ https://issues.apache.org/jira/browse/SOLR-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-3469:
-------------------------------

    Attachment: SOLR-3469.patch

Here's a draft patch that doesn't quite work yet. The direction I'm going is to flag tlog entries that are added while buffering. On startup, we look at (and keep track of) the latest entry. If its flag is set, we don't try peersync when recovering.
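
To make the flag idea concrete, here is a minimal Python sketch. It is not taken from the attached patch and does not use Solr's actual classes: FLAG_BUFFERED, TlogEntry, TransactionLog, and may_try_peersync are all hypothetical names, and allowing peersync on an empty log is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical flag bit; the name and value are illustrative, not Solr's.
FLAG_BUFFERED = 0x1


@dataclass
class TlogEntry:
    version: int
    flags: int = 0


class TransactionLog:
    """Toy in-memory stand-in for an update log (tlog)."""

    def __init__(self) -> None:
        self.entries: List[TlogEntry] = []

    def add(self, version: int, buffering: bool) -> None:
        # Mark the entry if it was written while the node was buffering updates.
        flags = FLAG_BUFFERED if buffering else 0
        self.entries.append(TlogEntry(version, flags))

    def latest(self) -> Optional[TlogEntry]:
        return self.entries[-1] if self.entries else None


def may_try_peersync(tlog: TransactionLog) -> bool:
    """On startup, inspect the latest entry; skip peersync if it was buffered.

    Assumption: an empty log does not block peersync.
    """
    last = tlog.latest()
    return last is None or not (last.flags & FLAG_BUFFERED)
```

In this sketch, a node whose latest tlog entry carries the buffered flag would go straight to full replication instead of attempting peersync.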
                
> recovery can incorrectly succeed
> --------------------------------
>
>                 Key: SOLR-3469
>                 URL: https://issues.apache.org/jira/browse/SOLR-3469
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Yonik Seeley
>         Attachments: SOLR-3469.patch
>
>
> Hypothetical scenario:
>  - node comes up and needs to recover
>  - node starts buffering updates and replicating index
>  - node receives and buffers 1000 updates and dies before replication finishes
>  - node comes up, replays tlog
>  - peersync checks last 100 updates, they match, and node goes into "active" state (without having ever finished the index replication)
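
The scenario quoted above can be reproduced in a toy Python model. This is only a sketch under stated assumptions: the function names are hypothetical, the comparison is a simplified stand-in for what peersync actually does, and the 5000 older documents on the leader are an invented figure. Only the 1000 buffered updates and the window of 100 come from the issue description.

```python
from typing import Dict, List, Set

PEERSYNC_WINDOW = 100  # number of recent updates compared, per the scenario


def peersync_matches(local_versions: List[int], leader_versions: List[int],
                     window: int = PEERSYNC_WINDOW) -> bool:
    """Toy check: do the most recent `window` update versions agree?"""
    return local_versions[-window:] == leader_versions[-window:]


def simulate() -> Dict[str, object]:
    # Assumption: the leader holds 5000 older docs plus 1000 recent updates.
    leader_versions = list(range(1, 6001))
    leader_index: Set[int] = set(leader_versions)

    # The recovering node died before index replication finished, so the
    # older docs are missing; the 1000 buffered updates survive in its tlog.
    node_index: Set[int] = set()
    node_tlog = list(range(5001, 6001))

    # After restart, the node replays its tlog into the incomplete index.
    node_index.update(node_tlog)

    return {
        "peersync_ok": peersync_matches(node_tlog, leader_versions),
        "missing_docs": len(leader_index - node_index),
    }
```

In this model the recent-updates check passes even though the node's index is still missing every document that replication never delivered, which is the incorrect "active" transition described in the issue.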
