Posted to dev@zookeeper.apache.org by "jiangwen wei (JIRA)" <ji...@apache.org> on 2011/03/27 12:12:05 UTC

[jira] [Updated] (ZOOKEEPER-1032) speed up recovery from leader failure

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jiangwen wei updated ZOOKEEPER-1032:
------------------------------------

    Description: 
When the number of nodes is large, recovery from leader failure can take a long time.
Several points could be improved:

1. A follower should take its snapshot asynchronously once it is up to date.

2. Currently the Leader/Follower clears the DataTree on a leader failure and then restores it from a snapshot and the transaction logs. The DataTree should not be cleared; it should only be brought up to date from the transaction logs.

3. FileTxnLog should keep recent transactions in memory, so that when the DataTree is not far behind the transaction log, the in-memory transactions can be used to restore the DataTree.
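
Point 3 could be sketched roughly as follows. This is only an illustration of the idea, not ZooKeeper code: the names RecentTxnBuffer, Txn, append and txnsAfter are hypothetical, the payload is a plain String rather than a real transaction record, and the buffer conservatively refuses to serve a DataTree that is older than its oldest entry or newer than its newest one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch: bounded in-memory window of the most recent transactions. */
final class RecentTxnBuffer {
    /** One logged transaction: its zxid and an opaque payload (assumed to be a String here). */
    static final class Txn {
        final long zxid;
        final String payload;

        Txn(long zxid, String payload) {
            this.zxid = zxid;
            this.payload = payload;
        }
    }

    private final int capacity;
    private final Deque<Txn> window = new ArrayDeque<>();

    RecentTxnBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records a transaction, evicting the oldest one once the window is full. */
    synchronized void append(long zxid, String payload) {
        if (!window.isEmpty() && zxid <= window.peekLast().zxid) {
            throw new IllegalArgumentException("zxids must be strictly increasing");
        }
        if (window.size() == capacity) {
            window.removeFirst();
        }
        window.addLast(new Txn(zxid, payload));
    }

    /**
     * Returns the transactions with zxid greater than lastAppliedZxid, or an
     * empty Optional when the window cannot safely bring the tree up to date
     * (the caller would then fall back to a snapshot plus the on-disk log).
     */
    synchronized Optional<List<Txn>> txnsAfter(long lastAppliedZxid) {
        if (window.isEmpty()) {
            return Optional.empty();
        }
        long oldest = window.peekFirst().zxid;
        long newest = window.peekLast().zxid;
        // Conservative check: the tree must already contain the oldest buffered
        // transaction and must not be ahead of the newest one.
        if (lastAppliedZxid < oldest || lastAppliedZxid > newest) {
            return Optional.empty();
        }
        List<Txn> missing = new ArrayList<>();
        for (Txn t : window) {
            if (t.zxid > lastAppliedZxid) {
                missing.add(t);
            }
        }
        return Optional.of(missing);
    }
}
```

With a buffer like this, a server whose DataTree is only slightly behind could replay the returned list directly, and an empty Optional would signal that the existing snapshot-based recovery path is still needed.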


  was:
When the number of nodes is large, recovery from leader failures can take a long time.
Several points could be improved:

1. A follower should take its snapshot asynchronously once it is up to date.

2. Currently the Leader/Follower clears the DataTree on a leader failure and then restores it from a snapshot and the transaction logs. The DataTree should not be cleared; it should only be brought up to date from the transaction logs.

3. FileTxnLog should keep recent transactions in memory, so that when the DataTree is not far behind the transaction log, the in-memory transactions can be used to restore the DataTree.


        Summary: speed up recovery from leader failure  (was: speed up recovery from leader failures)

> speed up recovery from leader failure
> -------------------------------------
>
>                 Key: ZOOKEEPER-1032
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1032
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>            Reporter: jiangwen wei
>
> When the number of nodes is large, recovery from leader failure can take a long time.
> Several points could be improved:
> 1. A follower should take its snapshot asynchronously once it is up to date.
> 2. Currently the Leader/Follower clears the DataTree on a leader failure and then restores it from a snapshot and the transaction logs. The DataTree should not be cleared; it should only be brought up to date from the transaction logs.
> 3. FileTxnLog should keep recent transactions in memory, so that when the DataTree is not far behind the transaction log, the in-memory transactions can be used to restore the DataTree.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira