Posted to issues@ratis.apache.org by "Alan Wu (JIRA)" <ji...@apache.org> on 2019/05/10 03:12:00 UTC

[jira] [Comment Edited] (RATIS-243) Add log purge function after taking snapshot

    [ https://issues.apache.org/jira/browse/RATIS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836225#comment-16836225 ] 

Alan Wu edited comment on RATIS-243 at 5/10/19 3:11 AM:
--------------------------------------------------------

[~andywu] Thanks for your reply. I've resolved this; it was my mistake. I was trying to call raftLog.purge(lastSnapshotIndex) from my StateMachine implementation, but there is no provided handle for calling that method, so I modified the StateMachine to support the call, and that change introduced a bug. There is still one thing I do not understand about snapshots. As you said, followers should not need the complete history of committed logs. Consider this case: I'm using the Ratis framework to replicate 3GB of data to 3 servers, and each request carries 10KB of data to the leader. When 1MB of data has been submitted to all servers, a snapshot is created, and I then call raftLog.purge(lastSnapshotIndex) to purge the older log entries. Suppose one of the 3 servers goes down at this point. When it comes back, or when I add a new server to the cluster, the leader's older log entries have already been dropped, so it seems the follower will lose that data even if it installs the snapshot. Does this violate data consistency, or do I not fully understand how the state machine is meant to be used?
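For reference, the scenario above can be modeled with a toy sketch. It assumes the usual Raft snapshot design, in which a snapshot taken at index S captures the full state after applying entries 1..S, so a follower that installs it only needs the entries after S. The class and method names below are hypothetical and are not Ratis APIs; the "state machine" is just a running sum.

```java
import java.util.TreeMap;

// Toy model, not Ratis code: the state is the sum of all applied entry values.
final class SnapshotCatchUpSketch {

    // Applies every log entry with index > afterIndex on top of startState.
    static long replay(long startState, TreeMap<Long, Long> log, long afterIndex) {
        long state = startState;
        for (long value : log.tailMap(afterIndex, false).values()) {
            state += value;
        }
        return state;
    }

    // Returns true if a follower that installs the snapshot and replays the
    // remaining entries ends up in the same state as the leader.
    static boolean followerMatchesLeader() {
        TreeMap<Long, Long> leaderLog = new TreeMap<>();
        for (long i = 1; i <= 10; i++) {
            leaderLog.put(i, i * 10);
        }
        long leaderState = replay(0, leaderLog, 0);

        // Snapshot: the state after applying entries 1..snapshotIndex.
        long snapshotIndex = 6;
        long snapshotState =
            replay(0, new TreeMap<>(leaderLog.headMap(snapshotIndex, true)), 0);

        // Purge: drop the entries already covered by the snapshot.
        leaderLog.headMap(snapshotIndex, true).clear();

        // A new or restarted follower starts from the snapshot, then replays
        // only the entries the leader still holds.
        long followerState = replay(snapshotState, leaderLog, snapshotIndex);
        return followerState == leaderState;
    }
}
```

In this model the purged entries are not lost, because their effect is already contained in the snapshot state. Whether a real deployment behaves this way depends on the StateMachine snapshot capturing the complete applied state.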



> Add log purge function after taking snapshot
> --------------------------------------------
>
>                 Key: RATIS-243
>                 URL: https://issues.apache.org/jira/browse/RATIS-243
>             Project: Ratis
>          Issue Type: Improvement
>          Components: server
>            Reporter: Andy Wu
>            Assignee: Andy Wu
>            Priority: Major
>             Fix For: 0.4.0
>
>         Attachments: r243_20190412.patch, r243_20190415.patch, r243_20190416.patch, r243_20190418.patch
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> After snapshotting the state machine, we can safely purge logs from the cache and disk.
> Based on the lastAppliedIndex, we can find the segment in which the index lands and purge all previous segments if the leader has no pending RPC on them. We will leave the segment where the index lands alone, so we do not need to deal with partial file deletion logic.
> Also, if we only have snapshots, make sure we can install snapshots on the followers.
>  
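The segment selection described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions: `Segment`, `purgeable`, and `minPendingIndex` are hypothetical names rather than the Ratis implementation, segments are assumed to be sorted by index, and "pending RPC" is modeled as the smallest log index still referenced by an in-flight leader RPC.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a log segment covering [startIndex, endIndex].
final class Segment {
    final long startIndex;
    final long endIndex;

    Segment(long startIndex, long endIndex) {
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }
}

final class SegmentPurgeSketch {

    // Returns the segments that lie entirely before the segment containing
    // lastAppliedIndex and that no pending RPC still references.
    // Assumes `segments` is sorted by ascending startIndex.
    static List<Segment> purgeable(List<Segment> segments,
                                   long lastAppliedIndex,
                                   long minPendingIndex) {
        List<Segment> result = new ArrayList<>();
        for (Segment s : segments) {
            // Stop at the segment where the index lands; it is left alone,
            // so no partial file deletion is needed.
            if (s.endIndex >= lastAppliedIndex) {
                break;
            }
            // Stop if an in-flight RPC may still read from this segment.
            if (s.endIndex >= minPendingIndex) {
                break;
            }
            result.add(s);
        }
        return result;
    }
}
```

Passing `Long.MAX_VALUE` as `minPendingIndex` models the case where the leader has no pending RPC at all.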



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)