Posted to issues@ozone.apache.org by "Glen Geng (Jira)" <ji...@apache.org> on 2020/12/17 08:47:00 UTC

[jira] [Comment Edited] (HDDS-4599) Handle inflight delete/add actions in ReplicationManager properly.

    [ https://issues.apache.org/jira/browse/HDDS-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250904#comment-17250904 ] 

Glen Geng edited comment on HDDS-4599 at 12/17/20, 8:46 AM:
------------------------------------------------------------

Hey [~a493172422], if you need help or want to discuss anything, you can join the Slack channel ozone-sim-ha and the weekly sync-up for SCM HA (Wednesday, 12:00 Beijing time)!


was (Author: glengeng):
Hey [~a493172422], if need hep or discusssion , you may attend the slack channel ozone-sim-ha, and the weekly sync up for scm ha (Wednesday, 12:00 Beijing time)!

> Handle inflight delete/add actions in ReplicationManager properly.
> ------------------------------------------------------------------
>
>                 Key: HDDS-4599
>                 URL: https://issues.apache.org/jira/browse/HDDS-4599
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: SCM HA
>    Affects Versions: 1.1.0
>            Reporter: Glen Geng
>            Assignee: YI-CHEN WANG
>            Priority: Major
>
> ReplicationManager keeps its in-flight replication and deletion actions in memory only; this state is not replicated using Ratis. So, in theory, we might run into data loss and over-replication issues if we start ReplicationManager immediately after a failover.
> There is a quick fix for the potential data loss issue (HDDS-4589); however, we still need a thorough solution for both in-flight add and in-flight delete.
> We have two proposals from [~sodonnell]:
>  # Have the DNs provide a list of pending_delete blocks in their container report / heartbeat, so that SCM can use that list.
>  # If the DNs detect a new master SCM or a restarted SCM, they purge their pending delete list and wait for new instructions from the new/restarted SCM.
> This Jira is filed to track the problem.
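
To make the first proposal concrete, here is a minimal sketch of how a newly elected SCM might rebuild its in-flight delete view from what the DNs report, and then avoid scheduling further deletes for replicas that are already on their way out. Everything in it is an assumption for illustration: the class and method names are hypothetical and not taken from the Ozone code base, datanodes are plain strings, and each pending delete is treated as a container ID rather than a block.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from the Ozone code base.
final class InflightDeleteSketch {

  // Rebuild the in-flight delete view on a new leader SCM from DN reports.
  // Input:  datanode ID -> container IDs that datanode still has pending deletion.
  // Output: container ID -> datanodes that are about to drop their replica.
  static Map<Long, Set<String>> rebuildInflightDeletes(
      Map<String, List<Long>> pendingDeletesByDatanode) {
    Map<Long, Set<String>> inflight = new HashMap<>();
    for (Map.Entry<String, List<Long>> report
        : pendingDeletesByDatanode.entrySet()) {
      for (Long containerId : report.getValue()) {
        inflight.computeIfAbsent(containerId, id -> new HashSet<>())
            .add(report.getKey());
      }
    }
    return inflight;
  }

  // Number of replicas above the replication factor, not counting replicas
  // that a datanode has already been told to delete.
  static int excessReplicas(Set<String> replicaHolders,
      Set<String> deletingDatanodes, int replicationFactor) {
    Set<String> remaining = new HashSet<>(replicaHolders);
    remaining.removeAll(deletingDatanodes);
    return Math.max(0, remaining.size() - replicationFactor);
  }

  public static void main(String[] args) {
    // The old leader asked dn4 to delete its replica of container 1
    // shortly before the failover; dn4 now reports it as pending.
    Map<String, List<Long>> reports = Map.of("dn4", List.of(1L));
    Map<Long, Set<String>> inflight = rebuildInflightDeletes(reports);

    Set<String> holders = Set.of("dn1", "dn2", "dn3", "dn4");
    int excess =
        excessReplicas(holders, inflight.getOrDefault(1L, Set.of()), 3);
    System.out.println("excess replicas for container 1: " + excess);
  }
}
```

In this example the new leader sees four replica holders for a container with replication factor 3. Without the report it would count one excess replica and could issue a second delete; with dn4's pending delete taken into account the excess is zero, so no further delete is scheduled.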



