Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2019/12/11 15:56:36 UTC

[GitHub] [hadoop-ozone] sodonnel opened a new pull request #343: HDDS-2607 DeadNodeHandler should not remove replica for a dead maintenance node

sodonnel opened a new pull request #343: HDDS-2607 DeadNodeHandler should not remove replica for a dead maintenance node
URL: https://github.com/apache/hadoop-ozone/pull/343
 
 
   ## What changes were proposed in this pull request?
   
   Normally, when a node goes dead, the DeadNodeHandler removes all the containers and replicas associated with the node from the ContainerManager.
   
   If a node is IN_MAINTENANCE and goes dead, then we do not want to remove its replicas. They should remain present in the system to prevent the containers from being marked as under-replicated.
   
   We also need to consider the case where the node is dead and maintenance then expires automatically. In that case, the replicas associated with the node must be removed, and the affected containers will become under-replicated.
   
   This PR resolves the issue by ensuring that stale, dead and healthy node events are fired as normal by the NodeStateManager whenever the node health state changes. This works as it did before any decommission-related changes.
   
   To ensure that the container replicas are not removed for a node that is IN_MAINTENANCE, the DeadNodeHandler has been modified to check the node's operational state and skip removing the replicas if the node is IN_MAINTENANCE. For all other states, the dead node handling is unchanged.
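   
   As a rough illustration of the check described above, the following is a minimal, self-contained sketch. It is not the code from this PR: `DeadNodeSketch`, its `OpState` enum and all method names are hypothetical stand-ins for the real DeadNodeHandler, node state and ContainerManager.
   
   ```java
   import java.util.HashMap;
   import java.util.HashSet;
   import java.util.Map;
   import java.util.Set;
   
   // Hypothetical, simplified model of dead node handling; not the Ozone classes.
   final class DeadNodeSketch {
     // Assumed subset of operational states, for illustration only.
     enum OpState { IN_SERVICE, DECOMMISSIONED, IN_MAINTENANCE }
   
     private final Map<String, OpState> opStates = new HashMap<>();
     // Node name -> IDs of containers that have a replica on that node.
     private final Map<String, Set<Long>> replicasByNode = new HashMap<>();
   
     void register(String node, OpState state, long... containerIds) {
       opStates.put(node, state);
       Set<Long> ids = new HashSet<>();
       for (long id : containerIds) {
         ids.add(id);
       }
       replicasByNode.put(node, ids);
     }
   
     void setOpState(String node, OpState state) {
       opStates.put(node, state);
     }
   
     int replicaCount(String node) {
       return replicasByNode.getOrDefault(node, new HashSet<>()).size();
     }
   
     // Called when a node is reported dead. Returns true if replicas were removed.
     boolean onDeadNode(String node) {
       if (opStates.get(node) == OpState.IN_MAINTENANCE) {
         // Keep the replicas so the containers are not seen as under-replicated.
         return false;
       }
       Set<Long> removed = replicasByNode.remove(node);
       return removed != null && !removed.isEmpty();
     }
   }
   ```
   
   In this sketch, calling `onDeadNode` for a node registered as IN_MAINTENANCE leaves its replicas in place, while the same call for a node in any other state removes them.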
   
   In addition, whenever a node's operational state changes, the health event matching its current health state is fired again. This means that a DECOMMISSIONED or IN_MAINTENANCE node that is heartbeating and healthy will fire the HEALTHY node event when it is moved to IN_SERVICE, which triggers pipeline creation.
   
   Similarly, if a node is IN_MAINTENANCE and DEAD, the dead node event will be triggered when it transitions to IN_SERVICE, causing the container replicas to be removed. Some of these events will have no effect, but changes to the node operational state will be infrequent, as the state is only changed by the decommission monitor or by someone triggering or cancelling decommission or maintenance.
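   
   The event behaviour described in the last two paragraphs can be sketched roughly as follows. This is an assumption-laden illustration rather than the NodeStateManager implementation: `NodeStateSketch`, its enums and the string format of the recorded events are all invented for this example.
   
   ```java
   import java.util.ArrayList;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   
   // Hypothetical model: a health event is fired on health changes and
   // fired again whenever the operational state changes.
   final class NodeStateSketch {
     enum Health { HEALTHY, STALE, DEAD }
     enum OpState { IN_SERVICE, DECOMMISSIONED, IN_MAINTENANCE }
   
     private final Map<String, Health> health = new HashMap<>();
     private final Map<String, OpState> opState = new HashMap<>();
     // Recorded as "node:health:opState" purely to make the sketch observable.
     final List<String> firedEvents = new ArrayList<>();
   
     void addNode(String node, Health h, OpState s) {
       health.put(node, h);
       opState.put(node, s);
     }
   
     void setHealth(String node, Health h) {
       if (health.put(node, h) != h) {
         fire(node);
       }
     }
   
     void setOperationalState(String node, OpState s) {
       if (opState.put(node, s) != s) {
         // Re-fire the event for the node's current health state.
         fire(node);
       }
     }
   
     private void fire(String node) {
       firedEvents.add(node + ":" + health.get(node) + ":" + opState.get(node));
     }
   }
   ```
   
   With this model, a node that is DEAD and IN_MAINTENANCE records a DEAD event as soon as `setOperationalState` moves it to IN_SERVICE, which is the point at which a dead node handler would then remove its replicas.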
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-2607
   
   ## How was this patch tested?
   
   New unit tests were added and existing unit tests were modified.
   
