Posted to issues@ozone.apache.org by "Prashant Pogde (Jira)" <ji...@apache.org> on 2021/03/07 19:51:00 UTC

[jira] [Commented] (HDDS-4914) Validating HDDS upgrade in presence of failures

    [ https://issues.apache.org/jira/browse/HDDS-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296960#comment-17296960 ] 

Prashant Pogde commented on HDDS-4914:
--------------------------------------

The goal is to write a comprehensive framework that will:
 * Drive SCM finalization.
 * Inject failures into both the DataNodes and the SCM at every state change in either one.
 * Validate that the SCM and the DataNodes eventually finalize and that the upgrade succeeds.

The HDDS upgrade model can be thought of as a state machine {states, transitions}, where
 * states are specific stages in upgrade finalization, either on the SCM node or on the individual DataNodes
 * transitions are events that trigger a state change

The HDDS upgrade stages, for both the DataNodes and the SCM, are defined as:
 * BeforePreFinalizeUpgrade
 * AfterPreFinalizeUpgrade
 * BeforeCompleteFinalization
 * AfterCompleteFinalization
 * AfterPostFinalizeUpgrade
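
As a rough illustration of the model above (not the actual Ozone code), the stages can be written as an ordered enumeration, with a hook called on entry to each stage so that a test can inject a failure there. The class names, the hook signature, and the assumption that the stages are visited in the listed order are all hypothetical.

```python
from enum import IntEnum
from typing import Callable, List, Optional


class UpgradeStage(IntEnum):
    """Stage names from the list above; the numeric order is an assumption."""
    BEFORE_PRE_FINALIZE_UPGRADE = 0
    AFTER_PRE_FINALIZE_UPGRADE = 1
    BEFORE_COMPLETE_FINALIZATION = 2
    AFTER_COMPLETE_FINALIZATION = 3
    AFTER_POST_FINALIZE_UPGRADE = 4


class InjectedFailure(Exception):
    """Raised by a test hook to simulate a node failing at a given stage."""


# Hypothetical hook type: called with (node name, stage just entered).
StageHook = Callable[[str, UpgradeStage], None]


class UpgradeStateMachine:
    """Toy model of one node (SCM or a DataNode) moving through the stages."""

    def __init__(self, node: str, on_stage: Optional[StageHook] = None):
        self.node = node
        self.on_stage = on_stage
        self.visited: List[UpgradeStage] = []

    def run(self) -> List[UpgradeStage]:
        # Each loop iteration is one transition into the next stage.
        for stage in UpgradeStage:
            self.visited.append(stage)
            if self.on_stage is not None:
                self.on_stage(self.node, stage)  # failure-injection point
        return self.visited


def fail_at(target: UpgradeStage) -> StageHook:
    """Build a hook that fails the node when it enters the target stage."""
    def hook(node: str, stage: UpgradeStage) -> None:
        if stage is target:
            raise InjectedFailure(f"{node} failed at {stage.name}")
    return hook
```

For example, UpgradeStateMachine("dn-1", fail_at(UpgradeStage.BEFORE_COMPLETE_FINALIZATION)).run() raises InjectedFailure once that stage is reached, and the visited list then records how far the node got before it failed.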

This validation framework will trigger all possible combinations of failures while the nodes are in each of their possible states. The combinations will include:
 * One-node failure - Fail the SCM in the middle of the SCM upgrade while the SCM is at a specific state.
 ** Try this for all possible SCM-upgrade states
 * One-node failure - Fail a DataNode in the middle of the SCM upgrade while the SCM is at a specific state.
 ** Try this for all possible SCM-upgrade states
 * One-node failure - Fail the SCM in the middle of a DataNode upgrade while the DataNode is at a specific state.
 ** Try this for all possible DataNode-upgrade states
 * One-node failure - Fail a DataNode in the middle of its own upgrade while that same DataNode is at a specific state.
 ** Try this for all possible DataNode-upgrade states
 * Two-node failure - Fail the SCM as well as a DataNode in the middle of the SCM upgrade while the SCM is at a specific state.
 ** Try this for all possible SCM-upgrade states
 * Two-node failure - Fail the SCM as well as a DataNode in the middle of the DataNode upgrade while that same DataNode is at a specific state.
 ** Try this for all possible DataNode-upgrade states
 * Two-node failure - Fail the SCM at a specific upgrade state in the SCM thread context, and fail a DataNode at a specific upgrade state in the DataNode upgrade thread context.
 ** Try this for all pairings of SCM-upgrade states and DataNode-upgrade states
 * Multi-node failure - Fail all the DataNodes at a specific SCM-upgrade state.
 ** Try this for all possible SCM-upgrade states
 * Multi-node failure - Fail all the DataNodes at a specific DataNode-upgrade state.
 ** Try this for all possible DataNode-upgrade states
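
To give a sense of the size of this test matrix, the combinations listed above can be enumerated mechanically. The sketch below is only an illustration: the Scenario tuple, the victim labels, and build_scenarios are hypothetical names, and it assumes that each of the five stages is a distinct trigger point and that "a DataNode" means a single representative DataNode.

```python
from itertools import product
from typing import List, NamedTuple, Optional

# Stage names as defined earlier in this comment.
STAGES = [
    "BeforePreFinalizeUpgrade",
    "AfterPreFinalizeUpgrade",
    "BeforeCompleteFinalization",
    "AfterCompleteFinalization",
    "AfterPostFinalizeUpgrade",
]


class Scenario(NamedTuple):
    victims: str              # which node(s) are failed
    scm_stage: Optional[str]  # SCM-upgrade stage that triggers a failure, if any
    dn_stage: Optional[str]   # DataNode-upgrade stage that triggers a failure, if any


def build_scenarios() -> List[Scenario]:
    scenarios: List[Scenario] = []
    # Cases triggered by a single stage, on either the SCM or a DataNode.
    for victims in ("SCM", "DataNode", "SCM+DataNode", "AllDataNodes"):
        for stage in STAGES:
            scenarios.append(Scenario(victims, stage, None))  # trigger: SCM stage
            scenarios.append(Scenario(victims, None, stage))  # trigger: DataNode stage
    # SCM fails at its own stage and a DataNode fails at its own stage:
    # every (SCM stage, DataNode stage) pair.
    for scm_stage, dn_stage in product(STAGES, repeat=2):
        scenarios.append(Scenario("SCM+DataNode", scm_stage, dn_stage))
    return scenarios
```

Under these assumptions build_scenarios() yields 65 scenarios: 20 one-node cases, 10 two-node cases keyed on a single stage, 25 two-node cases keyed on a stage pair, and 10 multi-node cases.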

 

> Validating HDDS upgrade in presence of failures
> -----------------------------------------------
>
>                 Key: HDDS-4914
>                 URL: https://issues.apache.org/jira/browse/HDDS-4914
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: Ozone Datanode, SCM, upgrade
>            Reporter: Prashant Pogde
>            Assignee: Prashant Pogde
>            Priority: Major
>



