Posted to issues@ignite.apache.org by "Andrey Mashenkov (Jira)" <ji...@apache.org> on 2022/12/02 08:24:00 UTC

[jira] [Updated] (IGNITE-18171) Describe nodes start/stop scenarios

     [ https://issues.apache.org/jira/browse/IGNITE-18171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Mashenkov updated IGNITE-18171:
--------------------------------------
    Description: 
h2. Definitions.

We can distinguish the following cluster node groups. Each node may belong to one or more groups.
 * Cluster Management Group (CMG), which controls the process of new nodes joining the cluster.
 * MetaStorage group (MSG), which hosts the meta storage.
 * Data node group (DNG), which only hosts table partitions.

The components (CMG, meta storage, table components) depend on each other, but may reside on different (even disjoint) node subsets. As a result, some components may become temporarily unavailable, and dependent components must be aware of this and handle it (wait, retry, throw an exception, and so on) in an expected way, which must also be documented.
[See IEP for details|https://cwiki.apache.org/confluence/display/IGNITE/IEP-77%3A+Node+Join+Protocol+and+Initialization+for+Ignite+3]
h2. Motivation.

As of now, the correct way to start the grid (after it was stopped) is to start CMG nodes, then Meta Storage nodes, then Data nodes, and to stop them in the reverse order. Other scenarios are not tested and may lead to unexpected behaviour.
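
The ordering rule above can be illustrated with a minimal sketch. It is not Ignite code: the class, enum and method names are hypothetical, and it assumes that a node belonging to several groups is started in the phase of its earliest group (CMG before MSG before DNG), which the issue does not specify.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Ignite.
class StartOrderSketch {
    /** Node groups from the Definitions section, declared in start order. */
    enum Group { CMG, MSG, DNG }

    /** Assumption: a node starts in the phase of the earliest group it belongs to. */
    static Group startPhase(Set<Group> groups) {
        // Enum constants compare by declaration order, so min() yields the earliest phase.
        return Collections.min(groups);
    }

    /** Returns node names ordered as CMG nodes, then Meta Storage nodes, then Data nodes. */
    static List<String> startOrder(Map<String, Set<Group>> nodes) {
        List<String> order = new ArrayList<>(nodes.keySet());
        // List.sort is stable: nodes in the same phase keep the map's iteration order.
        order.sort(Comparator.comparing((String name) -> startPhase(nodes.get(name))));
        return order;
    }

    /** Stop order is the start order reversed. */
    static List<String> stopOrder(Map<String, Set<Group>> nodes) {
        List<String> order = startOrder(nodes);
        Collections.reverse(order);
        return order;
    }

    public static void main(String[] args) {
        Map<String, Set<Group>> nodes = new LinkedHashMap<>();
        nodes.put("data-1", EnumSet.of(Group.DNG));
        nodes.put("meta-1", EnumSet.of(Group.MSG, Group.DNG));
        nodes.put("cmg-1", EnumSet.of(Group.CMG));

        System.out.println("start: " + startOrder(nodes)); // start: [cmg-1, meta-1, data-1]
        System.out.println("stop:  " + stopOrder(nodes));  // stop:  [data-1, meta-1, cmg-1]
    }
}
```

Any other start or stop sequence falls outside this ordering and belongs to the untested scenarios this issue asks to describe.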

Let's describe all possible scenarios and the expected behaviour for each of them, and extend test coverage.

 

*UPD:* 
Scenarios to test

  was:
h2. Definitions.

We can distinguish the following cluster node groups. Each node may belong to one or more groups.
 * Cluster Management Group (CMG), which controls the process of new nodes joining the cluster.
 * MetaStorage group (MSG), which hosts the meta storage.
 * Data node group (DNG), which only hosts table partitions.

The components (CMG, meta storage, table components) depend on each other, but may reside on different (even disjoint) node subsets. As a result, some components may become temporarily unavailable, and dependent components must be aware of this and handle it (wait, retry, throw an exception, and so on) in an expected way, which must also be documented.
[See IEP for details|https://cwiki.apache.org/confluence/display/IGNITE/IEP-77%3A+Node+Join+Protocol+and+Initialization+for+Ignite+3]
h2. Motivation.

As of now, the correct way to start the grid (after it was stopped) is to start CMG nodes, then Meta Storage nodes, then Data nodes, and to stop them in the reverse order. Other scenarios are not tested and may lead to unexpected behaviour.

Let's describe all possible scenarios and the expected behaviour for each of them, and extend test coverage.


> Describe nodes start/stop scenarios
> -----------------------------------
>
>                 Key: IGNITE-18171
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18171
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>            Reporter: Andrey Mashenkov
>            Assignee: Andrey Mashenkov
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)