Posted to issues@ignite.apache.org by "Kirill Gusakov (Jira)" <ji...@apache.org> on 2022/01/23 22:25:00 UTC

[jira] [Updated] (IGNITE-16011) Start new rebalance round, when partition assignments updated

     [ https://issues.apache.org/jira/browse/IGNITE-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kirill Gusakov updated IGNITE-16011:
------------------------------------
    Description: 
When partition assignments are updated, we need to start raft changePeers and handle failover scenarios.

When a metastore event about a partition assignments update is received, we need to:
 - Start all needed nodes:
{code:java}
 partition.assignments.pending / partition.assignments.stable{code}

 - After the nodes have started successfully, check whether the current node is the leader of the raft group (the leader response must be refreshed with the current term) and, if it is, call changePeers(leaderTerm, peers). changePeers requests from old terms must be skipped.
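
The two steps above can be sketched as follows. This is only an illustration under assumptions: the `/` in the snippet is read as set difference (nodes in pending but not in stable), peers are plain strings, and `PendingAssignmentsSketch`, `nodesToStart`, and `shouldChangePeers` are hypothetical names, not Ignite API.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper; names and types are illustrative, not Ignite API. */
final class PendingAssignmentsSketch {
    /** Nodes to start: present in pending but not in stable (assumed meaning of "pending / stable"). */
    static Set<String> nodesToStart(Set<String> pending, Set<String> stable) {
        Set<String> result = new LinkedHashSet<>(pending);
        result.removeAll(stable);
        return result;
    }

    /**
     * Decides whether the local node should issue changePeers.
     * Only the leader does so, and only if the term it observed is still current;
     * a request built for an older term is skipped.
     */
    static boolean shouldChangePeers(String localNode, String leader, long observedTerm, long currentTerm) {
        return localNode.equals(leader) && observedTerm == currentTerm;
    }
}
```

For example, with pending = {A, B, C} and stable = {A, B}, only C needs to be started, and a leader that observed term 3 while the group is already at term 4 skips the call.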

Also, we need some new events to be propagated from the raft side:
 * {{onLeaderChanged()}} - must be executed on the new leader when the raft group changes its leader. We may also need to check whether a new lease has been received; this needs investigation.
 * {{onChangePeersError(errorContext)}} - must be executed when any error occurs during changePeers.
 * {{onChangePeersCommitted(peers)}} - must be executed with the list of new peers when changePeers has completed successfully.

These events must be handled appropriately.
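
One possible shape for these callbacks is sketched below. The interface name, the parameter types (`String` for the error context, `List<String>` for peers), and the recording implementation are all assumptions made for illustration; the issue does not define them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener for the three events; not an existing Ignite or raft library interface. */
interface RaftGroupEventsSketch {
    /** Called on the new leader after the raft group changes its leader. */
    void onLeaderChanged();

    /** Called when changePeers fails; the error context type is assumed to be a String here. */
    void onChangePeersError(String errorContext);

    /** Called with the new peer list after changePeers has been applied. */
    void onChangePeersCommitted(List<String> peers);
}

/** Trivial implementation that only records the events it receives. */
final class RecordingListener implements RaftGroupEventsSketch {
    final List<String> log = new ArrayList<>();

    @Override
    public void onLeaderChanged() {
        log.add("leaderChanged");
    }

    @Override
    public void onChangePeersError(String errorContext) {
        log.add("changePeersError: " + errorContext);
    }

    @Override
    public void onChangePeersCommitted(List<String> peers) {
        log.add("changePeersCommitted: " + peers);
    }
}
```

A real handler would replace the logging with the rebalance logic described in this issue.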

When a changePeers call finishes and {{onChangePeersCommitted(appliedPeers -> closure)}} is called, we need to:
 - Update the pending and stable partition assignments:
{code:java}
metastoreInvoke: // atomic
    partition.assignments.stable = appliedPeers
    if empty(partition.assignments.planned):
        partition.assignments.pending = empty
    else:
        partition.assignments.pending = partition.assignments.planned {code}
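
The pseudocode above can be mirrored in a small runnable sketch. It uses an in-memory map guarded by one lock as a stand-in for the metastore, so the atomicity here is only simulated; `AssignmentsStoreSketch` and its methods are hypothetical names. Like the pseudocode, it does not touch the planned key after copying it to pending.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for the metastore; a real implementation would use an atomic metastore invoke. */
final class AssignmentsStoreSketch {
    static final String STABLE = "partition.assignments.stable";
    static final String PENDING = "partition.assignments.pending";
    static final String PLANNED = "partition.assignments.planned";

    private final Map<String, List<String>> store = new HashMap<>();

    synchronized void put(String key, List<String> peers) {
        store.put(key, List.copyOf(peers));
    }

    /** Returns the stored peers, or an empty list when the key is absent. */
    synchronized List<String> get(String key) {
        return store.getOrDefault(key, List.of());
    }

    /** All three keys are read and written under one lock, which stands in for atomicity. */
    synchronized void onChangePeersCommitted(List<String> appliedPeers) {
        store.put(STABLE, List.copyOf(appliedPeers));
        List<String> planned = store.getOrDefault(PLANNED, List.of());
        if (planned.isEmpty()) {
            store.remove(PENDING); // pending = empty
        } else {
            store.put(PENDING, planned); // pending = planned
        }
    }
}
```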

When {{partition.assignments.stable}} is updated, we need to:
 * Replace the current raft client with a new one that uses the appropriate peers
 * Stop raft nodes that are no longer needed
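
A hedged sketch of the stop decision follows. The issue does not say exactly when a node counts as unneeded; this sketch assumes a local raft node can be stopped once it is absent from the new stable assignments and is not part of the pending ones either. `StableUpdateSketch` and `shouldStopLocalNode` are hypothetical names.

```java
import java.util.Set;

/** Hypothetical helper for reacting to a stable assignments update; not Ignite API. */
final class StableUpdateSketch {
    /**
     * Assumption: the local raft node is unneeded when it is neither in the new
     * stable assignments nor in the pending ones (it may still be needed for the next round).
     */
    static boolean shouldStopLocalNode(String localNode, Set<String> newStable, Set<String> pending) {
        return !newStable.contains(localNode) && !pending.contains(localNode);
    }
}
```

With stable = {B, C} and empty pending, node A would stop; if A still appears in pending, it keeps running.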
 

(Phase 1)

  was:
When partition assignments are updated, we need to start raft changePeers and handle failover scenarios.

When metastore event about partition assignments updates received we need to:
 - Start all needed nodes 
{code:java}
 partition.assignments.pending / partition.assignments.stable{code}

 - After successful starts - check if current node is the leader of raft group (leader response must be updated by current term) and changePeers(leaderTerm, peers). changePeers from old terms must be skipped.

Also, we need the propagation of some new events from the raft side:
 * {{onLeaderChanged()}} - must be executed from the new leader when raft group changes the leader. Maybe we actually need to also check if a new lease is received - we need to investigate.
 * {{onChangePeersError(errorContext)}} - must be executed when any errors during changePeers occurred
 * {{onChangePeersCommitted(peers -> closure)}} - must be executed with the list of new peers when changePeers has successfully done.

and handle them by appropriate way.

When any changePeers finished and {{onChangePeersCommitted(appliedPeers -> closure)}} called, we need to:
 - Update pending and stable partitions assignments:
{code:java}
metastoreInvoke: \\ atomic
    partition.assignments.stable = appliedPeers
    if empty(partition.assignments.planned):
        partition.assignments.pending = empty
    else:
        partition.assignments.pending = partition.assignments.planned {code}

When {{partition.assignments.stable}} updated, we need to:
 * Replace current raft client with new one, with appropriate peers
 * Stop unneeded raft node
 

(Phase 1)


> Start new rebalance round, when partition assignments updated
> -------------------------------------------------------------
>
>                 Key: IGNITE-16011
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16011
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Kirill Gusakov
>            Priority: Major
>              Labels: ignite-3


