Posted to issues@ignite.apache.org by "Maxim Muzafarov (Jira)" <ji...@apache.org> on 2019/10/03 10:03:05 UTC

[jira] [Updated] (IGNITE-10485) Ability to get know more about cluster state before NODE_JOINED event is fired cluster-wide

     [ https://issues.apache.org/jira/browse/IGNITE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maxim Muzafarov updated IGNITE-10485:
-------------------------------------
    Fix Version/s:     (was: 2.8)
                   2.9

> Ability to get know more about cluster state before NODE_JOINED event is fired cluster-wide
> -------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-10485
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10485
>             Project: Ignite
>          Issue Type: Improvement
>          Components: cache
>            Reporter: Pavel Kovalenko
>            Priority: Major
>             Fix For: 2.9
>
>
> Currently there is no good way to learn about the cluster state before PME (partition map exchange) on node join starts.
> It might be useful to do some pre-work (activate components if the cluster is active, calculate baseline affinity, clean up PDS if the baseline changed, etc.) before the actual NODE_JOINED event is triggered cluster-wide and PME starts.
> Such pre-work would significantly speed up PME on node join.
> Currently the only place where it can be done is while processing the NodeAdded message on the locally joining node.
> But that is not a good idea, because it would freeze the processing of new discovery messages cluster-wide.
> I see 2 ways to implement it:
> 1) Introduce a new intermediate node state in which the node is discovered but the discovery event for its join has not been triggered yet. This is the right approach, but a complicated change, because it requires revisiting the joining process in both the Tcp and Zk discovery protocols, with extra failover scenarios.
> 2) Try to get this information and do the pre-work before the discovery manager starts, using e.g. GridRestProcessor. This looks much simpler, but there can be races when the cluster state changes during the pre-work (deactivation, baseline change). In that case we should roll the pre-work back or just stop/restart the node to avoid cluster instability. However, these are rare scenarios in the real world (e.g. starting a baseline node and starting the deactivation process right after node recovery has finished).
> For starters, we can expose the baseline and cluster state in our REST endpoint and try to move the pre-work mentioned above out of PME.
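
The race described in option 2 could be handled roughly as in the sketch below. This is only an illustration under assumptions, not Ignite code: PreJoinSketch, ClusterSnapshot, Outcome and runPreWork are hypothetical names, and the state source stands in for whatever the REST endpoint would eventually return. The idea is to snapshot the activation flag and baseline, run the pre-work, read the state again, and roll back if it changed in between.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names are Ignite APIs.
class PreJoinSketch {
    /** Assumed shape of the exposed state: activation flag plus baseline node IDs. */
    static final class ClusterSnapshot {
        final boolean active;
        final List<String> baselineNodeIds;

        ClusterSnapshot(boolean active, List<String> baselineNodeIds) {
            this.active = active;
            this.baselineNodeIds = List.copyOf(baselineNodeIds);
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof ClusterSnapshot)) return false;
            ClusterSnapshot other = (ClusterSnapshot) o;
            return active == other.active && baselineNodeIds.equals(other.baselineNodeIds);
        }

        @Override public int hashCode() {
            return Objects.hash(active, baselineNodeIds);
        }
    }

    enum Outcome { PROCEED_WITH_JOIN, ROLLED_BACK }

    /**
     * Reads the cluster state, runs the pre-work, then reads the state again.
     * If the two reads differ (e.g. deactivation or a baseline change happened
     * in between), the pre-work is rolled back instead of being trusted.
     */
    static Outcome runPreWork(Supplier<ClusterSnapshot> stateSource,
                              Runnable preWork,
                              Runnable rollback) {
        ClusterSnapshot before = stateSource.get();
        preWork.run();
        ClusterSnapshot after = stateSource.get();

        if (!before.equals(after)) {
            rollback.run();
            return Outcome.ROLLED_BACK;
        }
        return Outcome.PROCEED_WITH_JOIN;
    }
}
```

Note that the second read only narrows the window: the state can still change between the re-check and the actual join, so the stop/restart fallback mentioned in the description would still be needed.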



--
This message was sent by Atlassian Jira
(v8.3.4#803005)