Posted to user@ignite.apache.org by Dave Harvey <dh...@jobcase.com> on 2018/03/28 20:33:34 UTC

Baseline Topology and Node Failure

The introduction of baselines in 2.4 seems quite helpful. If a node
restarts, the baseline avoids excessive rebalancing.
What is unclear from the documentation is what happens when a node fails
and doesn't come back. I'm assuming that in fact nothing happens, except
that the backups on that node are now offline, some backups may have been
promoted to primaries, and the cluster continues to function without
rebalancing (but that does not appear to be stated).

My question is: after this event is detected, and something decides to
replace the node, what process should be used to ensure that the new node
replaces the old one? Is it sufficient to simply set a new baseline
("--baseline set"), so that the minimum amount of data movement occurs? Or
is there something that needs to be done to get the right node IDs, or to
replace the old node with the new one?

It is also unclear what triggers rebalancing: "--baseline remove", or just
"--baseline set"?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Baseline Topology and Node Failure

Posted by Andrey Mashenkov <an...@gmail.com>.

Hi Dave,

Yes, applying a new baseline should be sufficient to remove, add, or
replace a node.
"--baseline set (<consistent ID list> | <top ver>)" sets a new baseline
and then triggers a rebalance.
You can achieve the same with "--baseline remove" followed by
"--baseline add". This will trigger a rebalance twice, but the first
rebalance will be cancelled and restarted.
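
To make the difference between the two approaches concrete, here is a toy
model in Python. It is only a sketch of the behaviour described in this
thread, not Ignite code: the ToyCluster class, its method names, and the
node IDs "A" to "D" are all hypothetical, and it only counts how many times
a baseline change would trigger a rebalance.

```python
class ToyCluster:
    """Toy model of the baseline behaviour described in this thread.

    Hypothetical illustration only; this is not how Ignite is implemented.
    """

    def __init__(self, baseline):
        self.baseline = set(baseline)  # consistent IDs in the baseline
        self.online = set(baseline)    # nodes currently alive
        self.rebalances = 0            # rebalances triggered so far

    def node_failed(self, node_id):
        # A failed node stays in the baseline, so nothing is rebalanced.
        self.online.discard(node_id)

    def node_joined(self, node_id):
        # A new node is online but is not part of the baseline yet.
        self.online.add(node_id)

    def _change_baseline(self, new_baseline):
        new_baseline = set(new_baseline)
        if new_baseline != self.baseline:
            self.baseline = new_baseline
            self.rebalances += 1  # every baseline change triggers a rebalance

    def baseline_set(self, node_ids):
        # Models "--baseline set".
        self._change_baseline(node_ids)

    def baseline_remove(self, node_id):
        # Models "--baseline remove".
        self._change_baseline(self.baseline - {node_id})

    def baseline_add(self, node_id):
        # Models "--baseline add".
        self._change_baseline(self.baseline | {node_id})


def failed_cluster():
    """Three-node cluster where C has failed and a replacement D has joined."""
    cluster = ToyCluster(["A", "B", "C"])
    cluster.node_failed("C")
    cluster.node_joined("D")
    return cluster


def replace_with_set():
    cluster = failed_cluster()
    cluster.baseline_set(["A", "B", "D"])
    return cluster.rebalances


def replace_with_remove_add():
    cluster = failed_cluster()
    cluster.baseline_remove("C")
    cluster.baseline_add("D")
    return cluster.rebalances
```

In this model the failure itself triggers nothing, a single "set" triggers
one rebalance, and "remove" followed by "add" triggers two. The model does
not capture the cancellation of the first rebalance mentioned above.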



-- 
Best regards,
Andrey V. Mashenkov

Re: Baseline Topology and Node Failure

Posted by Denis Magda <dm...@gridgain.com>.

We've documented your scenario and many others. Please check this
documentation section:
https://apacheignite.readme.io/v2.4/docs/cluster-activation#section-usage-scenarios


