
Posted to user@ignite.apache.org by Prasad Bhalerao <pr...@gmail.com> on 2018/09/28 14:32:53 UTC

Task failover in ignite

Hi,

I have created multiple Ignite tasks by implementing the IgniteRunnable and
IgniteCallable interfaces. I am submitting these tasks to the primary data
node using the "ignite.compute().withNoFailover().affinityRun()" method.

I have not set the max failover attempts, so I think the default max failover
attempts will be 5.
I have set the backup count to 1.

What happens when the primary node executing this task goes down?

In this case, does Ignite move this task to the backup node and execute it
on the backup node?

When the node goes down, Ignite starts rebalancing the cluster. In this case,
how will this task be executed?

Does Ignite wait for the rebalancing process to complete and then execute this
task?

Can someone please explain this in detail?



Thanks,
Prasad

Re: Task failover in ignite

Posted by vkulichenko <va...@gmail.com>.

Prasad,


prasadbhalerao1983 wrote
> Are you saying that when a primary node dies, the former backup node
> becomes the new primary for ALL backup partitions present on it, and only
> primary partitions are moved in the rebalancing process?

Not for all partitions, but only for those for which primary copy existed on
the recently failed node. For example, you have nodes A, B and C. Partition
#1 has primary copy on A and backup on B. Partition #2 has primary copy on C
and backup on B. If A dies, B becomes primary node for partition #1, and
then rebalancing creates a new backup for it on C. However, nothing changes
for partition #2, because A didn't own any of its copies.
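
The A/B/C example above can be traced with a small toy model. This is only a sketch: the class and method names are hypothetical, it uses plain Java collections rather than any Ignite API, and it assumes that each partition keeps an ordered list of owners where the first entry is the primary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of primary/backup ownership; hypothetical, not Ignite code. */
final class PartitionFailoverSketch {
    /** Owners per partition: index 0 is the primary, index 1 the backup. */
    static Map<Integer, List<String>> initialOwners() {
        Map<Integer, List<String>> owners = new LinkedHashMap<>();
        owners.put(1, new ArrayList<>(List.of("A", "B"))); // partition #1
        owners.put(2, new ArrayList<>(List.of("C", "B"))); // partition #2
        return owners;
    }

    /** Drops the failed node; a surviving backup moves up to index 0. */
    static void nodeFailed(Map<Integer, List<String>> owners, String failed) {
        for (List<String> copies : owners.values())
            copies.remove(failed);
    }

    /** Stands in for rebalancing: refills missing copies from live nodes. */
    static void rebalance(Map<Integer, List<String>> owners, List<String> live, int copiesWanted) {
        for (List<String> copies : owners.values())
            for (String node : live)
                if (copies.size() < copiesWanted && !copies.contains(node))
                    copies.add(node);
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> owners = initialOwners();
        nodeFailed(owners, "A");
        System.out.println("after failure:   " + owners);
        rebalance(owners, List.of("B", "C"), 2);
        System.out.println("after rebalance: " + owners);
    }
}
```

In this model, B is already the primary for partition #1 before the rebalance step runs, which is why a retried job does not have to wait for rebalancing.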

prasadbhalerao1983 wrote
> The partition-to-node mapping is defined by the affinity function, which
> uses Rendezvous Hashing. How does it make sure that the backup partitions
> on a node become primary partitions on the same node?

That's how the function is implemented. If you're interested in more
details, I would suggest going through the code of
RendezvousAffinityFunction.
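
For readers who want the intuition without reading the source, here is a minimal sketch of generic rendezvous (highest-random-weight) hashing. It is not the code of RendezvousAffinityFunction: the class name, the use of CRC32 as the hash, and the tie-break by node name are all assumptions made for illustration. Each node gets a score per partition, the nodes are ranked by that score, and the top entries own the partition. Removing the top node does not change the relative order of the others, so the former backup is ranked first.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.zip.CRC32;

/** Generic rendezvous hashing sketch; hypothetical, not Ignite's implementation. */
final class RendezvousSketch {
    /** Score of a (node, partition) pair; CRC32 is an arbitrary stand-in hash. */
    static long score(String node, int partition) {
        CRC32 crc = new CRC32();
        crc.update((node + ":" + partition).getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Nodes ranked by descending score: index 0 is the primary, the rest are backups. */
    static List<String> owners(List<String> nodes, int partition, int copies) {
        List<String> ranked = new ArrayList<>(nodes);
        ranked.sort(Comparator.comparingLong((String n) -> score(n, partition))
            .reversed()
            .thenComparing(Comparator.<String>naturalOrder())); // deterministic tie-break
        return new ArrayList<>(ranked.subList(0, Math.min(copies, ranked.size())));
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("A", "B", "C");
        for (int p = 0; p < 4; p++) {
            List<String> before = owners(nodes, p, 2);
            List<String> survivors = new ArrayList<>(nodes);
            survivors.remove(before.get(0)); // simulate failure of the primary
            List<String> after = owners(survivors, p, 2);
            System.out.println("partition " + p + ": " + before + " -> " + after);
        }
    }
}
```

For every partition printed, the first owner after the failure is the second owner from before it, because the scores of the surviving nodes do not depend on which other nodes are present.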

-Val



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Task failover in ignite

Posted by Prasad Bhalerao <pr...@gmail.com>.

Hi Val,

Thank you for the explanation.

Are you saying that when a primary node dies, the former backup node becomes
the new primary for ALL backup partitions present on it, and only primary
partitions are moved in the rebalancing process?

The partition-to-node mapping is defined by the affinity function, which uses
Rendezvous Hashing. How does it make sure that the backup partitions on a node
become primary partitions on the same node?

Regards,
Prasad

On Tue, Oct 2, 2018, 3:16 AM vkulichenko <va...@gmail.com>
wrote:

> Prasad,
>
> When the primary node for a partition dies, the former backup node for this
> partition becomes the new primary. Therefore there is no need to wait for
> rebalancing in this case; the data is already there. By default the job will
> be automatically remapped to that node, but with 'withNoFailover()' you'll
> have to retry manually.
>
> In addition, affinityRun/Call acquires a partition lock for the duration of
> the computation to make sure data is not moved until it's completed. I.e., if
> there is a new node that becomes the new owner for this partition, data for
> this partition will not be evicted from the previous primary until all
> collocated computations are done.
>
> -Val
>
>
>
>

Re: Task failover in ignite

Posted by vkulichenko <va...@gmail.com>.

Prasad,

When the primary node for a partition dies, the former backup node for this
partition becomes the new primary. Therefore there is no need to wait for
rebalancing in this case; the data is already there. By default the job will
be automatically remapped to that node, but with 'withNoFailover()' you'll
have to retry manually.
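
A manual retry could look roughly like the sketch below. It is plain Java with no Ignite dependency: the helper name callWithRetry, the attempt count, and the pause are assumptions, and the simulated job in main only stands in for a real affinity call. It also retries on any exception for brevity, whereas real code would likely retry only on failures caused by the node leaving.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical client-side retry helper for jobs submitted with failover disabled. */
final class ManualRetrySketch {
    /** Runs the job up to maxAttempts times, pausing between failed attempts. */
    static <T> T callWithRetry(Callable<T> job, int maxAttempts, long pauseMillis) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return job.call();
            } catch (Exception e) {
                last = e; // simplification: every exception is treated as retryable
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw new IllegalStateException("job failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated job: fails twice as if the primary had left, then succeeds.
        // With Ignite, the lambda body would instead wrap a call such as
        // ignite.compute().withNoFailover().affinityRun(cacheName, key, job).
        String result = callWithRetry(() -> {
            if (calls.incrementAndGet() < 3)
                throw new IllegalStateException("primary node left");
            return "done";
        }, 5, 10L);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

Because each retry resubmits the job, the affinity mapping is evaluated again on every attempt, so the later attempts target the new primary.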

In addition, affinityRun/Call acquires a partition lock for the duration of
the computation to make sure data is not moved until it's completed. I.e., if
there is a new node that becomes the new owner for this partition, data for
this partition will not be evicted from the previous primary until all
collocated computations are done.
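
The effect of that partition lock can be pictured with a simple hold counter. The sketch below is a conceptual model only; the class and method names are hypothetical and do not correspond to Ignite internals. Eviction is refused while any job holds the partition, and a job cannot start on a partition that has already been evicted.

```java
/** Conceptual model of a partition that cannot be evicted while jobs hold it. */
final class PartitionHoldSketch {
    private int holds;       // number of collocated jobs currently running
    private boolean evicted; // true once the local copy has been dropped

    /** Called when a collocated job starts; fails if the data is already gone. */
    synchronized boolean reserve() {
        if (evicted)
            return false;
        holds++;
        return true;
    }

    /** Called when a collocated job finishes. */
    synchronized void release() {
        if (holds > 0)
            holds--;
    }

    /** Eviction succeeds only when no job holds the partition. */
    synchronized boolean tryEvict() {
        if (holds > 0)
            return false;
        evicted = true;
        return true;
    }

    public static void main(String[] args) {
        PartitionHoldSketch partition = new PartitionHoldSketch();
        System.out.println("job reserved:          " + partition.reserve());
        System.out.println("evict while job runs:  " + partition.tryEvict());
        partition.release();
        System.out.println("evict after job ends:  " + partition.tryEvict());
        System.out.println("reserve after evict:   " + partition.reserve());
    }
}
```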

-Val




Re: Task failover in ignite

Posted by Prasad Bhalerao <pr...@gmail.com>.

Hi,

The Ignite docs say "at least once guarantee".

If I submit the task using just "ignite.compute().affinityRun()", then
Ignite will try to execute this task on the backup node if the primary node
goes down.
Am I correct?

What I want is that if the primary data node which is executing the task goes
down, the task should be executed on its backup node.

Does Ignite immediately start rebalancing when a node goes down?

I am trying to understand how Ignite re-executes an affinity task on the
backup or new primary node when the primary goes down.

Does Ignite wait for rebalancing to complete and then resubmit the
affinity task to the new primary node?

Or does Ignite resubmit the task to the backup node, wait for the task to
complete, and then do the rebalancing?

In case of node failure, does the backup node become the new primary for its
backup partitions, or is that decided after the partition re-exchange process?

How does it decide which node will become the new primary for backup
partitions so that minimal data exchange happens?

Thanks,
Prasad

On Sat, Sep 29, 2018 at 8:29 AM Prasad Bhalerao <
prasadbhalerao1983@gmail.com> wrote:

> Hi,
>
> The Ignite docs say "at least once guarantee".
>
> If I submit the task using just "ignite.compute().withNoFailover().affinityRun()",
> then Ignite will try to execute this task on the backup node if the primary
> node goes down.
>
> Does Ignite immediately start rebalancing when a node goes down?
>
> I am trying to understand how Ignite re-executes an affinity task on the
> backup or new primary node when the primary goes down.
>
> Does Ignite wait for rebalancing to complete and then resubmit the
> affinity task to the new primary node?
>
> Or does Ignite resubmit the task to the backup node, wait for the task to
> complete, and then do the rebalancing?
>
> In case of node failure, does the backup node become the new primary for its
> backup partitions, or is that decided after the partition re-exchange process?
>
> How does it decide which node will become the new primary for backup
> partitions so that minimal data exchange happens?
>
> Thanks,
> Prasad
>
> On Sat, Sep 29, 2018, 2:31 AM vkulichenko <va...@gmail.com>
> wrote:
>
>> Prasad,
>>
>> Since you're using withNoFailover(), failover will never happen and the task
>> will just fail with an exception on the client side if the primary node dies.
>> It's up to your code to retry in this case.
>>
>> When you retry, the task will be mapped to the new primary, which is the
>> former backup and therefore has all the data. No need to wait for rebalancing.
>>
>> In general, affinityRun/Call guarantees that all data is available locally
>> during task execution. If that's not possible for any reason, an exception
>> is thrown.
>>
>> -Val
>>
>>
>>
>>
>>
>>
>

Re: Task failover in ignite

Posted by Prasad Bhalerao <pr...@gmail.com>.

Hi,

The Ignite docs say "at least once guarantee".

If I submit the task using just
"ignite.compute().withNoFailover().affinityRun()",
then Ignite will try to execute this task on the backup node if the primary
node goes down.

Does Ignite immediately start rebalancing when a node goes down?

I am trying to understand how Ignite re-executes an affinity task on the
backup or new primary node when the primary goes down.

Does Ignite wait for rebalancing to complete and then resubmit the
affinity task to the new primary node?

Or does Ignite resubmit the task to the backup node, wait for the task to
complete, and then do the rebalancing?

In case of node failure, does the backup node become the new primary for its
backup partitions, or is that decided after the partition re-exchange process?

How does it decide which node will become the new primary for backup
partitions so that minimal data exchange happens?

Thanks,
Prasad

On Sat, Sep 29, 2018, 2:31 AM vkulichenko <va...@gmail.com>
wrote:

> Prasad,
>
> Since you're using withNoFailover(), failover will never happen and the task
> will just fail with an exception on the client side if the primary node dies.
> It's up to your code to retry in this case.
>
> When you retry, the task will be mapped to the new primary, which is the
> former backup and therefore has all the data. No need to wait for rebalancing.
>
> In general, affinityRun/Call guarantees that all data is available locally
> during task execution. If that's not possible for any reason, an exception
> is thrown.
>
> -Val
>
>
>
>
>
>

Re: Task failover in ignite

Posted by vkulichenko <va...@gmail.com>.

Prasad,

Since you're using withNoFailover(), failover will never happen and the task
will just fail with an exception on the client side if the primary node dies.
It's up to your code to retry in this case.

When you retry, the task will be mapped to the new primary, which is the
former backup and therefore has all the data. No need to wait for rebalancing.

In general, affinityRun/Call guarantees that all data is available locally
during task execution. If that's not possible for any reason, an exception
is thrown.

-Val




