Posted to user@ignite.apache.org by Ryan Ripken <ry...@rmanet.com> on 2017/05/16 20:04:47 UTC

Is there a way to allow nodes to join an already started task?

In the GridGain days I was able to add nodes to an already started 
task.   Is there a way to do that in Ignite?  I occasionally have nodes 
disconnect or crash in a custom native library.  I'd like to be able to 
restart those failed nodes or add additional nodes if the compute isn't 
progressing as quickly as initially hoped.

If it's not possible to add nodes to an already started task, are there 
patterns or tricks that can be used to accomplish something similar?

It seems like one trick might be to take the original task (100 jobs) 
and split it into more tasks (10) with fewer jobs (10) per task.  
The new tasks aren't all started at once but are staggered over time 
so that if additional nodes join (after task 1 has already started) they 
can contribute to the later tasks.
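
A rough sketch of that batching idea in plain Java, so it can be tried without 
a cluster. The class StaggeredBatches and the submitBatch hook are hypothetical 
names used only for illustration. With Ignite, the hook could presumably wrap 
ignite.compute().call(batch), on the assumption that each call is mapped onto 
whichever nodes are in the cluster when that batch is submitted.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

/** Splits one large list of jobs into smaller batches that are submitted one after another. */
public class StaggeredBatches {

    /** Splits items into consecutive batches of at most batchSize elements. */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0)
            throw new IllegalArgumentException("batchSize must be positive");

        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize)
            batches.add(new ArrayList<>(items.subList(i, Math.min(i + batchSize, items.size()))));
        return batches;
    }

    /**
     * Submits each batch as its own task and waits for it to finish before
     * submitting the next one. submitBatch is a hypothetical hook standing in
     * for a real grid submission.
     */
    public static <J, R> List<R> runInBatches(List<J> jobs, int batchSize,
                                              Function<List<J>, Collection<R>> submitBatch) {
        List<R> results = new ArrayList<>();
        for (List<J> batch : partition(jobs, batchSize))
            results.addAll(submitBatch.apply(batch));
        return results;
    }

    public static void main(String[] args) {
        List<Integer> jobs = new ArrayList<>();
        for (int i = 0; i < 100; i++)
            jobs.add(i);

        // Stand-in for a real grid submission: squares each input locally.
        List<Integer> out = runInBatches(jobs, 10, batch -> {
            List<Integer> r = new ArrayList<>();
            for (int j : batch)
                r.add(j * j);
            return r;
        });

        System.out.println(out.size() + " results collected");
    }
}
```

Because each batch is a separate submission, a node that joins between batches 
has a chance to pick up work from the later ones. The cost is that every batch 
waits for its slowest job before the next batch starts.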

It's a crude solution but it seems like it would have to work. Before I 
refactor my tasks and jobs to try it out I'm wondering if someone has a 
better suggestion.  Is there a better way to accomplish something similar?

Thanks!


Re: Is there a way to allow nodes to join an already started task?

Posted by Andrey Mashenkov <an...@gmail.com>.
Correct, this is not supported yet.

On Mon, May 22, 2017 at 10:08 PM, Ryan Ripken <ry...@rmanet.com> wrote:

> Thanks for the feedback.
> I've read the feature docs and the bug report.  In summary: adding nodes
> doesn't work.


-- 
Best regards,
Andrey V. Mashenkov

Re: Is there a way to allow nodes to join an already started task?

Posted by Ryan Ripken <ry...@rmanet.com>.
Thanks for the feedback.
I've read the feature docs and the bug report.  In summary: adding nodes 
doesn't work.



On 5/22/2017 4:43 AM, Andrey Mashenkov wrote:
> Hi Ryan,
>
> Ignite has JobStealingCollisionSpi and JobStealingFailoverSpi for it. 
> See CollisionSpi [1] and FailoverSpi [2] documentation.
> However, there is an unresolved issue [3] that doesn't allow tasks to 
> be stolen by newly joined node.
>
> [1] https://apacheignite.readme.io/docs/job-scheduling
> [2] https://apacheignite.readme.io/docs/fault-tolerance
> [3] https://issues.apache.org/jira/browse/IGNITE-1267



Re: Is there a way to allow nodes to join an already started task?

Posted by Andrey Mashenkov <an...@gmail.com>.
Hi Ryan,

Ignite provides JobStealingCollisionSpi and JobStealingFailoverSpi for this.
See the CollisionSpi [1] and FailoverSpi [2] documentation.
However, there is an unresolved issue [3] that prevents tasks from being
stolen by a newly joined node.

[1] https://apacheignite.readme.io/docs/job-scheduling
[2] https://apacheignite.readme.io/docs/fault-tolerance
[3] https://issues.apache.org/jira/browse/IGNITE-1267
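
For reference, a minimal sketch of enabling both SPIs through Java
configuration. The class JobStealingConfig is a hypothetical name, and the
threshold values are illustrative assumptions rather than recommendations.
Given issue [3], this should not be expected to move work onto nodes that join
after the task has started.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.collision.jobstealing.JobStealingCollisionSpi;
import org.apache.ignite.spi.failover.jobstealing.JobStealingFailoverSpi;

public class JobStealingConfig {
    public static void main(String[] args) {
        JobStealingCollisionSpi collisionSpi = new JobStealingCollisionSpi();

        // Number of jobs allowed to run in parallel on this node (assumed value).
        collisionSpi.setActiveJobsThreshold(4);

        // Waiting-queue size at or below which this node tries to steal jobs (assumed value).
        collisionSpi.setWaitJobsThreshold(0);

        // Upper bound on how many times a single job may be stolen (assumed value).
        collisionSpi.setMaximumStealingAttempts(5);

        // Allow other nodes to steal jobs from this node.
        collisionSpi.setStealingEnabled(true);

        // The job stealing failover SPI is meant to be used together with the collision SPI above.
        JobStealingFailoverSpi failoverSpi = new JobStealingFailoverSpi();
        failoverSpi.setMaximumFailoverAttempts(5);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCollisionSpi(collisionSpi);
        cfg.setFailoverSpi(failoverSpi);

        // Start a node with this configuration; it stops when the block exits.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Started node: " + ignite.name());
        }
    }
}
```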



-- 
Best regards,
Andrey V. Mashenkov