Posted to user@gearpump.apache.org by Hyunseok Chang <hy...@gmail.com> on 2016/09/12 21:25:26 UTC

Regarding dynamic DAG

Hi,

I'd like to know more about the dynamic DAG feature.

Let's say I have a DAG of:  source -> A -> B -> C.  I want to replace node
"B" with a new node "X" in this chain.

How does node replacement happen internally?

Each processor consists of multiple parallel tasks, so node replacement
should involve killing multiple concurrent tasks for B, and somehow
introducing new tasks for X without affecting predecessor/successor tasks.
I'd like to know how this is done internally.

Also, can I change the parallelism (# of tasks) or type of partitioning
(hash <-> shuffle) of each processor dynamically at run time?

Thanks,
-hs

Re: Regarding dynamic DAG

Posted by Manu Zhang <ow...@gmail.com>.
Hi,

Internally, each Task has a lifetime and a subscription list (its successor
tasks), both of which can be changed.
In your example, all "B" tasks will update their lifetimes and stop sending
messages once those lifetimes end. New "X" tasks will be launched and start
sending messages. The predecessor tasks ("A") will have their subscription
lists updated to point to "X", so they start sending messages to the "X"
tasks. The successor tasks ("C") will then receive messages from the "X"
tasks automatically.
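
To make the lifetime and subscription-list idea above concrete, here is a
minimal toy sketch in Python. It is not Gearpump's API or implementation:
the Task class, its birth/death fields, and the replace() helper are all
hypothetical names invented for illustration, and timestamps are plain
numbers. It only models the behaviour described above: the old task stops
sending once its lifetime ends, the new task is launched, and predecessors
are re-pointed at it.

```python
from dataclasses import dataclass, field

FOREVER = float("inf")  # hypothetical marker for "no end of lifetime"


@dataclass
class Task:
    """Toy task: a lifetime [birth, death) plus a subscription list."""
    name: str
    birth: float = 0
    death: float = FOREVER
    subscribers: list = field(default_factory=list)  # successor tasks
    received: list = field(default_factory=list)     # (ts, sender, payload)

    def alive(self, ts):
        return self.birth <= ts < self.death

    def emit(self, ts, payload):
        # A task stops sending once the timestamp is past its lifetime.
        if not self.alive(ts):
            return
        for sub in self.subscribers:
            sub.receive(ts, payload, self.name)

    def receive(self, ts, payload, sender):
        if not self.alive(ts):
            return
        self.received.append((ts, sender, payload))
        self.emit(ts, payload)  # pass-through processing, for simplicity


def replace(old, new_name, at, predecessors):
    """Hypothetical helper: end `old` at time `at` and splice in a new task."""
    old.death = at                                   # old task's lifetime ends
    new = Task(new_name, birth=at, subscribers=list(old.subscribers))
    for p in predecessors:                           # re-point subscriptions
        p.subscribers = [new if s is old else s for s in p.subscribers]
    return new


# A -> B -> C, then replace B with X at time 10.
c = Task("C")
b = Task("B", subscribers=[c])
a = Task("A", subscribers=[b])

a.receive(1, "m1", "source")       # flows A -> B -> C
x = replace(old=b, new_name="X", at=10, predecessors=[a])
a.receive(10, "m2", "source")      # flows A -> X -> C
b.emit(11, "late")                 # dropped: B's lifetime has ended

print([(ts, sender) for ts, sender, _ in c.received])
```

In this toy model "C" never has to be touched: it simply starts seeing "X"
as the sender once "A" has been re-pointed, which mirrors the point that
successor tasks pick up the new tasks' messages automatically.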

You can change the parallelism at runtime but not the partitioning.

Thanks,
Manu Zhang