Posted to dev@nemo.apache.org by GitBox <gi...@apache.org> on 2018/03/08 04:41:21 UTC

[GitHub] sanha commented on issue #2: [NEMO-7] Intra-TaskGroup pipelining

sanha commented on issue #2: [NEMO-7] Intra-TaskGroup pipelining
URL: https://github.com/apache/incubator-nemo/pull/2#issuecomment-371376883
 
 
   Regarding @johnyangk's comment (avoiding hashing and object creation), I'd suggest the following model.
   - Build a DAG of `TaskWrapper`s (or something similar) from the `taskGroupDag` of the `ScheduledTaskGroup` when a `TaskGroup` is scheduled.
     - This DAG should manage the connections among vertices as pointers (references) rather than through `Map` and `List`, unlike our current `DAG` implementation.
     - Each `TaskWrapper` should have a `Callable` that consumes an input element and produces output.
       - This `Callable` can be built from the `Transform` of the `Task` that the wrapper wraps.
     - The `TaskWrapper` can also hold anything else that is currently stored in `TaskDataHandler`.
   - After this, each element of the input data `Iterable` can be processed through this DAG of `TaskWrapper`s without calculating any hash or creating any extra object.
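
   A minimal sketch of what this model might look like, not an actual implementation. Everything in it is hypothetical: the `TaskWrapper` fields, `addChild`, `process`, and `runDemo` are made-up names, and a plain `java.util.function.Function` stands in for the per-element logic that would be built from a `Transform` (the JDK's `Callable` takes no argument, so it cannot receive the element directly). The point is only that each wrapper holds direct references to its children, so pushing an element through the DAG needs no map lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class TaskWrapperSketch {

  /** Hypothetical wrapper around one Task; children are held as direct references. */
  static final class TaskWrapper {
    // Stand-in for the logic that would be built from the Task's Transform (assumption).
    private final Function<Object, Object> logic;
    // Direct references to downstream wrappers, fixed once the DAG is built.
    private TaskWrapper[] children = new TaskWrapper[0];
    // Collects results when this wrapper is a leaf (illustration only).
    private final List<Object> output = new ArrayList<>();

    TaskWrapper(final Function<Object, Object> logic) {
      this.logic = logic;
    }

    /** Called only while the DAG is being built, i.e. when the TaskGroup is scheduled. */
    void addChild(final TaskWrapper child) {
      children = Arrays.copyOf(children, children.length + 1);
      children[children.length - 1] = child;
    }

    /** Pushes one element through this wrapper and on to its children by reference. */
    void process(final Object element) {
      final Object result = logic.apply(element);
      if (children.length == 0) {
        output.add(result);
        return;
      }
      for (final TaskWrapper child : children) {
        child.process(result);
      }
    }

    List<Object> getOutput() {
      return output;
    }
  }

  /** Builds a two-vertex DAG (add one, then double) and feeds every input element through it. */
  static List<Object> runDemo(final Iterable<?> input) {
    final TaskWrapper addOne = new TaskWrapper(x -> (Integer) x + 1);
    final TaskWrapper doubleIt = new TaskWrapper(x -> (Integer) x * 2);
    addOne.addChild(doubleIt);
    for (final Object element : input) {
      addOne.process(element);
    }
    return doubleIt.getOutput();
  }

  public static void main(final String[] args) {
    System.out.println(runDemo(Arrays.asList(1, 2, 3))); // prints [4, 6, 8]
  }
}
```

   In this sketch the wiring cost is paid once in `addChild`; the per-element path in `process` only follows array references.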
