Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2015/05/05 06:30:07 UTC

[jira] [Updated] (TEZ-2411) Offload DataMovement event creation from the AM to the tasks

     [ https://issues.apache.org/jira/browse/TEZ-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated TEZ-2411:
----------------------------
    Description: Today the AM creates a new DataMovement event from the original event sent by the producer task and supplements the new event with source/target indices for the consumer task. This event creation can be offloaded to the task runtime, saving the AM the CPU cycles spent on object creation. Additionally, the original event can be kept in serialized form inside the AM and sent as-is to the task over RPC, potentially saving the serialization/deserialization (serde) CPU for these events on top of the object creation CPU. This helps when a job has a high concurrency of running tasks, for example 10000 tasks running in parallel and sending events to the AM.  (was: Today the AM creates a new DataMovement event from the original event sent by the producer task and supplements the new event with source/target indices for the consumer task. This new event creation can be offloaded to the task runtime and thus save CPU cycles on the AM for the object creation. Secondly, the original event can be kept in serialized form inside the AM and sent as is to the task over the RPC, thus potentially saving serde CPU for these events in addition to the object creation CPU.)

> Offload DataMovement event creation from the AM to the tasks
> ------------------------------------------------------------
>
>                 Key: TEZ-2411
>                 URL: https://issues.apache.org/jira/browse/TEZ-2411
>             Project: Apache Tez
>          Issue Type: Task
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>
> Today the AM creates a new DataMovement event from the original event sent by the producer task and supplements the new event with source/target indices for the consumer task. This event creation can be offloaded to the task runtime, saving the AM the CPU cycles spent on object creation. Additionally, the original event can be kept in serialized form inside the AM and sent as-is to the task over RPC, potentially saving the serialization/deserialization (serde) CPU for these events on top of the object creation CPU. This helps when a job has a high concurrency of running tasks, for example 10000 tasks running in parallel and sending events to the AM.
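
The idea in the description can be illustrated with a minimal sketch. It is not Tez code: the class and function names (AppMaster, RoutedEnvelope, build_event_in_task), the JSON payload format, and the host/port fields are all assumptions made up for illustration. The AM here stores the producer's payload as opaque bytes and only attaches source/target indices; the consumer task decodes the payload and builds the DataMovement event itself.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataMovementEvent:
    """Hypothetical consumer-side event (not the real Tez class)."""
    source_index: int
    target_index: int
    payload: dict


@dataclass(frozen=True)
class RoutedEnvelope:
    """Hypothetical RPC message: opaque payload bytes plus routing indices."""
    source_index: int
    target_index: int
    payload_bytes: bytes


class AppMaster:
    """Keeps producer events in serialized form and never decodes them."""

    def __init__(self) -> None:
        self._stored: List[Tuple[int, bytes]] = []

    def on_producer_event(self, source_index: int, payload_bytes: bytes) -> None:
        # Store the bytes exactly as received; no deserialization on the AM.
        self._stored.append((source_index, payload_bytes))

    def envelopes_for(self, target_index: int) -> List[RoutedEnvelope]:
        # Attach routing indices only; the payload bytes are forwarded as-is.
        return [
            RoutedEnvelope(source_index, target_index, raw)
            for source_index, raw in self._stored
        ]


def build_event_in_task(envelope: RoutedEnvelope) -> DataMovementEvent:
    """Runs in the consumer task: decode the payload and create the event."""
    payload = json.loads(envelope.payload_bytes.decode("utf-8"))
    return DataMovementEvent(envelope.source_index, envelope.target_index, payload)


if __name__ == "__main__":
    am = AppMaster()
    # Producer task 3 serializes its event once and sends the bytes to the AM.
    raw = json.dumps({"host": "node1", "port": 13562}).encode("utf-8")
    am.on_producer_event(3, raw)
    # Consumer task 7 receives the envelope and builds the event locally.
    for envelope in am.envelopes_for(7):
        print(build_event_in_task(envelope))
```

In this sketch the decode and event-construction cost moves from the single AM to the many consumer tasks, which is where the description expects the savings to matter under high task concurrency.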



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)