Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2013/08/07 07:33:51 UTC

[jira] [Created] (TEZ-350) Allow multiple vertices to be run within a task if #tasks in each stage is 1

Siddharth Seth created TEZ-350:
----------------------------------

             Summary: Allow multiple vertices to be run within a task if #tasks in each stage is 1
                 Key: TEZ-350
                 URL: https://issues.apache.org/jira/browse/TEZ-350
             Project: Apache Tez
          Issue Type: Improvement
            Reporter: Siddharth Seth


Depending on the nature of the data being processed, a DAG with multiple dependent vertices can end up running a single task for each of the last few vertices. This information may only be available at execution time.
In such cases, Tez should be able to run these tasks within the same process, without having to shuffle data between them. As a follow-up step, we could also avoid serialization / deserialization of the data by using custom Inputs / Outputs.
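
To illustrate the detection step only, here is a rough sketch of how the trailing run of single-task vertices might be identified once task counts are known at execution time. It assumes a simple linear chain of vertices listed in topological order; the FusableTail and Stage names are hypothetical and are not part of the Tez API.

```java
import java.util.ArrayList;
import java.util.List;

public class FusableTail {

    // Hypothetical stand-in for a DAG vertex and its task count; not a Tez class.
    static final class Stage {
        final String name;
        final int numTasks;

        Stage(String name, int numTasks) {
            this.name = name;
            this.numTasks = numTasks;
        }
    }

    // Assumes 'stages' is a linear chain in topological order.
    // Returns the names of the trailing vertices that each run exactly one
    // task, or an empty list if fewer than two such vertices exist
    // (a single vertex leaves nothing to fuse).
    static List<String> fusableTail(List<Stage> stages) {
        List<String> tail = new ArrayList<>();
        for (int i = stages.size() - 1; i >= 0; i--) {
            Stage s = stages.get(i);
            if (s.numTasks != 1) {
                break; // first multi-task vertex ends the fusable run
            }
            tail.add(0, s.name); // prepend to keep topological order
        }
        return tail.size() >= 2 ? tail : new ArrayList<>();
    }

    public static void main(String[] args) {
        List<Stage> stages = List.of(
                new Stage("map", 40),
                new Stage("reduce1", 1),
                new Stage("reduce2", 1));
        System.out.println(fusableTail(stages)); // prints [reduce1, reduce2]
    }
}
```

The vertices returned here are the candidates for running in one process; how their Inputs / Outputs would be wired together without a shuffle is left open by this sketch.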
