Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2013/08/09 06:29:49 UTC

[jira] [Assigned] (TEZ-348) Improve how completion events are fetched by the ShuffleHandler

     [ https://issues.apache.org/jira/browse/TEZ-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah reassigned TEZ-348:
-------------------------------

    Assignee: Hitesh Shah
    
> Improve how completion events are fetched by the ShuffleHandler
> ---------------------------------------------------------------
>
>                 Key: TEZ-348
>                 URL: https://issues.apache.org/jira/browse/TEZ-348
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Hitesh Shah
>
> The ShuffleHandler currently has a thread which checks for new completion events every second. This can unnecessarily delay the start of the reduce.
> An async RPC implementation would work well to fix this. That, however, is currently not available in Hadoop.
> Options:
> - Poll at a shorter interval. This can overload the AM when there are a large number of reduce tasks. The poll interval could be set based on the number of tasks.
> - Have the AM push completion events to the Task. The AM ends up creating far too many connections, and the child has to run an RPC server.
> - Rely on an external service like ZK with monitors.
> Thoughts or suggestions on how this can be improved?
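
As a rough illustration of the first option (setting the poll interval based on the number of tasks), the sketch below picks an interval that keeps the combined poll rate of all tasks under a fixed request budget at the AM. The class name, method name, and the request-budget parameter are hypothetical and are not part of Tez or Hadoop; the numbers are placeholders, not measured values.

```java
// Hypothetical sketch, not Tez code: scale the completion-event poll interval
// with the number of polling tasks so the AM sees a bounded request rate.
final class PollIntervalSketch {

    private PollIntervalSketch() {
    }

    /**
     * Returns a poll interval in milliseconds such that numTasks tasks, each
     * polling once per interval, send at most amRequestsPerSecond requests per
     * second in aggregate. The result is clamped to [minMillis, maxMillis].
     * All parameters are assumptions made for illustration.
     */
    static long pollIntervalMillis(int numTasks, int amRequestsPerSecond,
                                   long minMillis, long maxMillis) {
        if (numTasks < 0 || amRequestsPerSecond <= 0 || minMillis > maxMillis) {
            throw new IllegalArgumentException("invalid arguments");
        }
        // Interval needed to stay within the budget, rounded up.
        long needed = (numTasks * 1000L + amRequestsPerSecond - 1) / amRequestsPerSecond;
        // Never poll faster than minMillis or slower than maxMillis.
        return Math.max(minMillis, Math.min(maxMillis, needed));
    }
}
```

With a budget of 1000 requests per second and bounds of 50 ms and 1000 ms, a job with 10 reduce tasks would poll every 50 ms, while a job with 5000 reduce tasks would fall back to the current one-second interval. Clamping at the upper bound means the budget can still be exceeded for very large jobs, which is the overload risk noted in the first option.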

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira