Posted to common-dev@hadoop.apache.org by "Devaraj Das (JIRA)" <ji...@apache.org> on 2007/12/27 10:26:43 UTC

[jira] Assigned: (HADOOP-1719) Improve the utilization of shuffle copier threads

     [ https://issues.apache.org/jira/browse/HADOOP-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das reassigned HADOOP-1719:
-----------------------------------

    Assignee: Amar Kamat  (was: Devaraj Das)

> Improve the utilization of shuffle copier threads
> -------------------------------------------------
>
>                 Key: HADOOP-1719
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1719
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>         Attachments: 1719.1.patch, 1719.patch, HADOOP-1719.patch, HADOOP-1719.patch
>
>
> In the current design, once a batch of copies has been scheduled, the scheduler (the main loop in fetchOutputs) won't schedule anything more until it hears back from at least one of the copier threads. As a result, the main loop doesn't query the TaskTracker for new map locations in the meantime and may not be using all the copiers effectively. This may not be an issue for small map outputs, where such notifications arrive frequently at steady state.
> Ideally, we should schedule everything we can and, depending on how busy we currently are, query the TaskTracker for more map locations.
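
The proposal in the description could be sketched roughly as follows. This is only an illustration of the idea, not the code in any of the attached patches: the class ShuffleSchedulerSketch and all of its methods are hypothetical names, locations are plain strings, and the copier threads and the TaskTracker are simulated by simple counters and a queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch; none of these names come from Hadoop's ReduceTask. */
class ShuffleSchedulerSketch {
    private final int numCopiers;   // size of the copier pool (assumed fixed)
    private int busyCopiers = 0;    // copiers currently fetching a map output
    private final Queue<String> knownLocations = new ArrayDeque<>();

    ShuffleSchedulerSketch(int numCopiers) {
        this.numCopiers = numCopiers;
    }

    /** Record map output locations learned from the (simulated) TaskTracker. */
    void addLocations(List<String> locations) {
        knownLocations.addAll(locations);
    }

    /** Hand out every known location that an idle copier can take right now. */
    List<String> scheduleAllPossible() {
        List<String> scheduled = new ArrayList<>();
        while (busyCopiers < numCopiers && !knownLocations.isEmpty()) {
            scheduled.add(knownLocations.poll());
            busyCopiers++;
        }
        return scheduled;
    }

    /** Query for more locations only when idle copiers outnumber known work. */
    boolean shouldQueryForMoreLocations() {
        int idleCopiers = numCopiers - busyCopiers;
        return idleCopiers > knownLocations.size();
    }

    /** Called when a copier reports back, whether the copy succeeded or not. */
    void copyFinished() {
        if (busyCopiers > 0) {
            busyCopiers--;
        }
    }
}
```

In this sketch the main loop would call scheduleAllPossible() on every iteration instead of waiting for a copier to report back, and would consult shouldQueryForMoreLocations() to decide whether asking for new map locations is worthwhile. The "idle copiers outnumber known work" rule is just one assumed way of measuring how busy the reducer is; other thresholds are equally plausible.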

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.