Posted to common-dev@hadoop.apache.org by "Vivek Ratan (JIRA)" <ji...@apache.org> on 2008/10/21 13:34:44 UTC

[jira] Created: (HADOOP-4474) JT should not iterate through all jobs in every heartbeat to find a cleanup or setup task

JT should not iterate through all jobs in every heartbeat to find a cleanup or setup task
-----------------------------------------------------------------------------------------

                 Key: HADOOP-4474
                 URL: https://issues.apache.org/jira/browse/HADOOP-4474
             Project: Hadoop Core
          Issue Type: Improvement
          Components: mapred
            Reporter: Vivek Ratan


On every heartbeat, the JT first looks to see if it can run a setup or cleanup task, before calling a Scheduler to get a Map or Reduce task. The JT maintains a hashmap of JobInProgress objects (which can be waiting, running, or completed). It iterates through this hashmap on each heartbeat to find a setup or cleanup task. This linear search can be very expensive, especially with large clusters where the number of jobs is high. There are lots of obvious ways to cut down on this linear search.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-4474) JT should not iterate through all jobs in every heartbeat to find a cleanup or setup task

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641354#action_12641354 ] 

Vivek Ratan commented on HADOOP-4474:
-------------------------------------

Since the JT itself manages (and hence, knows about) the creation of setup and cleanup tasks, it can easily maintain a linked list of these tasks. Picking a setup or cleanup task is then O(1). 
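A minimal sketch of that idea, assuming the JT can hook the points where it creates setup and cleanup tasks. The class and method names here (SpecialTaskQueue, add, poll) are hypothetical and are not taken from the Hadoop code base; job IDs are plain strings for brevity.

```java
import java.util.Deque;
import java.util.LinkedList;

// Hypothetical sketch: not Hadoop's actual classes or method names.
class SpecialTaskQueue {
    // IDs of jobs that currently have a setup or cleanup task waiting to run.
    private final Deque<String> pending = new LinkedList<>();

    // Called by the JT at the moment it creates a setup or cleanup task.
    synchronized void add(String jobId) {
        pending.addLast(jobId);
    }

    // Called when a TT asks for work. Removing the head of the list is O(1),
    // independent of how many jobs the JT is tracking.
    // Returns null when no setup or cleanup task is waiting.
    synchronized String poll() {
        return pending.pollFirst();
    }

    synchronized int size() {
        return pending.size();
    }
}
```

With this in place the JT would consult the queue first and fall through to the Scheduler only when poll() returns null, instead of scanning every JobInProgress.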


[jira] Commented: (HADOOP-4474) JT should not iterate through all jobs in every heartbeat to find a cleanup or setup task

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641709#action_12641709 ] 

Vivek Ratan commented on HADOOP-4474:
-------------------------------------

bq. But note that the iteration is done only when a TT asks for a task and not per heartbeat.

Fair enough. But using a linked-list-like structure to track these special tasks seems simple enough. It also gives you a big benefit: you can maintain, and hence schedule, setup/cleanup tasks in a particular order. You'd ideally like to run setup tasks of jobs in the order in which they were submitted. Iteration order over a hashmap is effectively arbitrary. Using a separate sorted structure for setup/cleanup tasks can give you the ordering you want, which I think is quite valuable. There shouldn't be too many of these tasks (one of each per job), so the memory footprint is relatively low.
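One way the sorted structure could look, as a sketch only: pending setup tasks are kept in a TreeSet ordered by job submission time, with the job ID as a tie-breaker. PendingSetup, OrderedSetupQueue, and the submitTime field are hypothetical names for illustration, not part of Hadoop. Note that a TreeSet makes insertion and removal O(log n) in the number of pending tasks rather than O(1), which is the price of keeping them ordered.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical record of a job whose setup task has not run yet.
class PendingSetup {
    final String jobId;
    final long submitTime; // assumed: job submission time in milliseconds

    PendingSetup(String jobId, long submitTime) {
        this.jobId = jobId;
        this.submitTime = submitTime;
    }
}

// Hypothetical sketch: keeps pending setup tasks in submission order.
class OrderedSetupQueue {
    // Earliest submission first; job ID breaks ties so that two jobs with
    // the same submit time are both retained by the set.
    private final TreeSet<PendingSetup> pending = new TreeSet<>(
        Comparator.comparingLong((PendingSetup p) -> p.submitTime)
                  .thenComparing(p -> p.jobId));

    synchronized void add(PendingSetup p) {
        pending.add(p);
    }

    // Returns the setup task of the earliest-submitted job, or null if none.
    synchronized PendingSetup pollEarliest() {
        return pending.pollFirst();
    }
}
```

Since there is at most one setup and one cleanup task per job, the set stays small relative to the number of map and reduce tasks the JT already tracks.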


[jira] Commented: (HADOOP-4474) JT should not iterate through all jobs in every heartbeat to find a cleanup or setup task

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641634#action_12641634 ] 

dhruba borthakur commented on HADOOP-4474:
------------------------------------------

Is there any scalability to be gained if the heartbeat processing occurs under a read-lock? If it can be so arranged (especially for heartbeats that do not ask for a new task), then multiple threads in the JT can respond to heartbeats at the same time.
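A rough sketch of what such an arrangement might look like, using java.util.concurrent's ReentrantReadWriteLock. Everything here is an assumption made for illustration: HeartbeatHandler, its fields, and the acceptNewTasks flag are hypothetical, and the real JT state touched by a heartbeat is far larger. The sketch also assumes that whatever a status-only heartbeat updates is itself thread-safe (a ConcurrentHashMap here), since several such heartbeats would hold the read lock at once.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: not the JobTracker's actual locking scheme.
class HeartbeatHandler {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Thread-safe on its own, because status-only heartbeats update it
    // concurrently while holding only the read lock.
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    // Not thread-safe; only touched while holding the write lock.
    private final Queue<String> runnableTasks = new ArrayDeque<>();

    void addTask(String taskId) {
        lock.writeLock().lock();
        try {
            runnableTasks.add(taskId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Returns the ID of a task assigned to the tracker, or null if none.
    String heartbeat(String trackerName, long now, boolean acceptNewTasks) {
        if (!acceptNewTasks) {
            // Status-only heartbeat: many of these can proceed in parallel.
            lock.readLock().lock();
            try {
                lastSeen.put(trackerName, now);
                return null;
            } finally {
                lock.readLock().unlock();
            }
        }
        // Task assignment mutates shared scheduling state: exclusive access.
        lock.writeLock().lock();
        try {
            lastSeen.put(trackerName, now);
            return runnableTasks.poll();
        } finally {
            lock.writeLock().unlock();
        }
    }

    Long lastSeen(String trackerName) {
        return lastSeen.get(trackerName);
    }
}
```

Whether this helps in practice depends on what fraction of heartbeats ask for a task and on how much JT state a status-only heartbeat really needs to modify.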


[jira] Commented: (HADOOP-4474) JT should not iterate through all jobs in every heartbeat to find a cleanup or setup task

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641430#action_12641430 ] 

Devaraj Das commented on HADOOP-4474:
-------------------------------------

Yes, a valid point - O(#jobs) is certainly worse than O(1). But note that the iteration is done only when a TT asks for a task and not per heartbeat.
