You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Karam Singh (JIRA)" <ji...@apache.org> on 2008/10/17 15:03:44 UTC

[jira] Created: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
------------------------------------------------------------------------------------------------------

                 Key: HADOOP-4444
                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
             Project: Hadoop Core
          Issue Type: Bug
          Components: contrib/capacity-sched
         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
            Reporter: Karam Singh


Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan reassigned HADOOP-4444:
----------------------------------------------

    Assignee: Sreekanth Ramakrishnan

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642412#action_12642412 ] 

Sreekanth Ramakrishnan commented on HADOOP-4444:
------------------------------------------------

Attaching the ant test-patch output.

{noformat}
     [exec] 
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec] 
{noformat}

The patch does not require a test case because it is just a change to update the scheduling information to be correctly displayed on the UI.

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4444.patch
>
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Karam Singh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640514#action_12640514 ] 

karams edited comment on HADOOP-4444 at 10/21/08 7:18 AM:
---------------------------------------------------------------

Submitted a job with map and reduce tasks = 212.
When all maps and reduces tasks of job starts running, JobTracker UI displayed following information about "Cluster Summary" and "Scheduling Information" -:

Maps=212, Reduces=212, Total Submissions=2, Nodes =106, Map Task Capacity=212 Reduce Task Capacity	=212, Avg. Tasks/Node=4.00

Scheduling Information shows -:
Queue Name=default 
Guaranteed Capacity Maps (%) :100.0
Guaranteed Capacity Reduces (%) :100.0
Current Capacity Maps : 212 
Current Capacity Reduces : 212 
User Limit : 25 
Reclaim Time limit : 300 
Number of Running Maps : 211 
Number of Running Reduces : 211 
Number of Waiting Maps : 1 
Number of Waiting Reduces : 1 
Priority Supported : YES


Job Details page displays -:
JobID: job_200810171104_0003 on <jobtracker_hostname>
Kind =map, Num Tasks=212, Pending=0, Running=0, Complete=0, Killed=0, Failed/Killed Task Attempts=0/0
Kind=reduce	Num Task=212, Pending=0, Running=0, Complete=0, Killed=0, Failed/Killed Task Attempts=0/0.

This also shows that actually 212 MR tasks are running while queue info displays 211 MR tasks running and 1 in waiting.
Note -: if there are ore then one jobs running queue information display one task than then actual task running 

      was (Author: karams):
    Submitted a job with map and reduce tasks = 212.
When all maps and reduces tasks of job starts running, JobTracker UI displayed following information about "Cluster Summary" and "Scheduling Information" -:

Maps=212, Reduces=212, Total Submissions=2, Nodes =106, Map Task Capacity=212 Reduce Task Capacity	=212, Avg. Tasks/Node=4.00

Scheduling Information shows -:
Queue Name=default 
Guaranteed Capacity Maps (%) :100.0
Guaranteed Capacity Reduces (%) :100.0
Current Capacity Maps : 212 
Current Capacity Reduces : 212 
User Limit : 25 
Reclaim Time limit : 300 
Number of Running Maps : 211 
Number of Running Reduces : 211 
Number of Waiting Maps : 1 
Number of Waiting Reduces : 1 
Priority Supported : YES


Job Details page displays -:
JobID: job_200810171104_0003 on gs301308
Kind =map, Num Tasks=212, Pending=0, Running=0, Complete=0, Killed=0, Failed/Killed Task Attempts=0/0
Kind=reduce	Num Task=212, Pending=0, Running=0, Complete=0, Killed=0, Failed/Killed Task Attempts=0/0.

This also shows that actually 212 MR tasks are running while queue info displays 211 MR tasks running and 1 in waiting.
Note -: if there are ore then one jobs running queue information display one task than then actual task running 
  
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4444:
-------------------------------------------

    Attachment: HADOOP-4444.patch

Attaching patch file which should fix this issue. This issue was happening because not every queue scheduling information was getting updated in each heart beat, either MAP or REDUCE scheduler information was updated per heart beat. 

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4444.patch
>
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643945#action_12643945 ] 

Vinod K V commented on HADOOP-4444:
-----------------------------------

All these issues seem to be bugs(but not, as established by now) because of two different sources - scheduler's own structures and the cluster-summary. Even after doing what is done in the latest patch, we still will find the discrepancies because JIP.runningTasks is updated immediately after giving away new tasks but not when it is actually running (i.e when removed from expireLaunchingTasks list).

After discussing this with Hemanth, realized that at the place where we actually use these counts (in the scheduler) we do *have* the correct values. We are anyway going to remove these counts from the scheduler UI (HADOOP-4445) as they are already displayed in cluster-summary. Once we fix HADOOP-4445, these issues will no longer be visible.

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4444-1.patch, HADOOP-4444.patch
>
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V resolved HADOOP-4444.
-------------------------------

    Resolution: Duplicate

Closing it as duplicate of HADOOP-4445.

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4444-1.patch, HADOOP-4444.patch
>
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643808#action_12643808 ] 

Vinod K V commented on HADOOP-4444:
-----------------------------------

bq. [...] not every queue scheduling information was getting updated in each heart beat, either MAP or REDUCE scheduler information was updated per heart beat.
As for this, I also think we need to fix it. May or may not be in this patch.

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4444.patch
>
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Karam Singh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karam Singh updated HADOOP-4444:
--------------------------------

    Affects Version/s: 0.19.0

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Karam Singh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640514#action_12640514 ] 

Karam Singh commented on HADOOP-4444:
-------------------------------------

Submitted a job with map and reduce tasks = 212.
When all maps and reduces tasks of job starts running, JobTracker UI displayed following information about "Cluster Summary" and "Scheduling Information" -:

Maps=212, Reduces=212, Total Submissions=2, Nodes =106, Map Task Capacity=212 Reduce Task Capacity	=212, Avg. Tasks/Node=4.00

Scheduling Information shows -:
Queue Name=default 
Guaranteed Capacity Maps (%) :100.0
Guaranteed Capacity Reduces (%) :100.0
Current Capacity Maps : 212 
Current Capacity Reduces : 212 
User Limit : 25 
Reclaim Time limit : 300 
Number of Running Maps : 211 
Number of Running Reduces : 211 
Number of Waiting Maps : 1 
Number of Waiting Reduces : 1 
Priority Supported : YES


Job Details page displays -:
JobID: job_200810171104_0003 on gs301308
Kind =map, Num Tasks=212, Pending=0, Running=0, Complete=0, Killed=0, Failed/Killed Task Attempts=0/0
Kind=reduce	Num Task=212, Pending=0, Running=0, Complete=0, Killed=0, Failed/Killed Task Attempts=0/0.

This also shows that actually 212 MR tasks are running while queue info displays 211 MR tasks running and 1 in waiting.
Note -: if there are ore then one jobs running queue information display one task than then actual task running 

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sreekanth Ramakrishnan updated HADOOP-4444:
-------------------------------------------

    Attachment: HADOOP-4444-1.patch

Attaching patch fixing the issue. Updating the scheduling information after the scheduler has obtained tasks to run on TaskTrackers.

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4444-1.patch, HADOOP-4444.patch
>
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643807#action_12643807 ] 

Vinod K V commented on HADOOP-4444:
-----------------------------------

Found out the actual reason for this behavior. Scheduler obtains running and pending count from JobInProgress. It does the following when giving out a new task - 
 1) Update queue information. This includes updating running count from all jobs.
 2) Give a new Task. This includes incrementing running count for this job.

The incremented running count for a job in the last step is not reflected in the scheduler information till the TT reaches scheduler again i.e. when scheduler goes out for giving another new task to this TT. And when a TT hits the situation when it no longer needs new tasks, running count incremented in the last assignment doesn't get reflected in the scheduler information as the TT never reaches scheduler.

A simple fix would be to updated queue information also after task assignment.

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4444.patch
>
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4444) Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644646#action_12644646 ] 

Vivek Ratan commented on HADOOP-4444:
-------------------------------------

Please see HADOOP-4445 for the explanation of why all this is really happening. This Jira still remains a clone of HADOOP-4445. 

> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4444
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4444
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: HADOOP-4444-1.patch, HADOOP-4444.patch
>
>
> Scheduling Information for queues shows one map/reduce task less then actual running map/reduce tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.