Posted to mapreduce-issues@hadoop.apache.org by "Sherry Chen (JIRA)" <ji...@apache.org> on 2011/07/26 18:23:09 UTC

[jira] [Created] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
-----------------------------------------------------------------------------------------------------------------------------------------

                 Key: MAPREDUCE-2729
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
         Environment: 0.20.1xx-Secondary
            Reporter: Sherry Chen
            Assignee: Sherry Chen


In the capacity scheduler, the number of users in a queue needing slots is calculated based on whether the users' jobs have any pending tasks.
This works fine for map tasks. However, for reduce tasks, a job does not need reduce slots until its minimum number of map tasks has completed.

Here, we add a check for whether a job's reduces are ready to be scheduled (i.e. whether the job has completed enough map tasks) when we increment the number of users in a queue needing reduce slots.
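
The change described above can be illustrated with a minimal, self-contained sketch. It is not the code from the attached patch: the Job class, its fields, and the usersNeedingReduceSlots method are hypothetical names invented for this illustration, and the readiness test (finished maps compared against a fraction of total maps) is an assumed simplification of how the scheduler decides that reduces may start.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReduceUserCount {
    /** Hypothetical, simplified stand-in for a job; not a Hadoop class. */
    static final class Job {
        final String user;
        final int totalMaps;
        final int finishedMaps;
        final int pendingReduces;
        // Assumed: fraction of maps that must complete before reduces may start.
        final double slowstartFraction;

        Job(String user, int totalMaps, int finishedMaps,
            int pendingReduces, double slowstartFraction) {
            this.user = user;
            this.totalMaps = totalMaps;
            this.finishedMaps = finishedMaps;
            this.pendingReduces = pendingReduces;
            this.slowstartFraction = slowstartFraction;
        }

        /** True once enough maps have finished for reduces to be schedulable. */
        boolean reducesReady() {
            int required = (int) Math.ceil(slowstartFraction * totalMaps);
            return finishedMaps >= required;
        }
    }

    /**
     * Counts distinct users with pending reduces. With checkReady == false this
     * models the old behaviour (any pending reduce counts); with true it models
     * the behaviour proposed in this issue.
     */
    static int usersNeedingReduceSlots(List<Job> jobs, boolean checkReady) {
        Set<String> users = new HashSet<>();
        for (Job job : jobs) {
            boolean hasPending = job.pendingReduces > 0;
            if (hasPending && (!checkReady || job.reducesReady())) {
                users.add(job.user);
            }
        }
        return users.size();
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job("user1", 100, 10, 20, 0.05)); // 10 of 100 maps done, needs 5
        jobs.add(new Job("user2", 100, 10, 20, 0.50)); // 10 of 100 maps done, needs 50
        System.out.println(usersNeedingReduceSlots(jobs, false)); // prints 2
        System.out.println(usersNeedingReduceSlots(jobs, true));  // prints 1
    }
}
```

In this sketch, user2 has pending reduces but is not counted once the readiness check is enabled, because only 10 of the 50 required maps have finished.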


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071421#comment-13071421 ] 

Sherry Chen commented on MAPREDUCE-2729:
----------------------------------------

Ant test passed.

> Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2729
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.20.205.0
>         Environment: 0.20.1xx-Secondary
>            Reporter: Sherry Chen
>            Assignee: Sherry Chen
>             Fix For: 0.20.205.0
>
>         Attachments: MAPREDUCE-2729.patch
>
>
> In the capacity scheduler, the number of users in a queue needing slots is calculated based on whether the users' jobs have any pending tasks.
> This works fine for map tasks. However, for reduce tasks, a job does not need reduce slots until its minimum number of map tasks has completed.
> Here, we add a check for whether a job's reduces are ready to be scheduled (i.e. whether the job has completed enough map tasks) when we increment the number of users in a queue needing reduce slots.


        

[jira] [Updated] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Chen updated MAPREDUCE-2729:
-----------------------------------

    Affects Version/s: 0.20.205.0
        Fix Version/s: 0.20.205.0


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072987#comment-13072987 ] 

Sherry Chen commented on MAPREDUCE-2729:
----------------------------------------

Arun, I ran unit-tests and test-patch. Thx, Sherry


        

[jira] [Updated] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-2729:
-------------------------------------

    Status: Open  (was: Patch Available)

Sherry, the patch doesn't apply cleanly - can you please re-generate it? Thanks.


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072714#comment-13072714 ] 

Arun C Murthy commented on MAPREDUCE-2729:
------------------------------------------

Sherry, the patch looks good. What sort of testing have you done?


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073803#comment-13073803 ] 

Arun C Murthy commented on MAPREDUCE-2729:
------------------------------------------

To qualify: please run it on a cluster of 5-10 nodes, verify the fix manually and please let me know. Thanks.


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073078#comment-13073078 ] 

Arun C Murthy commented on MAPREDUCE-2729:
------------------------------------------

Sherry, you need to verify this on a real cluster to be safe before we commit this...


        

[jira] [Updated] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Chen updated MAPREDUCE-2729:
-----------------------------------

    Attachment: MAPREDUCE-2729.patch


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073009#comment-13073009 ] 

Arun C Murthy commented on MAPREDUCE-2729:
------------------------------------------

Sherry - I meant what tests you ran at scale to ensure this works...


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073027#comment-13073027 ] 

Sherry Chen commented on MAPREDUCE-2729:
----------------------------------------

Arun, do you mean I need to run tests in a test cluster? I haven't got any cluster to do it.


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081122#comment-13081122 ] 

Arun C Murthy commented on MAPREDUCE-2729:
------------------------------------------

Thomas, it doesn't make sense to port this to trunk - please don't bother, unless you want to look at this vis-a-vis MAPREDUCE-279.


        

[jira] [Resolved] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy resolved MAPREDUCE-2729.
--------------------------------------

    Resolution: Fixed

Sorry, some weird issue with my patch d/w.

I just committed this. Thanks Sherry!


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079451#comment-13079451 ] 

Sherry Chen commented on MAPREDUCE-2729:
----------------------------------------

I manually verified this fix on a 10-node cluster.

Verification steps:
1. Replace hadoop-capacity-scheduler.jar with the fixed version on the cluster gateway.
2. Modify capacity-scheduler.xml to ensure a queue has multiple map and reduce task slots.
3. Restart mapred.
4. Submit jobs for a 1st user that start reduces when 5% (the default) of maps complete, and submit jobs for a 2nd user (same queue as the 1st user) that start reduces when 50% of maps complete.
5. Verify that the 1st user gets all of the queue's reduce capacity that the 2nd user has not used yet, which is greater than the user limit.
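
Why step 5 expects the 1st user to exceed the usual user limit can be sketched with a simplified model of a per-user slot limit. This is an assumption-laden illustration, not the scheduler's actual code: the userLimit method, its parameters, and the formula (the larger of an equal share among counted users and a minimum percentage of queue capacity) are hypothetical simplifications, and the numbers are made up.

```java
public class ReduceUserLimit {
    /**
     * Hypothetical per-user limit: the larger of an equal share among the
     * users counted as needing slots and a minimum percentage of capacity.
     * The real capacity scheduler formula may differ.
     */
    static int userLimit(int queueCapacity, int usersNeedingSlots, int minUserLimitPercent) {
        int users = Math.max(1, usersNeedingSlots); // avoid dividing by zero
        int equalShare = (int) Math.ceil((double) queueCapacity / users);
        int minShare = (int) Math.ceil(minUserLimitPercent * queueCapacity / 100.0);
        return Math.max(equalShare, minShare);
    }

    public static void main(String[] args) {
        int reduceCapacity = 20;  // made-up number of reduce slots in the queue
        int minPercent = 25;      // made-up minimum user limit percentage

        // Before the fix: both users are counted, although the 2nd user's
        // reduces cannot start until 50% of its maps complete.
        System.out.println(userLimit(reduceCapacity, 2, minPercent)); // prints 10

        // After the fix: only the 1st user is counted until the 2nd is ready.
        System.out.println(userLimit(reduceCapacity, 1, minPercent)); // prints 20
    }
}
```

Under these assumed numbers, counting only users whose reduces are ready raises the 1st user's limit from half of the queue's reduce slots to all of them, which matches the outcome that step 5 checks for.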




        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Thomas Graves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081118#comment-13081118 ] 

Thomas Graves commented on MAPREDUCE-2729:
------------------------------------------

The patch is for the branch-0.20-security branch.  I will look at putting it on trunk.


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071465#comment-13071465 ] 

Hadoop QA commented on MAPREDUCE-2729:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12487910/MAPREDUCE-2729.patch
  against trunk revision 1150926.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/504//console

This message is automatically generated.


        

[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079087#comment-13079087 ] 

Sherry Chen commented on MAPREDUCE-2729:
----------------------------------------

Tested on a 10-node mini cluster; the test passed.


        

[jira] [Updated] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Chen updated MAPREDUCE-2729:
-----------------------------------

    Status: Patch Available  (was: Open)

For the TestJobQueueTaskScheduler test cases to pass, the patch for MAPREDUCE-2621 must also be applied.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071333#comment-13071333 ] 

Milind Bhandarkar commented on MAPREDUCE-2729:
----------------------------------------------

It would be good to have a notion of a "ready" task, which is separate from a pending task.
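
One way to picture the suggested distinction is a small Java sketch that classifies a job's reduce demand as none, pending, or ready. Everything here is an assumption for illustration: the class ReadyTaskSketch, the ReduceState enum, and the rule that reduces become ready once a configured fraction of the job's maps has finished are not taken from Hadoop or from any patch on this issue.

```java
// Hypothetical sketch of a "ready" state kept apart from "pending";
// none of these names come from Hadoop or from the attached patch.
class ReadyTaskSketch {

    enum ReduceState { NONE, PENDING, READY }

    // Assumed rule: reduces may start once ceil(fraction * totalMaps) maps finish.
    static int mapsRequired(int totalMaps, double fraction) {
        return (int) Math.ceil(fraction * totalMaps);
    }

    static ReduceState classify(int totalMaps, int finishedMaps,
                                int unscheduledReduces, double fraction) {
        if (unscheduledReduces <= 0) {
            return ReduceState.NONE;      // nothing left to schedule
        }
        if (finishedMaps >= mapsRequired(totalMaps, fraction)) {
            return ReduceState.READY;     // should count toward slot demand
        }
        return ReduceState.PENDING;       // exists, but cannot run yet
    }
}
```

Under this framing a scheduler would count only READY work when estimating how many users need reduce slots, while PENDING work stays visible for reporting.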



[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079129#comment-13079129 ] 

Arun C Murthy commented on MAPREDUCE-2729:
------------------------------------------

Sherry - thanks. Can you please confirm that you manually verified this fix on the cluster? Thanks.

