You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-dev@hadoop.apache.org by Robert Evans <ev...@yahoo-inc.com> on 2011/07/21 20:49:31 UTC

Review Request: MAPREDUCE-2324 Job should fail if a reduce task can't be scheduled anywhere

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1164/
-----------------------------------------------------------

Review request for hadoop-mapreduce, Todd Lipcon, Tom Graves, and Jeffrey Naisbitt.


Summary
-------

Job should fail if a reduce task can't be scheduled anywhere. V2 of the patch.


This addresses bug MAPREDUCE-2324.
    https://issues.apache.org/jira/browse/MAPREDUCE-2324


Diffs
-----

  branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java 1148035 
  branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/TaskTracker.java 1148035 
  branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/MiniMRCluster.java 1148035 
  branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java 1148035 

Diff: https://reviews.apache.org/r/1164/diff


Testing
-------

Unit tests and ran manual tests on a single node cluster.


Thanks,

Robert


Re: Review Request: MAPREDUCE-2324 Job should fail if a reduce task can't be scheduled anywhere

Posted by Todd Lipcon <to...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1164/#review1157
-----------------------------------------------------------



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2386>

    I think a default of 0.8 or so would probably make more sense -- just in case one of the TTs is in some bad state where it isn't heartbeating, we don't want to wait forever.



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2387>

    since this is a set of trackers, not attempts, a better name might be: failedReduceSchedulingTrackers, or something?



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2388>

    this key should probably be defined as a constant in MRJobConfig, right?



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2389>

    refactor this into a new method?



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2391>

    I think the operator precedence is off here.
    
    (int)reduceInputAttemptFactor is higher precedence, so it will end up rounding anything < 1.0 down to 0.



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2390>

    style: add space between if and (



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2392>

    I think we mostly avoid the 1st person in error messages. Change to "Tried to schedule..." rather than "We tried"...



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2393>

    StringUtils.humanReadableInt might be useful here.



branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2394>

    jobId is the job, not the task, right?



branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java
<https://reviews.apache.org/r/1164/#comment2395>

    typos: "failes", "then that"



branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java
<https://reviews.apache.org/r/1164/#comment2396>

    isn't the default input limit unlimited? why do we need this?



branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java
<https://reviews.apache.org/r/1164/#comment2397>

    we should check that the failure info of the job has the correct type of error message (ie that it didn't fail due to some other error)


- Todd


On 2011-07-21 18:49:31, Robert Evans wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/1164/
> -----------------------------------------------------------
> 
> (Updated 2011-07-21 18:49:31)
> 
> 
> Review request for hadoop-mapreduce, Todd Lipcon, Tom Graves, and Jeffrey Naisbitt.
> 
> 
> Summary
> -------
> 
> Job should fail if a reduce task can't be scheduled anywhere. V2 of the patch.
> 
> 
> This addresses bug MAPREDUCE-2324.
>     https://issues.apache.org/jira/browse/MAPREDUCE-2324
> 
> 
> Diffs
> -----
> 
>   branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java 1148035 
>   branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/TaskTracker.java 1148035 
>   branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/MiniMRCluster.java 1148035 
>   branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java 1148035 
> 
> Diff: https://reviews.apache.org/r/1164/diff
> 
> 
> Testing
> -------
> 
> Unit tests and ran manual tests on a single node cluster.
> 
> 
> Thanks,
> 
> Robert
> 
>