You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-dev@hadoop.apache.org by Robert Evans <ev...@yahoo-inc.com> on 2011/07/21 20:49:31 UTC
Review Request: MAPREDUCE-2324 Job should fail if a reduce task can't be
scheduled anywhere
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1164/
-----------------------------------------------------------
Review request for hadoop-mapreduce, Todd Lipcon, Tom Graves, and Jeffrey Naisbitt.
Summary
-------
Job should fail if a reduce task can't be scheduled anywhere. V2 of the patch.
This addresses bug MAPREDUCE-2324.
https://issues.apache.org/jira/browse/MAPREDUCE-2324
Diffs
-----
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java 1148035
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/TaskTracker.java 1148035
branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/MiniMRCluster.java 1148035
branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java 1148035
Diff: https://reviews.apache.org/r/1164/diff
Testing
-------
Unit tests and ran manual tests on a single node cluster.
Thanks,
Robert
Re: Review Request: MAPREDUCE-2324 Job should fail if a reduce task can't be
scheduled anywhere
Posted by Todd Lipcon <to...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1164/#review1157
-----------------------------------------------------------
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2386>
I think a default of 0.8 or so would probably make more sense -- just in case one of the TTs is in some bad state where it isn't heartbeating, we don't want to wait forever.
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2387>
since this is a set of trackers, not attempts, a better name might be: failedReduceSchedulingTrackers, or something?
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2388>
this key should probably be defined as a constant in MRJobConfig, right?
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2389>
refactor this into a new method?
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2391>
I think the operator precedence is off here.
(int)reduceInputAttemptFactor is higher precedence, so it will end up rounding anything < 1.0 down to 0.
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2390>
style: add space between if and (
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2392>
I think we mostly avoid the 1st person in error messages. Change to "Tried to schedule..." rather than "We tried"...
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2393>
StringUtils.humanReadableInt might be useful here.
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
<https://reviews.apache.org/r/1164/#comment2394>
jobId is the job, not the task, right?
branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java
<https://reviews.apache.org/r/1164/#comment2395>
typos: "failes", "then that"
branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java
<https://reviews.apache.org/r/1164/#comment2396>
isn't the default input limit unlimited? why do we need this?
branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java
<https://reviews.apache.org/r/1164/#comment2397>
we should check that the failure info of the job has the correct type of error message (ie that it didn't fail due to some other error)
- Todd
On 2011-07-21 18:49:31, Robert Evans wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/1164/
> -----------------------------------------------------------
>
> (Updated 2011-07-21 18:49:31)
>
>
> Review request for hadoop-mapreduce, Todd Lipcon, Tom Graves, and Jeffrey Naisbitt.
>
>
> Summary
> -------
>
> Job should fail if a reduce task can't be scheduled anywhere. V2 of the patch.
>
>
> This addresses bug MAPREDUCE-2324.
> https://issues.apache.org/jira/browse/MAPREDUCE-2324
>
>
> Diffs
> -----
>
> branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/JobInProgress.java 1148035
> branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/TaskTracker.java 1148035
> branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/MiniMRCluster.java 1148035
> branches/branch-0.20-security/src/test/org/apache/hadoop/mapred/TestTaskLimits.java 1148035
>
> Diff: https://reviews.apache.org/r/1164/diff
>
>
> Testing
> -------
>
> Unit tests and ran manual tests on a single node cluster.
>
>
> Thanks,
>
> Robert
>
>