Posted to mapreduce-issues@hadoop.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2015/07/23 09:43:06 UTC
[jira] [Commented] (MAPREDUCE-6009) Map-only job with new-api runs
wrong OutputCommitter when cleanup scheduled in a reduce slot
[ https://issues.apache.org/jira/browse/MAPREDUCE-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638392#comment-14638392 ]
Zheng Shao commented on MAPREDUCE-6009:
---------------------------------------
We just hit this bug in an unpatched version of MR1.
The situation is that HCatalog submits a Map-only job and hopes to use OutputCommitter.commitJob to create a Hive partition. Because of this bug, the Hive partition was never created.
Our sanity check on the Hive table, combined with the workflow retry mechanism, masked the failure, so this bug ran in production for a long time (wasting compute resources). It's great that this is fixed.
> Map-only job with new-api runs wrong OutputCommitter when cleanup scheduled in a reduce slot
> --------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6009
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client, job submission
> Affects Versions: 1.2.1
> Reporter: Gera Shegalov
> Assignee: Gera Shegalov
> Priority: Blocker
> Fix For: 1.3.0, 1.2.2
>
> Attachments: MAPREDUCE-6009.v01-branch-1.2.patch, MAPREDUCE-6009.v02-branch-1.2.patch
>
>
> In branch-1, job commit is executed in a JOB_CLEANUP task that may run in either a map slot or a reduce slot.
> In org.apache.hadoop.mapreduce.Job#setUseNewAPI, there is logic that sets the new-api flag only for reduce-ful jobs:
> {code}
>     if (numReduces != 0) {
>       conf.setBooleanIfUnset("mapred.reducer.new-api",
>                              conf.get(oldReduceClass) == null);
>       ...
> {code}
> Therefore, when cleanup runs in a reduce slot, ReduceTask initializes using the old API and runs the incorrect default OutputCommitter instead of consulting the OutputFormat.
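
The failure mode described above can be sketched without Hadoop on the classpath. This is a minimal simulation, not the actual Hadoop source: the class NewApiFlagSketch, the Map standing in for the job configuration, and the committerFor helper are all hypothetical. It also assumes that the mapper flag is set unconditionally and that an unset reducer flag reads as false, neither of which is shown in the snippet quoted in the issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the flag logic; a plain Map replaces the job configuration.
public class NewApiFlagSketch {
    public static final String MAPPER_NEW_API = "mapred.mapper.new-api";
    public static final String REDUCER_NEW_API = "mapred.reducer.new-api";
    public static final String OLD_MAP_CLASS = "mapred.mapper.class";
    public static final String OLD_REDUCE_CLASS = "mapred.reducer.class";

    // Writes the value only when the key has no value yet.
    public static void setBooleanIfUnset(Map<String, String> conf, String key, boolean value) {
        conf.putIfAbsent(key, Boolean.toString(value));
    }

    // Assumption: an unset flag reads as false.
    public static boolean getBoolean(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    // Follows the quoted snippet: the reducer flag is touched only when numReduces != 0.
    // Assumption: the mapper flag is set regardless of the reducer count.
    public static void setUseNewApi(Map<String, String> conf, int numReduces) {
        setBooleanIfUnset(conf, MAPPER_NEW_API, conf.get(OLD_MAP_CLASS) == null);
        if (numReduces != 0) {
            setBooleanIfUnset(conf, REDUCER_NEW_API, conf.get(OLD_REDUCE_CLASS) == null);
        }
    }

    // Hypothetical helper: the slot type decides which flag the cleanup task consults.
    public static String committerFor(Map<String, String> conf, boolean reduceSlot) {
        boolean useNewApi = getBoolean(conf, reduceSlot ? REDUCER_NEW_API : MAPPER_NEW_API);
        return useNewApi ? "committer from OutputFormat" : "old-API default committer";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setUseNewApi(conf, 0); // map-only job written against the new API
        System.out.println("cleanup in map slot:    " + committerFor(conf, false));
        System.out.println("cleanup in reduce slot: " + committerFor(conf, true));
    }
}
```

Under these assumptions, the map-only job gets the OutputFormat's committer when cleanup lands in a map slot and the old-API default when it lands in a reduce slot, which would explain why commitJob never created the partition.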