Posted to mapreduce-issues@hadoop.apache.org by "Alejandro Abdelnur (JIRA)" <ji...@apache.org> on 2012/11/30 19:51:58 UTC

[jira] [Commented] (MAPREDUCE-4820) MRApps distributed-cache duplicate checks are incorrect

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507542#comment-13507542 ] 

Alejandro Abdelnur commented on MAPREDUCE-4820:
-----------------------------------------------

Following up on this. It seems the duplicate entries are introduced by YARN/MR, not by a client submitting a job.

IMO, we should undo MAPREDUCE-4549 (as clients submitting jobs cannot do anything to avoid this).

                
> MRApps distributed-cache duplicate checks are incorrect
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-4820
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4820
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.2-alpha
>            Reporter: Alejandro Abdelnur
>            Priority: Blocker
>
> This seems to be a combination of issues that are being exposed in 2.0.2-alpha by MAPREDUCE-4549.
> MAPREDUCE-4549 introduces a check to ensure there are no duplicate JARs in the distributed-cache (using the JAR name as identity).
> In Hadoop 2 (different from Hadoop 1), all JARs in the distributed-cache are symlink-ed to the current directory of the task.
> MRApps, when setting up the DistributedCache (MRApps#setupDistributedCache->parseDistributedCacheArtifacts), assumes that the local resources (files in CURRENT_DIR/, CURRENT_DIR/classes/ and CURRENT_DIR/lib/) are already part of the distributed-cache.
> For systems like Oozie, which use a launcher job to submit the real job, this poses a problem because MRApps is run from the launcher job to submit the real job. The configuration of the real job has the correct distributed-cache entries (no duplicates), but because the current dir has the same files, the submission fails.
> It seems that MRApps should not be checking dups in the distributed-cache against JARs in the CURRENT_DIR/ or CURRENT_DIR/lib/. The dup check should be done among distributed-cache entries only.
> It seems YARNRunner is symlink-ing all files in the distributed-cache into the current directory. In Hadoop 1 this was done only for files added to the distributed-cache using a fragment (i.e. "#FOO") to trigger a symlink creation.
> Marking as a blocker because without a fix for this, Oozie cannot submit jobs to Hadoop 2 (I've debugged Oozie in a live cluster being used by BigTop -thanks Roman- to test their release work, and I've verified that Oozie 3.3 does not create duplicated entries in the distributed-cache).
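
The rule proposed in the description (check for dups among the distributed-cache entries only, using the file name as identity) can be sketched roughly as below. This is a minimal illustration, not the MRApps code: CacheDupCheck, linkName and findDuplicates are hypothetical names, and only java.net.URI is used. It assumes hierarchical URIs and assumes the link name comes from the fragment ("#FOO") when present, otherwise from the last path component.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: a name-based duplicate check
// that looks only at the distributed-cache entries themselves.
final class CacheDupCheck {

    // Assumed naming rule: the URI fragment ("#FOO") if present,
    // otherwise the last component of the path.
    // Assumes a hierarchical URI, so getPath() is not null.
    static String linkName(URI entry) {
        String fragment = entry.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = entry.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Returns each link name that is claimed by more than one cache entry,
    // in order of first collision. Files in the current directory are
    // deliberately not consulted.
    static List<String> findDuplicates(List<URI> cacheEntries) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (URI entry : cacheEntries) {
            String name = linkName(entry);
            if (!seen.add(name) && !duplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }
}
```

For the entries hdfs://nn/apps/lib/a.jar and hdfs://nn/other/a.jar, findDuplicates would report a.jar. Files that merely sit in the launcher's CURRENT_DIR/ or CURRENT_DIR/lib/ never enter the comparison.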
