Posted to mapreduce-issues@hadoop.apache.org by "Philip Zeyliger (JIRA)" <ji...@apache.org> on 2017/10/26 18:15:01 UTC

[jira] [Created] (MAPREDUCE-6992) Race for temp dir in LocalDistributedCacheManager.java

Philip Zeyliger created MAPREDUCE-6992:
------------------------------------------

             Summary: Race for temp dir in LocalDistributedCacheManager.java
                 Key: MAPREDUCE-6992
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6992
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Philip Zeyliger


When localizing distributed cache files in "local" mode, LocalDistributedCacheManager.java chooses a "unique" directory based on a millisecond timestamp. When jobs run in parallel, two of them can start within the same millisecond, pick the same directory, and collide.

The error message looks like 
{code}
java.io.FileNotFoundException: jenkins/mapred/local/1508958341829_tmp does not exist
{code}
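
To illustrate the collision, here is a minimal sketch. It assumes (I have not traced every call) that each localization seeds its own counter from the clock and that the download directory name is the incremented counter plus a "_tmp" suffix, which is consistent with the path in the error above. The class and the {{tmpDirName}} helper are hypothetical, not Hadoop code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the naming scheme; not the actual Hadoop class.
final class TimestampSeedCollision {

    // Assumption: each job creates a fresh counter seeded with the current
    // time in milliseconds and uses the next value as its directory name.
    static String tmpDirName(long seedMillis) {
        AtomicLong uniqueNumberGenerator = new AtomicLong(seedMillis);
        return uniqueNumberGenerator.incrementAndGet() + "_tmp";
    }

    public static void main(String[] args) {
        // Two jobs that start within the same millisecond see the same seed.
        long now = System.currentTimeMillis();
        String jobA = tmpDirName(now);
        String jobB = tmpDirName(now);
        // Both names are identical, so the jobs share (and fight over) one directory.
        System.out.println(jobA + " / " + jobB + " collide: " + jobA.equals(jobB));
    }
}
```

The AtomicLong only guarantees uniqueness within one counter instance; it does nothing for two instances created from the same seed.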

I ran into this in Impala's data loading. There, we run a HiveServer2 (HS2) that executes MapReduce jobs in local mode. If multiple queries are submitted to the HS2 simultaneously, they conflict on this directory. A search turned up what looks like a very similar problem at StreamSets: https://issues.streamsets.com/browse/SDC-5473.

I believe the buggy code is (link: https://github.com/apache/hadoop/blob/2da654e34a436aae266c1fbdec5c1067da8d854e/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalDistributedCacheManager.java#L94)
{code}
    // Generating unique numbers for FSDownload.
    AtomicLong uniqueNumberGenerator =
        new AtomicLong(System.currentTimeMillis());
{code}

Notably, a similar code path uses an actual random number generator ({{LocalJobRunner.java}}, https://github.com/apache/hadoop/blob/2da654e34a436aae266c1fbdec5c1067da8d854e/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java#L912).
{code}
  public String getStagingAreaDir() throws IOException {
    Path stagingRootDir = new Path(conf.get(JTConfig.JT_STAGING_AREA_ROOT,
        "/tmp/hadoop/mapred/staging"));
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    String user;
    randid = rand.nextInt(Integer.MAX_VALUE);
{code}
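
One possible direction, as a sketch only and not a proposed patch: let the filesystem hand out the unique name instead of deriving it from the clock. The example below uses the JDK's {{Files.createTempDirectory}}, which picks a name and creates the directory in one step. The class name, the {{createUniqueDir}} helper, and the "distcache-" prefix are hypothetical, and a real fix in Hadoop would presumably go through its own FileContext / LocalDirAllocator APIs rather than java.nio.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not part of Hadoop.
final class UniqueLocalDir {

    // Creates a fresh, uniquely named directory under 'parent'.
    // The JDK chooses the name and creates the directory together,
    // so two concurrent callers never receive the same path.
    static Path createUniqueDir(Path parent) throws IOException {
        Files.createDirectories(parent); // no-op if it already exists
        return Files.createTempDirectory(parent, "distcache-");
    }

    public static void main(String[] args) throws IOException {
        Path parent = Files.createTempDirectory("mapred-local-");
        Path first = createUniqueDir(parent);
        Path second = createUniqueDir(parent);
        System.out.println(first);
        System.out.println(second);
        System.out.println("distinct: " + !first.equals(second));
    }
}
```

Seeding the existing counter from a random source, as LocalJobRunner does for its staging directory, would shrink the collision window but not close it; creating the directory atomically avoids relying on luck.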


