Posted to issues@spark.apache.org by "Kousuke Saruta (JIRA)" <ji...@apache.org> on 2014/10/10 18:15:33 UTC

[jira] [Created] (SPARK-3900) ApplicationMaster's shutdown hook fails to cleanup staging directory.

Kousuke Saruta created SPARK-3900:
-------------------------------------

             Summary: ApplicationMaster's shutdown hook fails to cleanup staging directory.
                 Key: SPARK-3900
                 URL: https://issues.apache.org/jira/browse/SPARK-3900
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 1.2.0
         Environment: Hadoop 0.23
            Reporter: Kousuke Saruta
            Priority: Critical


ApplicationMaster registers a shutdown hook that calls ApplicationMaster#cleanupStagingDir.

cleanupStagingDir invokes FileSystem.get(yarnConf), which in turn invokes FileSystem$Cache.getInternal. getInternal also registers a shutdown hook.
In the FileSystem of Hadoop 0.23, this registration does not check whether a shutdown is already in progress (in 2.2, it does).

{code}
// 0.23
if (map.isEmpty()) {
  ShutdownHookManager.get().addShutdownHook(clientFinalizer, SHUTDOWN_HOOK_PRIORITY);
}
{code}

{code}
// 2.2
if (map.isEmpty()
    && !ShutdownHookManager.get().isShutdownInProgress()) {
  ShutdownHookManager.get().addShutdownHook(clientFinalizer, SHUTDOWN_HOOK_PRIORITY);
}
}
{code}

Thus, in 0.23, FileSystem tries to register another shutdown hook while ApplicationMaster's shutdown hook is running.

This causes an IllegalStateException, as follows.

{code}
java.lang.IllegalStateException: Shutdown in progress, cannot add a shutdownHook
        at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:152)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2306)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2278)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162)
        at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$cleanupStagingDir(ApplicationMaster.scala:307)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:118)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
{code}
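
To make the sequence above easier to follow, here is a minimal, self-contained sketch of the failure mode. It does not use Hadoop or Spark: ToyShutdownHookManager, ToyFileSystemCache and ShutdownHookGuardDemo are hypothetical stand-ins written only for illustration, and the checkShutdown flag is an assumption used to switch between the 0.23-style and 2.2-style registration conditions quoted above. Unlike a real shutdown, the toy manager lets the exception propagate to the caller so it can be observed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo driver; not part of Spark or Hadoop.
class ShutdownHookGuardDemo {
    // Returns the exception message raised during the simulated shutdown,
    // or null if the shutdown completed without an exception.
    static String simulate(boolean checkShutdown) {
        ToyShutdownHookManager manager = new ToyShutdownHookManager();
        ToyFileSystemCache cache = new ToyFileSystemCache(manager, checkShutdown);
        // Stand-in for the ApplicationMaster hook: the first cache access
        // happens only once shutdown has already started.
        manager.addShutdownHook(() -> cache.get("staging-fs"));
        try {
            manager.runShutdown();
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("0.23-style (no check): " + simulate(false));
        System.out.println("2.2-style (checked):   " + simulate(true));
    }
}

// Toy model of a hook manager that rejects registrations during shutdown.
class ToyShutdownHookManager {
    private final List<Runnable> hooks = new ArrayList<>();
    private boolean shutdownInProgress = false;

    boolean isShutdownInProgress() {
        return shutdownInProgress;
    }

    void addShutdownHook(Runnable hook) {
        if (shutdownInProgress) {
            throw new IllegalStateException(
                "Shutdown in progress, cannot add a shutdownHook");
        }
        hooks.add(hook);
    }

    // Simplification: runs hooks in registration order and lets exceptions escape.
    void runShutdown() {
        shutdownInProgress = true;
        for (Runnable hook : new ArrayList<>(hooks)) {
            hook.run();
        }
    }
}

// Toy model of a cache that registers a finalizer hook on first use.
class ToyFileSystemCache {
    private final ToyShutdownHookManager manager;
    private final boolean checkShutdown; // false: 0.23-style, true: 2.2-style
    private final Map<String, Object> map = new HashMap<>();

    ToyFileSystemCache(ToyShutdownHookManager manager, boolean checkShutdown) {
        this.manager = manager;
        this.checkShutdown = checkShutdown;
    }

    Object get(String key) {
        Object fs = map.get(key);
        if (fs != null) {
            return fs;
        }
        // Mirrors the quoted conditions: register only when the cache is
        // empty and, in the 2.2-style variant, only if no shutdown is running.
        if (map.isEmpty() && (!checkShutdown || !manager.isShutdownInProgress())) {
            manager.addShutdownHook(map::clear);
        }
        fs = new Object();
        map.put(key, fs);
        return fs;
    }
}
```

With the unchecked variant, simulate(false) returns the "Shutdown in progress" message, because the cache is first touched from inside a running hook. With the checked variant, simulate(true) returns null, because the registration is skipped.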


