Posted to hdfs-issues@hadoop.apache.org by "Xing Lin (Jira)" <ji...@apache.org> on 2022/12/03 01:13:00 UTC

[jira] [Updated] (HDFS-16852) HDFS-16852 Register the shutdown hook only when not in shutdown for KeyProviderCache constructor

     [ https://issues.apache.org/jira/browse/HDFS-16852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xing Lin updated HDFS-16852:
----------------------------
    Summary: HDFS-16852 Register the shutdown hook only when not in shutdown for KeyProviderCache constructor  (was: Swallow IllegalStateException in KeyProviderCache)

> HDFS-16852 Register the shutdown hook only when not in shutdown for KeyProviderCache constructor
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16852
>                 URL: https://issues.apache.org/jira/browse/HDFS-16852
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>            Reporter: Xing Lin
>            Assignee: Xing Lin
>            Priority: Minor
>              Labels: pull-request-available
>
> When an HDFS client is created, it registers a shutdown hook with the ShutdownHookManager. ShutdownHookManager does not allow adding a new shutdown hook once the process is already in shutdown: instead, addShutdownHook() throws an IllegalStateException.
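> To illustrate, here is a minimal, hypothetical reproduction (the priority value 10 is arbitrary, not the one KeyProviderCache uses):
> {code:java}
> import org.apache.hadoop.util.ShutdownHookManager;
>
> public class ShutdownHookRepro {
>   public static void main(String[] args) {
>     // This first hook registers fine: the JVM is not yet in shutdown.
>     ShutdownHookManager.get().addShutdownHook(() -> {
>       // By the time this hook runs, ShutdownHookManager has marked
>       // shutdown as in progress, so the nested call below throws
>       // java.lang.IllegalStateException: Shutdown in progress, cannot add a shutdownHook
>       ShutdownHookManager.get().addShutdownHook(() -> { }, 10);
>     }, 10);
>   }
> }
> {code}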
> This behavior is not ideal when a Spark program fails during pre-launch. In that case, during shutdown, Spark calls cleanupStagingDir() to clean up the staging directory. cleanupStagingDir() creates a FileSystem object to talk to HDFS. Since this is the first use of a FileSystem object in the process, an HDFS client has to be created, which in turn tries to register the shutdown hook and hits the IllegalStateException. That IllegalStateException then masks the actual exception that caused the Spark program to fail during pre-launch.
> We propose to register the shutdown hook in the KeyProviderCache constructor only when the process is not already in shutdown, and to log a warning otherwise rather than letting the IllegalStateException propagate. The TCP connection between the client and the NameNode should be closed by the OS when the process shuts down.
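> A minimal sketch of the proposed guard (ShutdownHookManager.isShutdownInProgress() is an existing method; the finalizer, logger, and priority names are approximate):
> {code:java}
> // In the KeyProviderCache constructor: register the shutdown hook only
> // when the process is not already shutting down; otherwise log and move on.
> if (!ShutdownHookManager.get().isShutdownInProgress()) {
>   ShutdownHookManager.get().addShutdownHook(
>       new KeyProviderCacheFinalizer(), SHUTDOWN_HOOK_PRIORITY);
> } else {
>   LOG.warn("Cannot register the KeyProviderCache shutdown hook because"
>       + " the process is already in shutdown.");
> }
> {code}
> With this guard, client construction succeeds during shutdown and the original failure is no longer masked; the only consequence is that cached KeyProviders are not explicitly invalidated, which is acceptable since the process is exiting anyway.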
> Example stack trace:
> {code:java}
> 13-09-2022 14:39:42 PDT INFO - 22/09/13 21:39:41 ERROR util.Utils: Uncaught exception in thread shutdown-hook-0   
> 13-09-2022 14:39:42 PDT INFO - java.lang.IllegalStateException: Shutdown in progress, cannot add a shutdownHook    
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:299)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.KeyProviderCache.<init>(KeyProviderCache.java:71)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.ClientContext.<init>(ClientContext.java:130)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.ClientContext.get(ClientContext.java:167)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:383)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:159)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3261)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3310)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3278)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.spark.deploy.yarn.ApplicationMaster.cleanupStagingDir(ApplicationMaster.scala:675)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.spark.deploy.yarn.ApplicationMaster.$anonfun$run$2(ApplicationMaster.scala:259)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188)          
> 13-09-2022 14:39:42 PDT INFO - at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2023)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188)          
> 13-09-2022 14:39:42 PDT INFO - at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)          
> 13-09-2022 14:39:42 PDT INFO - at scala.util.Try$.apply(Try.scala:213)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)          
> 13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)          
> 13-09-2022 14:39:42 PDT INFO - at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)          
> 13-09-2022 14:39:42 PDT INFO - at java.util.concurrent.FutureTask.run(FutureTask.java:266)          
> 13-09-2022 14:39:42 PDT INFO - at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)          
> 13-09-2022 14:39:42 PDT INFO - at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)          
> 13-09-2022 14:39:42 PDT INFO - at java.lang.Thread.run(Thread.java:748)          
> 13-09-2022 14:39:42 PDT INFO - 22/09/13 21:39:41 INFO util.ShutdownHookManager: Shutdown hook called     
>  {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
