Posted to issues@spark.apache.org by "Kai-Hsun Chen (Jira)" <ji...@apache.org> on 2022/08/03 00:52:00 UTC

[jira] [Created] (SPARK-39956) Determine task failures based on ExecutorExitCode

Kai-Hsun Chen created SPARK-39956:
-------------------------------------

             Summary: Determine task failures based on ExecutorExitCode
                 Key: SPARK-39956
                 URL: https://issues.apache.org/jira/browse/SPARK-39956
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.4.0
            Reporter: Kai-Hsun Chen


An executor can exit for many reasons, but the driver currently assumes that every executor exit is caused by a task failure. That assumption is wrong. For example, when DiskBlockManager fails to create a directory, it shuts down the executor’s JVM with the exit code {{DISK_STORE_FAILED_TO_CREATE_DIR}}. When the driver receives this exit code, the executor exit was most likely caused by a hardware failure rather than a task failure.
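
To make the proposal concrete, here is a minimal, language-neutral sketch of the decision the driver could make. It is not Spark's implementation: the function names, the ExitCause type, the set of "not caused by task" codes, and the numeric value used for DISK_STORE_FAILED_TO_CREATE_DIR are all assumptions made for illustration.

```python
from enum import Enum

# Numeric value is an assumption for this sketch; the issue text only
# names the exit code, it does not give its value.
DISK_STORE_FAILED_TO_CREATE_DIR = 53

# Hypothetical set of exit codes that point at the executor's environment
# (for example, a failing disk) rather than at the tasks it was running.
EXIT_CODES_NOT_CAUSED_BY_TASK = frozenset({DISK_STORE_FAILED_TO_CREATE_DIR})


class ExitCause(Enum):
    """Hypothetical classification of why an executor exited."""
    TASK = "task"
    NOT_TASK = "not_task"


def classify_executor_exit(exit_code: int) -> ExitCause:
    """Classify an executor exit from its exit code.

    Unknown codes fall back to TASK, which mirrors the driver behaviour
    described in the issue (every exit is blamed on the task).
    """
    if exit_code in EXIT_CODES_NOT_CAUSED_BY_TASK:
        return ExitCause.NOT_TASK
    return ExitCause.TASK


def counts_toward_task_failures(exit_code: int) -> bool:
    """Return True if this exit should be attributed to a task failure."""
    return classify_executor_exit(exit_code) is ExitCause.TASK
```

With such a check, an exit with DISK_STORE_FAILED_TO_CREATE_DIR would not be attributed to the tasks that were running on the executor, while any other exit code would keep the behaviour described above.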



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org