Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/08/03 01:15:00 UTC

[jira] [Assigned] (SPARK-39956) Determine task failures based on ExecutorExitCode

     [ https://issues.apache.org/jira/browse/SPARK-39956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-39956:
------------------------------------

    Assignee: Apache Spark

> Determine task failures based on ExecutorExitCode
> -------------------------------------------------
>
>                 Key: SPARK-39956
>                 URL: https://issues.apache.org/jira/browse/SPARK-39956
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.4.0
>            Reporter: Kai-Hsun Chen
>            Assignee: Apache Spark
>            Priority: Major
>
> An executor can exit for many different reasons, but the driver currently assumes that every executor exit is caused by a task failure. This assumption is wrong. For example, when DiskBlockManager fails to create a directory, it shuts down the executor's JVM with the exit code {{DISK_STORE_FAILED_TO_CREATE_DIR}}. When the driver receives this exit code, the exit is most likely caused by a hardware (disk) failure rather than a task failure, so it should not be counted against the tasks that were running on that executor.
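
The idea above could be sketched as a classifier that the driver consults before counting an executor exit as a task failure. This is a hypothetical illustration, not Spark's actual implementation: the object name, method name, and numeric exit-code values below are assumptions made for the sake of the example (Spark's real codes live in org.apache.spark.executor.ExecutorExitCode).

```scala
// Hypothetical sketch: decide whether an executor exit code points at an
// environment/hardware problem rather than a failure of the running tasks.
object ExitCodeClassifier {
  // Illustrative exit codes; the numeric values are assumptions, not taken
  // from Spark's ExecutorExitCode source.
  val DiskStoreFailedToCreateDir = 53
  val HeartbeatFailure           = 56

  /** Returns true when the exit likely stems from the executor's
    * environment (e.g. a bad disk) rather than from its tasks. */
  def isEnvironmentFailure(exitCode: Int): Boolean = exitCode match {
    case DiskStoreFailedToCreateDir | HeartbeatFailure => true
    case _                                             => false
  }
}

// Usage: a driver following this idea would skip incrementing per-task
// failure counts when isEnvironmentFailure returns true.
println(ExitCodeClassifier.isEnvironmentFailure(ExitCodeClassifier.DiskStoreFailedToCreateDir))
```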



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org