Posted to issues@spark.apache.org by "Chris (Jira)" <ji...@apache.org> on 2021/05/13 14:33:00 UTC

[jira] [Updated] (SPARK-35399) State is still needed in the event of executor failure

     [ https://issues.apache.org/jira/browse/SPARK-35399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris updated SPARK-35399:
--------------------------
    Description: 
The [Graceful Decommission of Executors|https://spark.apache.org/docs/3.1.1/job-scheduling.html#graceful-decommission-of-executors] section states:
{quote}a Spark executor exits either on failure or when the associated application has also exited. In both scenarios, all state associated with the executor is no longer needed and can be safely discarded.
{quote}
However, for a flaky application whose tasks occasionally OOM an executor, that state _is_ still needed: losing the executor's shuffle files triggers the stage failure mechanism to regenerate the missing blocks. The external Shuffle Service is therefore valuable in this scenario too, not only for dynamic resource allocation.
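For reference, a minimal sketch of enabling the external shuffle service (the property name is from the Spark configuration docs; deployment details vary by cluster manager, e.g. on YARN the {{spark_shuffle}} aux-service must also be configured on each NodeManager):
{code}
# spark-defaults.conf
# Serve shuffle files from a node-level service so they outlive a
# failed executor, avoiding stage retries to regenerate missing blocks.
spark.shuffle.service.enabled  true
{code}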


> State is still needed in the event of executor failure
> ------------------------------------------------------
>
>                 Key: SPARK-35399
>                 URL: https://issues.apache.org/jira/browse/SPARK-35399
>             Project: Spark
>          Issue Type: Documentation
>          Components: Documentation
>    Affects Versions: 3.1.1
>            Reporter: Chris
>            Priority: Minor
>              Labels: newbie, pull-request-available
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org