Posted to issues@spark.apache.org by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2021/11/05 21:18:00 UTC

[jira] [Resolved] (SPARK-37151) Avoid executor state sync attempt fail continuously in a short timeframe

     [ https://issues.apache.org/jira/browse/SPARK-37151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-37151.
-----------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Issue resolved by pull request 34428
[https://github.com/apache/spark/pull/34428]

> Avoid executor state sync attempt fail continuously in a short timeframe
> ------------------------------------------------------------------------
>
>                 Key: SPARK-37151
>                 URL: https://issues.apache.org/jira/browse/SPARK-37151
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.2.0
>            Reporter: Xingbo Jiang
>            Assignee: Xingbo Jiang
>            Priority: Major
>             Fix For: 3.3.0
>
>
> A worker retries sending the ExecutorStateChanged message when the previous attempt fails. This is not an issue when the attempt fails with a TimeoutException, because the timeout itself spaces the attempts apart. But if the connection between the worker and the master is broken, each attempt fails immediately, so the retries also fail immediately and the worker quickly reaches the maximum attempt limit.
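
The failure mode described above can be illustrated with a small sketch. It is not taken from pull request 34428 and is not Spark's Worker code; it is a minimal Python illustration that assumes one plausible mitigation, namely making every failed attempt occupy at least a minimum interval before the next retry. The names send_with_retry, min_interval, sleep and clock are hypothetical.

```python
import time


def send_with_retry(send, max_attempts=5, min_interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call send() until it succeeds or max_attempts is exhausted.

    Hypothetical helper, not Spark's API. A failed attempt that returns
    faster than min_interval seconds (for example, because the connection
    is broken) is padded with a sleep, so the attempts cannot all be
    used up within a short timeframe.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error = None
    for attempt in range(1, max_attempts + 1):
        started = clock()
        try:
            return send()
        except Exception as err:  # a real client would catch narrower types
            last_error = err
            elapsed = clock() - started
            # A slow failure (such as a timeout) has already waited long
            # enough; only fast failures need the extra delay. No delay is
            # added after the final attempt.
            if attempt < max_attempts and elapsed < min_interval:
                sleep(min_interval - elapsed)
    raise last_error
```

With this shape, an attempt that fails after a long timeout is retried right away, while an attempt that fails instantly is followed by a pause, so a brief network interruption has a chance to clear before the attempt limit is reached.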



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org