Posted to issues@spark.apache.org by "Xingbo Jiang (Jira)" <ji...@apache.org> on 2019/11/14 00:17:00 UTC

[jira] [Resolved] (SPARK-29287) Executors should not receive any offers before they are actually constructed

     [ https://issues.apache.org/jira/browse/SPARK-29287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xingbo Jiang resolved SPARK-29287.
----------------------------------
    Fix Version/s: 3.0.0
         Assignee: Kent Yao
       Resolution: Done

Fixed by https://github.com/apache/spark/pull/25964

> Executors should not receive any offers before they are actually constructed
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-29287
>                 URL: https://issues.apache.org/jira/browse/SPARK-29287
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Executors send RegisterExecutor messages to the driver in onStart.
> The driver puts the executor data in “the ready to serve map” if it can, then sends RegisteredExecutor back to the executor. From this point the driver can make offers to this executor.
> But the executor is not fully constructed yet. When it receives RegisteredExecutor, it starts to construct itself: it initializes the block manager, possibly registers with the local shuffle server (with retries), and then starts heartbeating to the driver.
> A task allocated during this window may fail if the executor fails to start or cannot start heartbeating to the driver in time.
> Worse, when dynamic allocation and blacklisting are both enabled and the number of running executors has dropped to the configured minimum, an error on an executor that received tasks before it was fully constructed may block the application or tear it down.
>  
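
The race described above, and one way to close it, can be illustrated with a toy model. This is a minimal sketch in Python rather than Spark's Scala, and it is not taken from the linked pull request. RegisterExecutor and RegisteredExecutor are the message names from the description; the `Driver` class, the `executor_ready` call, and the `free_cores` bookkeeping are hypothetical names introduced only for illustration. The assumed idea is that the driver records an executor at registration but advertises no capacity for it until the executor reports that construction has finished.

```python
from dataclasses import dataclass


@dataclass
class ExecutorData:
    total_cores: int
    # Assumption for this sketch: capacity stays at zero until the
    # executor reports that it is fully constructed.
    free_cores: int = 0


class Driver:
    """Toy driver that tracks registered executors (hypothetical model)."""

    def __init__(self):
        self.executors = {}

    def register_executor(self, executor_id, total_cores):
        # Handles RegisterExecutor: remember the executor, offer nothing yet.
        self.executors[executor_id] = ExecutorData(total_cores)
        return "RegisteredExecutor"

    def executor_ready(self, executor_id):
        # Hypothetical readiness message, sent by the executor once the
        # block manager, shuffle registration, and heartbeats are set up.
        data = self.executors.get(executor_id)
        if data is not None:
            data.free_cores = data.total_cores

    def make_offers(self):
        # Only executors with free cores are offered to the scheduler.
        return {
            executor_id: data.free_cores
            for executor_id, data in self.executors.items()
            if data.free_cores > 0
        }


driver = Driver()
driver.register_executor("exec-1", total_cores=4)
offers_before = driver.make_offers()   # no offers while still constructing
driver.executor_ready("exec-1")
offers_after = driver.make_offers()    # offers appear once the executor is ready
```

In this model a task can never be scheduled in the window between RegisteredExecutor and the end of executor construction, because the executor has no free cores until it says it is ready.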



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org