You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Kent Yao (Jira)" <ji...@apache.org> on 2019/09/29 08:47:00 UTC

[jira] [Created] (SPARK-29287) Executors should not receive any offers before they are actually constructed

Kent Yao created SPARK-29287:
--------------------------------

             Summary: Executors should not receive any offers before they are actually constructed
                 Key: SPARK-29287
                 URL: https://issues.apache.org/jira/browse/SPARK-29287
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.4.4
            Reporter: Kent Yao


The executors send RegisterExecutor messages to the driver when onStart.

The driver put the executor data in “the ready to serve map” if it could be, then send RegisteredExecutor back to the executor.  The driver now can make an offer to this executor.

But the executor is not fully constructed yet. When it received RegisteredExecutor, it start to construct itself, initializing block manager, maybe register to the local shuffle server in the way of retrying, then start the heart beating to driver ... 

The task allocated here may fail if the executor fails to start or cannot get heart beating to the driver in time.

Sometimes, even worse, when dynamic allocation and blacklisting is enabled and when the runtime executor number down to min executor setting, and those executors receive tasks before fully constructed and if any error happens, the application may be blocked or tear down. 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org