Posted to issues@spark.apache.org by "Jie Huang (JIRA)" <ji...@apache.org> on 2015/08/03 08:45:04 UTC

[jira] [Created] (SPARK-9552) Add force control for killExecutors to avoid false killing for those busy executors

Jie Huang created SPARK-9552:
--------------------------------

             Summary: Add force control for killExecutors to avoid false killing for those busy executors
                 Key: SPARK-9552
                 URL: https://issues.apache.org/jira/browse/SPARK-9552
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: 1.4.1, 1.4.0
            Reporter: Jie Huang


With dynamic allocation enabled, busy executors are sometimes killed by mistake. An executor that has already been assigned tasks can still be killed for having been idle long enough (say, 60 seconds). The root cause is that the task-launch listener event is delivered asynchronously.

For example, an executor is being assigned tasks, but the listener notification has not been sent out yet. Meanwhile, the dynamic allocation idle timeout for that executor expires (e.g., after 60 seconds), which triggers a killExecutor request at the same time.

In other words, the idle timer expires before the listener event arrives.
The task then tries to run on the killed (or being killed) executor, which eventually leads to task failure.
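
The ordering problem described above can be sketched with a small toy model. This is only an illustration in Python, not Spark code: the class AllocationManagerModel, the listener_queue, and the executor and task ids are all hypothetical names, and the 60-second timeout is just the example value from this report.

```python
from collections import deque

IDLE_TIMEOUT = 60  # seconds; the example value used in this report


class AllocationManagerModel:
    """Hypothetical stand-in for the dynamic allocation idle bookkeeping."""

    def __init__(self):
        self.idle_since = {}  # executor id -> time it became idle

    def on_executor_idle(self, executor_id, now):
        self.idle_since[executor_id] = now

    def on_task_start(self, executor_id):
        # A task-start event clears the idle timer for that executor.
        self.idle_since.pop(executor_id, None)

    def expired(self, now):
        return [e for e, t in self.idle_since.items() if now - t >= IDLE_TIMEOUT]


manager = AllocationManagerModel()
listener_queue = deque()  # models asynchronous delivery: events wait here
running_tasks = {}        # the scheduler's own (up-to-date) view

manager.on_executor_idle("exec-1", now=0)

# Just before the timeout, the scheduler assigns a task. The listener
# event is only queued; the manager has not seen it yet.
running_tasks["exec-1"] = ["task-7"]
listener_queue.append(("task_start", "exec-1"))

# The idle check runs before the queue is drained.
to_kill = manager.expired(now=60)
falsely_killed = [e for e in to_kill if running_tasks.get(e)]

# The event is delivered too late to prevent the kill decision.
while listener_queue:
    _, executor_id = listener_queue.popleft()
    manager.on_task_start(executor_id)
```

In this model, "exec-1" ends up in falsely_killed even though the scheduler had already placed a task on it, because the timer check only sees state that arrived through the delayed event queue.
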
Here is the proposal to fix it: add a force flag to killExecutor. If force is not set (i.e., false), killExecutor checks whether the executor to be killed is idle or busy. If the executor currently has tasks assigned, it is not killed, and the call returns false to indicate that the kill failed. Dynamic allocation should leave force killing off (i.e., force = false), so an attempt to kill a busy executor fails. In that case the executor's idle timer stays in place. When the task assignment event arrives later, the idle timer is removed accordingly. This avoids killing busy executors by mistake under dynamic allocation.

For all other usages, end users can decide for themselves whether to use force killing. If the option is turned on, killExecutor performs the kill without any status check.
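
A minimal sketch of the proposed force control follows, again as a toy Python model rather than the actual Spark implementation. The class SchedulerBackendModel, its methods, and the force parameter's default are assumptions made for illustration; only the behavior (refuse to kill a busy executor unless forced, and report the outcome as a boolean) comes from the proposal.

```python
class SchedulerBackendModel:
    """Hypothetical model of a backend that tracks task assignments."""

    def __init__(self):
        self.running_tasks = {}  # executor id -> set of assigned task ids
        self.killed = set()

    def assign(self, executor_id, task_id):
        self.running_tasks.setdefault(executor_id, set()).add(task_id)

    def is_busy(self, executor_id):
        return bool(self.running_tasks.get(executor_id))

    def kill_executor(self, executor_id, force=False):
        # Without force, a busy executor is left alone and the caller
        # is told that the kill did not happen.
        if not force and self.is_busy(executor_id):
            return False
        # With force (or for an idle executor), kill without further checks.
        self.killed.add(executor_id)
        self.running_tasks.pop(executor_id, None)
        return True


backend = SchedulerBackendModel()
backend.assign("exec-1", "task-7")

# Dynamic allocation path: force stays off, so the busy executor survives.
busy_result = backend.kill_executor("exec-1", force=False)
idle_result = backend.kill_executor("exec-2", force=False)

# Explicit user request: force skips the status check.
forced_result = backend.kill_executor("exec-1", force=True)
```

Because the busy check reads the backend's own assignment state rather than the delayed listener events, the non-forced path does not depend on the order in which those events arrive.
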



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org