Posted to issues@spark.apache.org by "hujiahua (Jira)" <ji...@apache.org> on 2021/12/20 03:16:00 UTC

[jira] [Updated] (SPARK-37688) ExecutorMonitor should ignore SparkListenerBlockUpdated event if executor was not active

     [ https://issues.apache.org/jira/browse/SPARK-37688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hujiahua updated SPARK-37688:
-----------------------------
    Description: 
When an executor is no longer alive and ExecutorMonitor receives a late SparkListenerBlockUpdated event for it, the `onBlockUpdated` handler calls `ensureExecutorIsTracked`, which creates a new executor tracker with UNKNOWN_RESOURCE_PROFILE_ID for the dead executor. ExecutorAllocationManager does not remove executors with UNKNOWN_RESOURCE_PROFILE_ID, so the dead executor keeps occupying an executor slot and a new executor cannot be created.

The ExecutorAllocationManager log was like this:
21/08/24 15:38:14 WARN [spark-dynamic-executor-allocation] ExecutorAllocationManager: Not removing executor 34324 because the ResourceProfile was UNKNOWN!
21/08/24 15:38:14 WARN [spark-dynamic-executor-allocation] ExecutorAllocationManager: Not removing executor 34324 because the ResourceProfile was UNKNOWN!
21/08/24 15:38:14 WARN [spark-dynamic-executor-allocation] ExecutorAllocationManager: Not removing executor 34324 because the ResourceProfile was UNKNOWN!
21/08/24 15:38:14 WARN [spark-dynamic-executor-allocation] ExecutorAllocationManager: Not removing executor 34324 because the ResourceProfile was UNKNOWN!

  was: When an executor is no longer alive and ExecutorMonitor receives a late SparkListenerBlockUpdated event for it, the `onBlockUpdated` handler calls `ensureExecutorIsTracked`, which creates a new executor tracker with UNKNOWN_RESOURCE_PROFILE_ID for the dead executor. ExecutorAllocationManager does not remove executors with UNKNOWN_RESOURCE_PROFILE_ID, so the dead executor keeps occupying an executor slot and a new executor cannot be created.


> ExecutorMonitor should ignore SparkListenerBlockUpdated event if executor was not active
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-37688
>                 URL: https://issues.apache.org/jira/browse/SPARK-37688
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.1.2
>            Reporter: hujiahua
>            Priority: Major
>
> When an executor is no longer alive and ExecutorMonitor receives a late SparkListenerBlockUpdated event for it, the `onBlockUpdated` handler calls `ensureExecutorIsTracked`, which creates a new executor tracker with UNKNOWN_RESOURCE_PROFILE_ID for the dead executor. ExecutorAllocationManager does not remove executors with UNKNOWN_RESOURCE_PROFILE_ID, so the dead executor keeps occupying an executor slot and a new executor cannot be created.
>
> The ExecutorAllocationManager log was like this:
> 21/08/24 15:38:14 WARN [spark-dynamic-executor-allocation] ExecutorAllocationManager: Not removing executor 34324 because the ResourceProfile was UNKNOWN!
> 21/08/24 15:38:14 WARN [spark-dynamic-executor-allocation] ExecutorAllocationManager: Not removing executor 34324 because the ResourceProfile was UNKNOWN!
> 21/08/24 15:38:14 WARN [spark-dynamic-executor-allocation] ExecutorAllocationManager: Not removing executor 34324 because the ResourceProfile was UNKNOWN!
> 21/08/24 15:38:14 WARN [spark-dynamic-executor-allocation] ExecutorAllocationManager: Not removing executor 34324 because the ResourceProfile was UNKNOWN!
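
The sequence described above can be replayed with a small toy model. This is only a sketch in Python, not Spark's Scala implementation: the class `ToyExecutorMonitor`, the `active` set, the `ignore_inactive` flag, and the numeric profile IDs are all hypothetical names and values chosen for illustration. It shows how a late block-update event could re-create a tracker with an unknown resource profile for an executor that was already removed, and how ignoring events for inactive executors (the behavior the issue title asks for) would avoid that.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder values; the real constants live in Spark and are
# not quoted in this issue.
UNKNOWN_RESOURCE_PROFILE_ID = -1
DEFAULT_RESOURCE_PROFILE_ID = 0


@dataclass
class ToyExecutorMonitor:
    """Simplified model of an executor monitor (not Spark's actual code)."""

    # True models the proposed behavior: drop events for inactive executors.
    ignore_inactive: bool
    # executor id -> resource profile id
    trackers: dict = field(default_factory=dict)
    # executors currently known to be alive (an assumption of this sketch)
    active: set = field(default_factory=set)

    def on_executor_added(self, exec_id, profile_id=DEFAULT_RESOURCE_PROFILE_ID):
        self.active.add(exec_id)
        self.trackers[exec_id] = profile_id

    def on_executor_removed(self, exec_id):
        self.active.discard(exec_id)
        self.trackers.pop(exec_id, None)

    def on_block_updated(self, exec_id):
        if self.ignore_inactive and exec_id not in self.active:
            return  # proposed behavior: ignore late events for dead executors
        # Behavior described in the issue: ensure a tracker exists, creating
        # one with an unknown resource profile if none is present.
        self.trackers.setdefault(exec_id, UNKNOWN_RESOURCE_PROFILE_ID)

    def stuck_executors(self):
        """Tracked executors that an allocation manager refusing to remove
        unknown-profile executors would never release."""
        return [e for e, rp in self.trackers.items()
                if rp == UNKNOWN_RESOURCE_PROFILE_ID]


def replay(ignore_inactive):
    """Add an executor, remove it, then deliver a late block update."""
    monitor = ToyExecutorMonitor(ignore_inactive=ignore_inactive)
    monitor.on_executor_added("34324")
    monitor.on_executor_removed("34324")
    monitor.on_block_updated("34324")  # late event for a dead executor
    return monitor.stuck_executors()


if __name__ == "__main__":
    print("without the check:", replay(ignore_inactive=False))
    print("with the check:   ", replay(ignore_inactive=True))
```

In this model, the run without the check leaves executor 34324 tracked under the unknown profile, which corresponds to the repeated "Not removing executor" warnings in the log, while the run with the check leaves nothing tracked. How Spark itself would determine whether an executor is active is not stated in this message, so the `active` set here is purely an assumption.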


