Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/02/03 11:25:52 UTC
[jira] [Resolved] (SPARK-19438) executorDataMap should be guarded
by CoarseGrainedSchedulerBackend.this.synchronized
[ https://issues.apache.org/jira/browse/SPARK-19438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-19438.
-------------------------------
Resolution: Not A Problem
> executorDataMap should be guarded by CoarseGrainedSchedulerBackend.this.synchronized
> -------------------------------------------------------------------------------------
>
> Key: SPARK-19438
> URL: https://issues.apache.org/jira/browse/SPARK-19438
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.0
> Reporter: jin xing
>
> Currently, when *RegisterExecutor* is handled in *CoarseGrainedSchedulerBackend*, *executorDataMap* is guarded by *CoarseGrainedSchedulerBackend.this.synchronized* only when it is updated, not when it is first checked with *contains*. This can leave *numPendingExecutors* incorrect.
> The code looks like this:
> {code}
> if (executorDataMap.contains(executorId)) {
>   executorRef.send(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
>   context.reply(true)
> } else {
>   ...
>   CoarseGrainedSchedulerBackend.this.synchronized {
>     executorDataMap.put(executorId, data)
>     if (currentExecutorIdCounter < executorId.toInt) {
>       currentExecutorIdCounter = executorId.toInt
>     }
>     if (numPendingExecutors > 0) {
>       numPendingExecutors -= 1
>       logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
>     }
>   }
> {code}
> Consider SPARK-19437 and the following scenario:
> An executor sends *RegisterExecutor* twice via *askWithRetry*, with a very short interval between the two messages. Both messages may then take the *else* branch, so *numPendingExecutors* is decremented twice. Currently, *askWithRetry* is used for *RegisterExecutor* only in some unit tests, but it still makes sense to make the handling of *RegisterExecutor* more robust.
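
For illustration only, here is a minimal sketch of the guarding the description asks for: the duplicate check and the update happen under the same lock, so a repeated registration cannot decrement the pending count twice. It assumes that two registrations for the same executor can be handled concurrently, which is the premise of the report. `RegistrationTracker`, `register`, and `pending` are hypothetical names for a stand-in class; this is not Spark's actual code.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the scheduler backend's bookkeeping; not Spark's class.
class RegistrationTracker(initialPending: Int) {
  private val executorDataMap = mutable.HashMap.empty[String, String]
  private var numPendingExecutors: Int = initialPending

  /**
   * Registers an executor. Returns true if it was newly registered and
   * false if the ID was already present. The contains check and the
   * update both run inside one synchronized block, so they are atomic
   * with respect to other callers of this method.
   */
  def register(executorId: String, data: String): Boolean = synchronized {
    if (executorDataMap.contains(executorId)) {
      false // duplicate executor ID: leave all counters untouched
    } else {
      executorDataMap.put(executorId, data)
      if (numPendingExecutors > 0) {
        numPendingExecutors -= 1
      }
      true
    }
  }

  /** Current number of pending executors, read under the same lock. */
  def pending: Int = synchronized { numPendingExecutors }
}
```

With this shape, a second `register` call for the same ID returns false and leaves the pending count unchanged, whichever thread it arrives on.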