Posted to reviews@spark.apache.org by mridulm <gi...@git.apache.org> on 2018/06/19 02:53:59 UTC
[GitHub] spark pull request #21577: [SPARK-24589][core] Correctly identify tasks in o...
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/21577#discussion_r196282241
--- Diff: core/src/main/scala/org/apache/spark/scheduler/OutputCommitCoordinator.scala ---
@@ -109,20 +116,21 @@ private[spark] class OutputCommitCoordinator(conf: SparkConf, isDriver: Boolean)
* @param maxPartitionId the maximum partition id that could appear in this stage's tasks (i.e.
* the maximum possible value of `context.partitionId`).
*/
- private[scheduler] def stageStart(stage: StageId, maxPartitionId: Int): Unit = synchronized {
+ private[scheduler] def stageStart(stage: Int, maxPartitionId: Int): Unit = synchronized {
stageStates(stage) = new StageState(maxPartitionId + 1)
--- End diff --
There are two cases here, neither of which is handled in the existing/earlier code.
Handled in PR:
* Stage S1 attempt A1 launched.
* Task T1_1 launched for partition P1.
* A1 fails
* Stage S1 attempt A2 launched.
* Task T1_2 launched for partition P1.
* T1_1 finishes, and is allowed to commit.
IMO not handled in PR:
* Stage S1 attempt A1 launched.
* Task T1_1.1 launched for partition P1
* Task T1_1.2 launched for partition P1 (speculative)
* Task T1_1.1 committed.
* A1 fails
* Stage S1 attempt A2 launched for some other pending partitions.
* Task T1_1.2 wants to commit.
T1_1.2 will be allowed to commit.
Now we have two tasks for the same partition committing successfully.
Did I miss something here?
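
To make the second case concrete, here is a toy model of the concern. It is not Spark's actual `OutputCommitCoordinator`: `ToyCoordinator`, `TaskId` and the `resetOnStageStart` flag are hypothetical names used only for illustration. The sketch assumes that the first task to ask for a partition becomes its authorized committer, and that `stageStart` may replace the per-stage state, as the assignment in the diff above does.

```scala
import scala.collection.mutable

// Hypothetical identifier for one task attempt; not a Spark type.
final case class TaskId(stageAttempt: Int, taskAttempt: Int)

// Toy coordinator: the first task to ask for a partition wins the commit.
// `resetOnStageStart` is an assumption used to contrast two behaviours.
final class ToyCoordinator(resetOnStageStart: Boolean) {
  // stage id -> (partition id -> authorized committer)
  private val stageStates = mutable.Map.empty[Int, mutable.Map[Int, TaskId]]

  def stageStart(stage: Int): Unit = synchronized {
    // With resetOnStageStart = true, a new stage attempt discards any
    // record of which partitions were already committed.
    if (resetOnStageStart || !stageStates.contains(stage)) {
      stageStates(stage) = mutable.Map.empty[Int, TaskId]
    }
  }

  def canCommit(stage: Int, partition: Int, task: TaskId): Boolean = synchronized {
    val committers = stageStates(stage)
    committers.get(partition) match {
      case Some(winner) => winner == task // only the recorded winner may commit
      case None =>
        committers(partition) = task      // first requester becomes the winner
        true
    }
  }
}

object SpeculativeCommitDemo {
  // Replays the second scenario and reports whether T1_1.2 may commit.
  def secondCommitAllowed(resetOnStageStart: Boolean): Boolean = {
    val coordinator = new ToyCoordinator(resetOnStageStart)
    val s1 = 1
    val p1 = 1
    val t1 = TaskId(stageAttempt = 1, taskAttempt = 1) // T1_1.1
    val t2 = TaskId(stageAttempt = 1, taskAttempt = 2) // T1_1.2 (speculative)

    coordinator.stageStart(s1)                 // S1 attempt A1 launched
    assert(coordinator.canCommit(s1, p1, t1))  // T1_1.1 committed
    coordinator.stageStart(s1)                 // A1 fails, A2 launched
    coordinator.canCommit(s1, p1, t2)          // T1_1.2 wants to commit
  }

  def main(args: Array[String]): Unit = {
    println(secondCommitAllowed(resetOnStageStart = true))  // true: double commit
    println(secondCommitAllowed(resetOnStageStart = false)) // false: commit denied
  }
}
```

In this model the double commit appears only when the state for the stage is recreated on the new attempt. Whether the PR behaves this way depends on how it tracks committers across stage attempts, which the toy does not try to reproduce.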