Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/27 23:51:18 UTC

[GitHub] [spark] jiangxb1987 removed a comment on issue #26975: [SPARK-30325][CORE] markPartitionCompleted cause task status inconsistent

jiangxb1987 removed a comment on issue #26975: [SPARK-30325][CORE] markPartitionCompleted cause task status inconsistent
URL: https://github.com/apache/spark/pull/26975#issuecomment-569365197
 
 
   There are multiple corner cases not handled by the current solution:
   Imagine we have two TaskSetManagers (TSMs), M1 and M2, working on the same Stage, and the corresponding tasks for a specific partition are denoted T1 and T2:
   1. T1 and T2 might be scheduled on different executors (E1 and E2); T1 has finished but T2 is still running. Then E2 is lost. With the approach suggested by this PR, the partition in M2 will be marked as not successful and a new pending task will be added, which is unnecessary because the shuffle files are on E1;
   2. T1 and T2 might be scheduled on the same executor; T1 has finished but T2 is still running. Then the executor is lost. Since T2 is still running, the partition will not be marked as not successful. After a while, another task may finish and mark the TSM as finished even though the shuffle files have been lost, which leads to a new regression.
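
To make the expected outcome of the two cases concrete, here is a minimal sketch. It is not Spark code: `Scenario` and `needsRecompute` are hypothetical names, and it assumes the shuffle output lives only on the executor that wrote it (no external shuffle service). It only states what the scheduler should ideally conclude in each case.

```scala
object ExecutorLossCases {
  // Hypothetical record: the executor where the finished attempt (T1) wrote
  // its shuffle output, and the executor where the other attempt (T2) runs.
  final case class Scenario(outputExecutor: String, runningExecutor: String)

  // Assumption: a partition must be recomputed only when the lost executor
  // is the one holding its shuffle output.
  def needsRecompute(s: Scenario, lostExecutor: String): Boolean =
    s.outputExecutor == lostExecutor

  // Case 1: T1 finished on E1, T2 still running on E2, then E2 is lost.
  // The output on E1 survives, so no new pending task should be needed.
  val case1: Boolean = needsRecompute(Scenario("E1", "E2"), lostExecutor = "E2") // false

  // Case 2: T1 finished and T2 is running on the same executor E1, then E1 is lost.
  // The output is gone, so the partition must not stay marked as successful.
  val case2: Boolean = needsRecompute(Scenario("E1", "E1"), lostExecutor = "E1") // true
}
```

As described above, the approach in this PR resubmits in case 1 and does not reset the partition in case 2, which is the opposite of both results.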
   
   I haven't found a solution yet. I'm wondering whether we can put the successful task's information into `taskInfos` inside `markPartitionCompleted`; if this is possible, the second problem I mentioned above could probably be resolved.
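
A rough sketch of that idea, using a toy model rather than the real scheduler. `ToyTaskSetManager` and `SketchTaskInfo` are hypothetical, and the extra `winner` parameter on `markPartitionCompleted` is an assumption made for illustration, not the actual Spark signature. The point is only that once the successful attempt is recorded in `taskInfos`, executor-loss handling can find the partition whose output was on the lost executor.

```scala
import scala.collection.mutable

object MarkPartitionCompletedSketch {
  // Hypothetical, stripped-down stand-in for Spark's TaskInfo.
  final case class SketchTaskInfo(taskId: Long, partition: Int, executorId: String)

  // Hypothetical toy model of a TaskSetManager; not the real class.
  final class ToyTaskSetManager(numPartitions: Int) {
    val successful: Array[Boolean] = Array.fill(numPartitions)(false)
    val taskInfos: mutable.Map[Long, SketchTaskInfo] = mutable.HashMap.empty
    val pending: mutable.Set[Int] = mutable.LinkedHashSet.empty

    // Assumed variant: also record which attempt (possibly from another TSM)
    // completed the partition, so its executor is known later.
    def markPartitionCompleted(partition: Int, winner: SketchTaskInfo): Unit = {
      if (!successful(partition)) {
        successful(partition) = true
        taskInfos(winner.taskId) = winner
      }
    }

    // On executor loss, reset every successful partition whose recorded
    // output lived on the lost executor and queue it to run again.
    def executorLost(execId: String): Unit = {
      for (info <- taskInfos.values
           if info.executorId == execId && successful(info.partition)) {
        successful(info.partition) = false
        pending += info.partition
      }
    }
  }
}
```

In this toy model, if M2 records T1's success on an executor and that executor is then lost, the partition goes back to pending instead of staying marked as successful. Whether the real `taskInfos` can safely hold an entry for a task owned by another TSM is exactly the open question.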
