Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/24 06:09:19 UTC

[GitHub] [spark] itskals edited a comment on issue #26975: [SPARK-30325][CORE] Stage retry and executor crash cause app hung up forever

URL: https://github.com/apache/spark/pull/26975#issuecomment-568664326

I was of the opinion that while a task started by one stage attempt is still in progress, no other stage attempt should retry that partition until the task's fate is known.
To tell whether a partition is already assigned to some task, the MapStatus entry for that partition could record an intermediate state. (As of now the MapStatus entry is either null or filled, effectively a boolean. I think we can add a third state.)

This proposed model also saves compute resources: there is no need to start a redundant computation if one stage attempt is already working on the partition. Speculation would still be allowed, since it happens within the same stage attempt.
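
To make the three-state idea concrete, here is a minimal sketch, written in Python for brevity rather than in Spark's Scala. The names `PartitionState`, `PartitionTracker`, `try_launch`, `task_succeeded` and `task_lost` are all hypothetical and are not Spark APIs; the sketch only models the bookkeeping and assumes the scheduler reports every task copy's outcome.

```python
from enum import Enum
from typing import List, Optional


class PartitionState(Enum):
    EMPTY = "empty"              # no map output, no task running (today's null entry)
    IN_PROGRESS = "in_progress"  # a task from some stage attempt owns the partition
    FILLED = "filled"            # map output registered (today's filled entry)


class PartitionTracker:
    """Hypothetical per-stage bookkeeping for the proposed third state."""

    def __init__(self, num_partitions: int) -> None:
        self._state: List[PartitionState] = [PartitionState.EMPTY] * num_partitions
        self._owner: List[Optional[int]] = [None] * num_partitions
        self._running: List[int] = [0] * num_partitions  # live task copies

    def try_launch(self, partition: int, stage_attempt: int) -> bool:
        """Return True if this stage attempt may start a task for the partition."""
        state = self._state[partition]
        if state is PartitionState.FILLED:
            return False  # output already available
        if state is PartitionState.IN_PROGRESS and self._owner[partition] != stage_attempt:
            return False  # another stage attempt is working on it; fate unknown
        # Either the partition is free, or this is a speculative copy
        # launched by the stage attempt that already owns it.
        self._state[partition] = PartitionState.IN_PROGRESS
        self._owner[partition] = stage_attempt
        self._running[partition] += 1
        return True

    def task_succeeded(self, partition: int) -> None:
        """Fate known: output registered, so no further attempts are needed."""
        self._state[partition] = PartitionState.FILLED
        self._owner[partition] = None
        self._running[partition] = 0

    def task_lost(self, partition: int) -> None:
        """Fate known: one copy failed or its executor crashed."""
        if self._state[partition] is not PartitionState.IN_PROGRESS:
            return
        self._running[partition] -= 1
        if self._running[partition] == 0:
            # No copy is left, so any stage attempt may retry the partition.
            self._state[partition] = PartitionState.EMPTY
            self._owner[partition] = None


tracker = PartitionTracker(num_partitions=2)
print(tracker.try_launch(0, stage_attempt=0))  # first launch by attempt 0
print(tracker.try_launch(0, stage_attempt=1))  # attempt 1 is held back
print(tracker.try_launch(0, stage_attempt=0))  # speculative copy, same attempt
```

The important transition is `task_lost`: the partition goes back to the empty state only once every running copy is accounted for, which is the "fate is known" condition above. An executor crash that is never reported would leave the entry stuck in the in-progress state, so a real implementation would need to tie this transition to executor-loss handling.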

Do let me know if there are any shortcomings in this reasoning.

@cloud-fan @seayoun @jiangxb1987
