Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/03/03 19:55:00 UTC

[GitHub] [spark] hiboyang commented on pull request #31715: [SPARK-34601][SHUFFLE] Add spark.shuffle.markFileLostOnExecutorLost to not delete shuffle file on executor lost event

hiboyang commented on pull request #31715:
URL: https://github.com/apache/spark/pull/31715#issuecomment-790013265


   > I do not think this flag (indicating whether to unregister all the shuffle blocks created by the failed executor) must be set at the application level (the way this PR does).
   > 
   > I think this must be a property kept for the replicated shuffle block so somewhere close to `MapStatus`.
   > 
   > Imagine a mixed solution where the disaggregated storage is just a fallback and locally stored blocks can still be used.
   > Then it makes sense to unregister the shuffle blocks for the failed executors while leaving access to the blocks in disaggregated storage intact.
   
   Yeah, if a user runs a mixed solution where the disaggregated storage is just a fallback, it is better to have `MapStatus` track the availability of shuffle blocks. This PR does not block that scenario.
   
   There is also a non-mixed solution where people rely entirely on a remote shuffle service (like Facebook's or Uber's). In that case, people only need to set this flag at the application level.
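   For the fully-remote case, setting an application-level Spark conf looks like the sketch below. The flag name comes from this PR's title; the value semantics and default are defined by SPARK-34601 itself, and the application class and jar names here are hypothetical:

   ```shell
   # Sketch: keep shuffle files registered on executor loss, for a deployment
   # that relies entirely on a remote shuffle service (per SPARK-34601).
   # com.example.MyJob and my-job.jar are placeholders, not from the PR.
   spark-submit \
     --conf spark.shuffle.markFileLostOnExecutorLost=false \
     --class com.example.MyJob \
     my-job.jar
   ```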
   
   


