Posted to issues@spark.apache.org by "Ye Zhou (Jira)" <ji...@apache.org> on 2021/10/16 06:45:00 UTC
[jira] [Created] (SPARK-37023) Avoid fetching merge status when
shuffleMergeEnabled is false for a shuffleDependency during retry
Ye Zhou created SPARK-37023:
-------------------------------
Summary: Avoid fetching merge status when shuffleMergeEnabled is false for a shuffleDependency during retry
Key: SPARK-37023
URL: https://issues.apache.org/jira/browse/SPARK-37023
Project: Spark
Issue Type: Sub-task
Components: Shuffle
Affects Versions: 3.2.0
Reporter: Ye Zhou
The assertion below in MapOutputTracker.getMapSizesByExecutorId is not guaranteed to hold:
{code:java}
assert(mapSizesByExecutorId.enableBatchFetch == true){code}
The reason is that in some stage retry cases, shuffleDependency.shuffleMergeEnabled is set to false, but a mergeStatus still exists because the driver has already collected the merge statuses for that shuffle dependency. In that case, the current implementation sets enableBatchFetch to false because merge statuses are present, so the assertion fails.
Details can be found here:
[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/MapOutputTracker.scala#L1492]
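We should improve the implementation here so that the merge status is not fetched when shuffleMergeEnabled is false for the shuffleDependency.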
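The scenario can be illustrated with a small, self-contained Scala sketch. This is a simplified model, not Spark's actual code: MergeStatus, MapSizes, convert, fetchAlways and fetchGuarded are hypothetical names introduced only for illustration, and the guard shown is just one possible shape for the improvement.

```scala
// Hypothetical, simplified stand-ins; these are NOT Spark's real classes.
final case class MergeStatus(reduceId: Int)
final case class MapSizes(enableBatchFetch: Boolean)

object MergeStatusSketch {
  // Models the behaviour described in this issue: if any merge status is
  // present, batch fetch is disabled.
  def convert(mergeStatuses: Option[Seq[MergeStatus]]): MapSizes =
    MapSizes(enableBatchFetch = !mergeStatuses.exists(_.nonEmpty))

  // Problem case: merge statuses retained by the driver are used even though
  // shuffleMergeEnabled is false (the flag is deliberately ignored here).
  def fetchAlways(shuffleMergeEnabled: Boolean,
                  driverStatuses: Seq[MergeStatus]): MapSizes =
    convert(Some(driverStatuses))

  // Possible fix (an assumption, not the committed change): skip the merge
  // statuses entirely when shuffleMergeEnabled is false.
  def fetchGuarded(shuffleMergeEnabled: Boolean,
                   driverStatuses: Seq[MergeStatus]): MapSizes =
    convert(if (shuffleMergeEnabled) Some(driverStatuses) else None)
}
```

With shuffleMergeEnabled = false and a non-empty list of retained merge statuses, fetchAlways yields enableBatchFetch = false, which would trip the assertion, while fetchGuarded keeps enableBatchFetch = true.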
--
This message was sent by Atlassian Jira
(v8.3.4#803005)