Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/04/27 07:56:00 UTC

[jira] [Commented] (SPARK-35239) Coalesce shuffle partition should handle empty input RDD

    [ https://issues.apache.org/jira/browse/SPARK-35239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333016#comment-17333016 ] 

Apache Spark commented on SPARK-35239:
--------------------------------------

User 'ulysses-you' has created a pull request for this issue:
https://github.com/apache/spark/pull/32362

>  Coalesce shuffle partition should handle empty input RDD
> ---------------------------------------------------------
>
>                 Key: SPARK-35239
>                 URL: https://issues.apache.org/jira/browse/SPARK-35239
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: ulysses you
>            Priority: Minor
>
> If a shuffle stage's input RDD has no partitions, its map output statistics will be null. If the input RDDs of all shuffle stages are empty, we skip coalescing entirely and lose the chance to coalesce partitions.
>  
> We can simply create an empty partition for these custom shuffle readers to reduce the partition number.
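
The idea in the description can be illustrated with a small, self-contained sketch. This is not Spark's code and not the code in the linked pull request; it is a plain Python model under stated assumptions. The function name coalesce_partitions, the (start, end) tuple used as a partition spec, and the greedy size-based merging are all hypothetical choices made for illustration. It also assumes that every stage with statistics reports the same number of reducer partitions.

```python
from typing import List, Optional, Tuple

# Hypothetical partition spec: reducer index range [start, end).
PartitionSpec = Tuple[int, int]


def coalesce_partitions(
    stage_stats: List[Optional[List[int]]],
    target_size: int,
) -> List[PartitionSpec]:
    """Greedily merge adjacent reducer partitions up to target_size bytes.

    stage_stats has one entry per shuffle stage: a list of bytes per reducer
    partition, or None when that stage's input RDD had no partitions
    (modelling the "null map output statistics" case).
    """
    valid = [stats for stats in stage_stats if stats is not None]
    if not valid:
        # All stages had empty input. Rather than skipping coalescing,
        # return a single empty spec so the reader gets one partition.
        return [(0, 0)]

    # Assumption: all valid stages share the same reducer partition count.
    num_reducers = len(valid[0])
    totals = [sum(stats[i] for stats in valid) for i in range(num_reducers)]

    specs: List[PartitionSpec] = []
    start, size = 0, 0
    for i, total in enumerate(totals):
        # Close the current group if adding this partition would exceed
        # the target and the group already holds at least one partition.
        if i > start and size + total > target_size:
            specs.append((start, i))
            start, size = i, 0
        size += total
    specs.append((start, num_reducers))
    return specs
```

In this model, stages without statistics are ignored when sizing the groups, and the all-empty case yields one empty spec instead of leaving the original partition count untouched.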



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
