Posted to issues@spark.apache.org by "Yuming Wang (Jira)" <ji...@apache.org> on 2020/03/23 05:19:00 UTC

[jira] [Commented] (SPARK-31220) distribute by obeys spark.sql.adaptive.coalescePartitions.initialPartitionNum when spark.sql.adaptive.enabled

    [ https://issues.apache.org/jira/browse/SPARK-31220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064539#comment-17064539 ] 

Yuming Wang commented on SPARK-31220:
-------------------------------------

I'm working on this.

> distribute by obeys spark.sql.adaptive.coalescePartitions.initialPartitionNum when spark.sql.adaptive.enabled
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-31220
>                 URL: https://issues.apache.org/jira/browse/SPARK-31220
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> {code:scala}
> spark.sql("CREATE TABLE spark_31220(id int)")
> spark.sql("set spark.sql.adaptive.coalescePartitions.initialPartitionNum=1000")
> spark.sql("set spark.sql.adaptive.enabled=true")
> {code}
> {noformat}
> scala> spark.sql("SELECT id from spark_31220 GROUP BY id").explain
> == Physical Plan ==
> AdaptiveSparkPlan(isFinalPlan=false)
> +- HashAggregate(keys=[id#5], functions=[])
>    +- Exchange hashpartitioning(id#5, 1000), true, [id=#171]
>       +- HashAggregate(keys=[id#5], functions=[])
>          +- FileScan parquet default.spark_31220[id#5] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/apache-spark/spark-warehouse/spark_31220], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int>
> scala> spark.sql("SELECT id from spark_31220 DISTRIBUTE BY id").explain
> == Physical Plan ==
> AdaptiveSparkPlan(isFinalPlan=false)
> +- Exchange hashpartitioning(id#5, 200), false, [id=#179]
>    +- FileScan parquet default.spark_31220[id#5] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/apache-spark/spark-warehouse/spark_31220], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int>
> {noformat}


