Posted to issues@spark.apache.org by "Manu Zhang (Jira)" <ji...@apache.org> on 2020/09/18 04:03:00 UTC

[jira] [Created] (SPARK-32932) AQE local reader optimizer breaks repartitioning for dynamic partition overwrite

Manu Zhang created SPARK-32932:
----------------------------------

             Summary: AQE local reader optimizer breaks repartitioning for dynamic partition overwrite
                 Key: SPARK-32932
                 URL: https://issues.apache.org/jira/browse/SPARK-32932
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Manu Zhang


With AQE enabled, the local shuffle reader optimization breaks a user's explicit repartitioning ahead of a dynamic partition overwrite, as the following test case shows.
{code:java}
test("repartition with local reader") {
  withSQLConf(SQLConf.PARTITION_OVERWRITE_MODE.key -> PartitionOverwriteMode.DYNAMIC.toString,
    SQLConf.SHUFFLE_PARTITIONS.key -> "5",
    SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true") {
    withTable("t") {
      val data = for (
        i <- 1 to 10;
        j <- 1 to 3
      ) yield (i, j)
      data.toDF("a", "b")
        .repartition($"b")
        .write
        .partitionBy("b")
        .mode("overwrite")
        .saveAsTable("t")
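      // b has 3 distinct values, so one file per partition directory is expected
      // when the repartition by b is respected.
      assert(spark.read.table("t").inputFiles.length == 3)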
    }
  }
}{code}
Coalescing shuffle partitions, another AQE optimization, could break the repartitioning in the same way.
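
To help isolate which rule is responsible, one option is to switch off the two optimizations named above while leaving AQE itself enabled, then rerun the write. The following is only a sketch: it assumes an active SparkSession named spark on Spark 3.0.x, and whether it restores the expected file count has not been verified in this report.
{code:java}
// Assumption: `spark` is an active SparkSession (Spark 3.0.x).
// Disable the local shuffle reader rule but keep AQE on.
spark.conf.set("spark.sql.adaptive.localShuffleReader.enabled", "false")
// Disable shuffle partition coalescing, the other rule mentioned above.
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "false")
{code}
Toggling the two settings one at a time should show whether each rule independently changes the number of files written per partition.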


