Posted to reviews@spark.apache.org by windpiger <gi...@git.apache.org> on 2017/01/16 06:26:29 UTC

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

GitHub user windpiger opened a pull request:

    https://github.com/apache/spark/pull/16593

    [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with create partitioned table

    ## What changes were proposed in this pull request?
    
    After [SPARK-19107](https://issues.apache.org/jira/browse/SPARK-19107), we can now treat hive as a data source and create hive tables with DataFrameWriter and Catalog. However, the support is not complete; there are still some cases we do not support.
    
    This PR makes DataFrameWriter.saveAsTable work with the hive format to create partitioned tables.
    
    ## How was this patch tested?
    Unit test added.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/windpiger/spark saveAsTableWithPartitionedTable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16593
    
----
commit 6c31d017324b3c7f310103d2d4b5138bbef4b463
Author: windpiger <so...@outlook.com>
Date:   2017-01-16T06:23:09Z

    [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with create partitioned table

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96805893
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -64,7 +77,7 @@ case class CreateHiveTableAsSelectCommand(
           val withSchema = if (withFormat.schema.isEmpty) {
             // Hive doesn't support specifying the column list for target table in CTAS
             // However we don't think SparkSQL should follow that.
    --- End diff --
    
    We need to update the above comment.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by windpiger <gi...@git.apache.org>.

Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Thanks all, let me summarize:
    1. no CTAS
    `
    create table t(a int, b int, c string, d string)
    using $provider
    partitioned by(d, c)
    `
    The schema order of the table in the catalog should be `a, b, d, c`.
    a) for a datasource table
    this is already ensured by `DataSource.getOrInferFileFormatSchema`:
    https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L182
    
    b) for a hive table
      As @lins05 commented, we currently do not handle this case; as suggested, we should
     add a new rule for it.
    
    2. CTAS
    `
    create table t
    using $provider
    partitioned by(d, c)
    select 1 as b, 2 as a, 'x' as c, 'y' as d
    `
    The schema order of the table in the catalog should be `b, a, d, c`.
    a) for a datasource table
    this is already ensured by creating the table with the updated schema:
    https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala#L159
    
    b) for a hive table
      This PR puts this logic in `CreateHiveTableAsSelectCommand`; if we add a new rule, we can merge it with the no-CTAS logic for hive tables.
    
    In summary, to ensure the schema order in the catalog is what we expect, we need to add a new rule for hive tables. This test branch implements the new rule: https://github.com/windpiger/spark/commit/acca991d3d92116ce3a88918b3798d14d32849f8#diff-73bd90660f41c12a87ee9fe8d35d856aR463
    
    But before implementing this new rule, we should first merge PR #16642, so that we get a `tableDesc` with a non-empty schema, which we can then use here: https://github.com/windpiger/spark/commit/acca991d3d92116ce3a88918b3798d14d32849f8#diff-73bd90660f41c12a87ee9fe8d35d856aR470
    
    @cloud-fan @lins05 is this ok?
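
The ordering rule summarized in this comment (data columns keep their declared order, partition columns move to the end in `partitioned by` order) can be sketched in plain Scala. This is only an illustration under stated assumptions: `Column`, `SchemaOrderSketch`, and `catalogOrder` are hypothetical names, not Spark APIs, and the real logic lives in the Spark classes linked in the comment.

```scala
// Hypothetical model of a table column; this is not a Spark class.
final case class Column(name: String, dataType: String)

object SchemaOrderSketch {
  /** Data columns keep their declared order; partition columns follow,
    * in the order given by the `partitioned by` clause. */
  def catalogOrder(declared: Seq[Column], partitionBy: Seq[String]): Seq[Column] = {
    val byName = declared.map(c => c.name -> c).toMap
    // Map.apply throws NoSuchElementException for an unknown partition column.
    val partitionCols = partitionBy.map(byName)
    val dataCols = declared.filterNot(c => partitionBy.contains(c.name))
    dataCols ++ partitionCols
  }

  def main(args: Array[String]): Unit = {
    // Case 1 (no CTAS): create table t(a, b, c, d) ... partitioned by (d, c)
    val noCtas = Seq(Column("a", "int"), Column("b", "int"),
      Column("c", "string"), Column("d", "string"))
    println(catalogOrder(noCtas, Seq("d", "c")).map(_.name).mkString(", ")) // a, b, d, c

    // Case 2 (CTAS): select 1 as b, 2 as a, 'x' as c, 'y' as d
    val ctas = Seq(Column("b", "int"), Column("a", "int"),
      Column("c", "string"), Column("d", "string"))
    println(catalogOrder(ctas, Seq("d", "c")).map(_.name).mkString(", ")) // b, a, d, c
  }
}
```

Both cases use the same function, which mirrors the point that a single rule could cover the CTAS and non-CTAS paths.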



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96218307
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
    @@ -183,9 +183,11 @@ case class CatalogTable(
     
       import CatalogTable._
     
    -  /** schema of this table's partition columns */
    -  def partitionSchema: StructType = StructType(schema.filter {
    -    c => partitionColumnNames.contains(c.name)
    +  /** schema of this table's partition columns
    +   * keep the schema order with partitionColumnNames
    +   */
    +  def partitionSchema: StructType = StructType(partitionColumnNames.flatMap {
    +    p => schema.filter(_.name == p)
    --- End diff --
    
    Use `partitionColumnNames.map(c => schema.find(_.name == c).getOrElse(throw ...))`; each column name in `partitionColumnNames` must match exactly one field in the schema.
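
The difference between the `flatMap`/`filter` form in the diff and the `map`/`find`/`getOrElse(throw ...)` form suggested here can be shown with a small plain-Scala sketch. `Field` and `PartitionSchemaSketch` are hypothetical stand-ins for Spark's `StructField` and `CatalogTable`, and `NoSuchElementException` is a placeholder, since the review comment leaves the exception elided.

```scala
// Hypothetical stand-in for a schema field; this is not Spark's StructField.
final case class Field(name: String, dataType: String)

object PartitionSchemaSketch {
  // Lenient variant, shaped like the flatMap/filter version in the diff:
  // an unknown partition column name silently disappears from the result.
  def lenient(schema: Seq[Field], partitionColumnNames: Seq[String]): Seq[Field] =
    partitionColumnNames.flatMap(p => schema.filter(_.name == p))

  // Strict variant, shaped like the suggestion: an unknown name fails fast.
  // Note that find returns the first match and does not itself check uniqueness.
  def strict(schema: Seq[Field], partitionColumnNames: Seq[String]): Seq[Field] =
    partitionColumnNames.map { c =>
      schema.find(_.name == c).getOrElse(
        throw new NoSuchElementException(s"Partition column [$c] does not exist in schema"))
    }
}
```

Both variants return the fields in `partitionColumnNames` order; they differ only in how a name with no matching field is handled.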



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r97017743
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -45,6 +46,18 @@ case class CreateHiveTableAsSelectCommand(
       override def innerChildren: Seq[LogicalPlan] = Seq(query)
     
       override def run(sparkSession: SparkSession): Seq[Row] = {
    +    // when create a partitioned table, we should reorder the columns
    +    // to put the partition columns at the end
    +    val partitionAttrs = tableDesc.partitionColumnNames.map { p =>
    +      query.output.find(_.name == p).getOrElse(
    +        new AnalysisException(s"Partition column[$p] does not exist " +
    +          s"in query output partition").asInstanceOf[NamedExpression]
    +      )
    +    }
    +    val partitionSet = AttributeSet(partitionAttrs)
    +    val dataAttrs = query.output.filterNot(partitionSet.contains)
    +    val reorderedOutputQuery = Project(dataAttrs ++ partitionAttrs, query)
    --- End diff --
    
    we can revert this after https://github.com/apache/spark/pull/16655
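
One detail in the quoted diff may be worth a note: the fallback passed to `getOrElse` constructs an `AnalysisException` and casts it with `asInstanceOf[NamedExpression]` instead of throwing it. On the JVM a cast between unrelated types fails with a `ClassCastException`, so a missing partition column would likely not surface the intended message. A minimal plain-Scala sketch of the difference follows; `Named`, `Attr`, `MissingColumn`, and `GetOrElseSketch` are hypothetical stand-ins, not Spark classes.

```scala
import scala.util.Try

// Hypothetical stand-ins for NamedExpression, Attribute and AnalysisException.
trait Named { def name: String }
final case class Attr(name: String) extends Named
class MissingColumn(msg: String) extends Exception(msg)

object GetOrElseSketch {
  // Shaped like the quoted diff: the exception is built and cast, never thrown.
  def lookupWithCast(output: Seq[Attr], p: String): Named =
    output.find(_.name == p).getOrElse(
      new MissingColumn(s"Partition column[$p] does not exist").asInstanceOf[Named])

  // Throwing variant: the caller sees the descriptive exception.
  def lookupWithThrow(output: Seq[Attr], p: String): Named =
    output.find(_.name == p).getOrElse(
      throw new MissingColumn(s"Partition column[$p] does not exist"))

  def main(args: Array[String]): Unit = {
    val output = Seq(Attr("i"), Attr("j"))
    // The cast variant fails with ClassCastException for a missing column.
    println(Try(lookupWithCast(output, "k")).failed.get.isInstanceOf[ClassCastException]) // true
    // The throw variant fails with the intended exception type.
    println(Try(lookupWithThrow(output, "k")).failed.get.isInstanceOf[MissingColumn]) // true
  }
}
```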



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71445/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test FAILed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71664/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71702/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71702/testReport)** for PR 16593 at commit [`21f113a`](https://github.com/apache/spark/commit/21f113a85ae2df46c93dd57384a01955f394188b).



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/16593



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by lins05 <gi...@git.apache.org>.

Github user lins05 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96349144
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
    @@ -183,9 +183,15 @@ case class CatalogTable(
     
       import CatalogTable._
     
    -  /** schema of this table's partition columns */
    -  def partitionSchema: StructType = StructType(schema.filter {
    -    c => partitionColumnNames.contains(c.name)
    +  /**
    +   * schema of this table's partition columns
    +   * keep the schema order with partitionColumnNames
    +   */
    +  def partitionSchema: StructType = StructType(partitionColumnNames.map {
    +    p => schema.find(_.name == p).getOrElse(
    +      throw new AnalysisException(s"Partition column [$p] " +
    +        s"did not exist in schema ${schema.toString}")
    --- End diff --
    
    "did not exist" -> "does not exist"



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71594 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71594/testReport)** for PR 16593 at commit [`a656474`](https://github.com/apache/spark/commit/a656474521fa3976af6e954e8c2d00243b282634).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71702 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71702/testReport)** for PR 16593 at commit [`21f113a`](https://github.com/apache/spark/commit/21f113a85ae2df46c93dd57384a01955f394188b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96785016
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1361,6 +1355,38 @@ class HiveDDLSuite
         }
       }
     
    +  test("create hive serde table as select with DataFrameWriter.saveAsTable with partitionBy") {
    +    withTable("t", "t1") {
    +      withSQLConf("hive.exec.dynamic.partition.mode" -> "nonstrict") {
    +        Seq(10 -> "y").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t")
    +        checkAnswer(spark.table("t"), Row("y", 10) :: Nil)
    +        var table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        var partitionSchema = table.partitionSchema
    +        assert(partitionSchema.size == 1 && partitionSchema.fields(0).name == "i" &&
    +          partitionSchema.fields(0).dataType == IntegerType)
    +
    +        Seq(11 -> "z").toDF("i", "j").write.mode("overwrite").format("hive")
    +          .partitionBy("j").saveAsTable("t")
    +        checkAnswer(spark.table("t"), Row(11, "z") :: Nil)
    +        table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        partitionSchema = table.partitionSchema
    +        assert(partitionSchema.size == 1 && partitionSchema.fields(0).name == "j" &&
    +          partitionSchema.fields(0).dataType == StringType)
    +
    +        Seq((1, 2, 3)).toDF("i", "j", "k").write.mode("overwrite").format("hive")
    +          .partitionBy("k", "j").saveAsTable("t")
    +        checkAnswer(spark.table("t"), Row(1, 3, 2) :: Nil)
    +
    +        Seq((1, 2, 3)).toDF("i", "j", "k").write.mode("overwrite").format("hive")
    --- End diff --
    
    I think we don't need to test `overwrite` behavior so many times; just create a table with `Seq(10 -> "y").toDF("i", "j").write.partitionBy("i")` and overwrite it with `Seq((1, 2, 3)).toDF("i", "j", "k").write.partitionBy("j", "k")`.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96849182
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1384,4 +1394,96 @@ class HiveDDLSuite
           assert(e2.message.contains("Hive data source can only be used with tables"))
         }
       }
    +
    +  test("the columns order in catalog should respect the order when create table") {
    +    withTable("t", "t1", "t2", "t3") {
    +      val structType = Seq(("a", IntegerType), ("b", IntegerType),
    +        ("c", StringType), ("d", StringType))
    +      val partStructType = Seq(("c", StringType), ("d", StringType))
    +
    +      sql(
    +        """CREATE TABLE IF NOT EXISTS t(a int, b int, c string, d string)
    +          | using parquet
    +          | partitioned by (c, d)""".stripMargin)
    +      var table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +      assert(table.schema.map(s => (s.name, s.dataType)) == structType)
    +      assert(table.partitionSchema.map(s => (s.name, s.dataType)) == partStructType)
    +
    +      val structType1 = Seq(("b", IntegerType), ("a", IntegerType),
    +        ("d", StringType), ("c", StringType))
    +      val partStructType1 = Seq(("d", StringType), ("c", StringType))
    +
    +      sql(
    +        """CREATE TABLE IF NOT EXISTS t1(b int, a int, c string, d string)
    --- End diff --
    
    Let's not add new tests for the legacy hive syntax anymore; please use `create table xxx using hive`.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71660/
    Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96218522
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala ---
    @@ -95,8 +95,15 @@ private[hive] case class MetastoreRelation(
         val (partCols, schema) = catalogTable.schema.map(toHiveColumn).partition { c =>
           catalogTable.partitionColumnNames.contains(c.getName)
         }
    +
    +    // keep the schema order with catalogTable.partitionColumnNames
    +    val reorderPartCols = catalogTable.partitionColumnNames.flatMap {
    --- End diff --
    
    can you move this logic to `HiveClientImpl`?



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71660 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71660/testReport)** for PR 16593 at commit [`4410de9`](https://github.com/apache/spark/commit/4410de978c15d8b54bad2427a1613610afce444b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by windpiger <gi...@git.apache.org>.

Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Do you mean this added test?
    https://github.com/apache/spark/pull/16593/files#diff-b7094baa12601424a5d19cb930e3402fR1385



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71603 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71603/testReport)** for PR 16593 at commit [`d15b947`](https://github.com/apache/spark/commit/d15b9470958b9a97e94376efb7d6c6e859f00648).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71603 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71603/testReport)** for PR 16593 at commit [`d15b947`](https://github.com/apache/spark/commit/d15b9470958b9a97e94376efb7d6c6e859f00648).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71634/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71701/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by windpiger <gi...@git.apache.org>.

Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    retest this please



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96336300
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -45,6 +46,25 @@ case class CreateHiveTableAsSelectCommand(
       override def innerChildren: Seq[LogicalPlan] = Seq(query)
     
       override def run(sparkSession: SparkSession): Seq[Row] = {
    +
    +    // relation should move partition columns to the last
    +    val (partOutputs, nonPartOutputs) = query.output.partition {
    +      a =>
    --- End diff --
    
    nit: code style
    ```
    xxx.map { p =>
      xxx
    }
    ```
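
    For illustration only, a minimal sketch of that layout applied to the reordering step in this diff. It uses plain strings in place of the query's output attributes; `ReorderSketch`, `movePartitionColumnsLast`, `output`, and `partitionColumnNames` are hypothetical stand-ins, not code from the PR.

    ```scala
    object ReorderSketch {
      // Hypothetical helper: place partition columns after the data columns,
      // keeping the original relative order inside each group.
      def movePartitionColumnsLast(
          output: Seq[String],
          partitionColumnNames: Seq[String]): Seq[String] = {
        // Lambda parameter stays on the same line as the opening brace.
        val (partOutputs, nonPartOutputs) = output.partition { a =>
          partitionColumnNames.contains(a)
        }
        nonPartOutputs ++ partOutputs
      }
    }
    ```

    With columns `i`, `j` and partition column `i`, the sketch yields `j`, `i`, which matches the row shape `Row("y", 10)` checked in the added test.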



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by lins05 <gi...@git.apache.org>.

Github user lins05 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96349044
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
    @@ -183,9 +183,15 @@ case class CatalogTable(
     
       import CatalogTable._
     
    -  /** schema of this table's partition columns */
    -  def partitionSchema: StructType = StructType(schema.filter {
    -    c => partitionColumnNames.contains(c.name)
    +  /**
    +   * schema of this table's partition columns
    +   * keep the schema order with partitionColumnNames
    --- End diff --
    
    "keep the schema order with partitionColumnNames because we always concatenate the partition columns to the schema when reading the table information from hive  metastore."



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71657/testReport)** for PR 16593 at commit [`082abd2`](https://github.com/apache/spark/commit/082abd232688537d0a41d67277593ee93e825496).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71626/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71767/testReport)** for PR 16593 at commit [`4c50f49`](https://github.com/apache/spark/commit/4c50f4938a9da360827092db0a8ab50c6c793c2f).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71421 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71421/testReport)** for PR 16593 at commit [`6c31d01`](https://github.com/apache/spark/commit/6c31d017324b3c7f310103d2d4b5138bbef4b463).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96784647
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1361,6 +1355,38 @@ class HiveDDLSuite
         }
       }
     
    +  test("create hive serde table as select with DataFrameWriter.saveAsTable with partitionBy") {
    +    withTable("t", "t1") {
    +      withSQLConf("hive.exec.dynamic.partition.mode" -> "nonstrict") {
    +        Seq(10 -> "y").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t")
    +        checkAnswer(spark.table("t"), Row("y", 10) :: Nil)
    +        var table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        var partitionSchema = table.partitionSchema
    --- End diff --
    
    Since the `partitionSchema` change is reverted, we don't need to test it here.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    please address https://github.com/apache/spark/pull/16593#discussion_r96610195



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71442 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71442/testReport)** for PR 16593 at commit [`046ead7`](https://github.com/apache/spark/commit/046ead7c87c76c0a17852f65f51f8a746ed638a9).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71445 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71445/testReport)** for PR 16593 at commit [`76f643a`](https://github.com/apache/spark/commit/76f643a507f39099a6539899597398c35341ad63).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71784 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71784/testReport)** for PR 16593 at commit [`7bdc265`](https://github.com/apache/spark/commit/7bdc265500cbfd6b4dc16ec6a6ce7c321e7dd3dc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71439/
    Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by lins05 <gi...@git.apache.org>.

Github user lins05 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96348515
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1343,17 +1343,41 @@ class HiveDDLSuite
           sql("INSERT INTO t SELECT 2, 'b'")
           checkAnswer(spark.table("t"), Row(9, "x") :: Row(2, "b") :: Nil)
     
    -      val e = intercept[AnalysisException] {
    -        Seq(1 -> "a").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t2")
    -      }
    -      assert(e.message.contains("A Create Table As Select (CTAS) statement is not allowed " +
    -        "to create a partitioned table using Hive"))
    -
           val e2 = intercept[AnalysisException] {
             Seq(1 -> "a").toDF("i", "j").write.format("hive").bucketBy(4, "i").saveAsTable("t2")
           }
           assert(e2.message.contains("Creating bucketed Hive serde table is not supported yet"))
     
    +      try {
    +        spark.sql("set hive.exec.dynamic.partition.mode=nonstrict")
    --- End diff --
    
    I think we can use `withSQLConf` instead of `try .. finally ..`.
    
    ```scala
    withSQLConf("hive.exec.dynamic.partition.mode" -> "nonstrict") {
    ...
    }
    ```
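
    As a rough sketch of why the helper is preferable: it follows the loan pattern, setting the values, running the body, and restoring the previous values even if the body throws. The version below is a standalone illustration over a hypothetical in-memory map (`ConfSketch.conf`, `withConf`); it is not Spark's `withSQLConf` implementation.

    ```scala
    import scala.collection.mutable

    object ConfSketch {
      // Hypothetical in-memory store standing in for the session's SQL conf.
      val conf: mutable.Map[String, String] = mutable.Map.empty

      // Set the given pairs, run the body, then restore the earlier state.
      def withConf[T](pairs: (String, String)*)(body: => T): T = {
        // Remember the previous value (or absence) of every key we touch.
        val previous = pairs.map { case (k, _) => k -> conf.get(k) }
        pairs.foreach { case (k, v) => conf(k) = v }
        try body finally {
          previous.foreach {
            case (k, Some(old)) => conf(k) = old
            case (k, None) => conf.remove(k)
          }
        }
      }
    }
    ```

    Because the restore step lives in one place, each test no longer needs its own `try .. finally ..` block.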



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71701 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71701/testReport)** for PR 16593 at commit [`acca991`](https://github.com/apache/spark/commit/acca991d3d92116ce3a88918b3798d14d32849f8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class ReorderHivePartitionedTableSchema(sparkSession: SparkSession)`



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71632/testReport)** for PR 16593 at commit [`9270851`](https://github.com/apache/spark/commit/9270851f0b358c30a14f0f63eded25b68b38b102).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71593 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71593/testReport)** for PR 16593 at commit [`f40adbe`](https://github.com/apache/spark/commit/f40adbeec638884cdff4f6324880729b5ff9e790).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71655/
    Test FAILed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96788653
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -87,8 +101,8 @@ case class CreateHiveTableAsSelectCommand(
           }
         } else {
           try {
    -        sparkSession.sessionState.executePlan(InsertIntoTable(
    -          metastoreRelation, Map(), query, overwrite = true, ifNotExists = false)).toRdd
    +        sparkSession.sessionState.executePlan(InsertIntoTable(metastoreRelation,
    --- End diff --
    
    Oh, we should be fine here: the table is created with `reorderedOutputQuery.schema`, so there won't be any type difference.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96697784
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -45,6 +46,19 @@ case class CreateHiveTableAsSelectCommand(
       override def innerChildren: Seq[LogicalPlan] = Seq(query)
     
       override def run(sparkSession: SparkSession): Seq[Row] = {
    +    // the CTAS's SELECT partition-outputs order should be consistent with
    +    // tableDesc.partitionColumnNames
    +    val partitionAttrs = tableDesc.partitionColumnNames.map {
    +            p =>
    --- End diff --
    
    Could you please follow the coding style in https://github.com/databricks/scala-style-guide#spacing-and-indentation? Thanks!



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71593 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71593/testReport)** for PR 16593 at commit [`f40adbe`](https://github.com/apache/spark/commit/f40adbeec638884cdff4f6324880729b5ff9e790).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96224962
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
    @@ -183,9 +183,11 @@ case class CatalogTable(
     
       import CatalogTable._
     
    -  /** schema of this table's partition columns */
    -  def partitionSchema: StructType = StructType(schema.filter {
    -    c => partitionColumnNames.contains(c.name)
    +  /** schema of this table's partition columns
    +   * keep the schema order with partitionColumnNames
    +   */
    +  def partitionSchema: StructType = StructType(partitionColumnNames.flatMap {
    +    p => schema.filter(_.name == p)
    --- End diff --
    
    I think this is a bug. Can you send a separate PR for it? I'd like to backport the fix to 2.1.
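
To illustrate the ordering problem in the diff above, here is a minimal sketch that uses plain Scala collections instead of Spark's `StructType`. The names `Field`, `partitionSchemaOld`, and `partitionSchemaNew` are hypothetical stand-ins, and the example schema is made up for illustration; only the two expressions mirror the removed and added lines of the diff.

```scala
// Hypothetical stand-in for a schema field; this is not Spark's StructField.
final case class Field(name: String, dataType: String)

object PartitionSchemaOrder {
  // Mirrors the removed lines: filtering the schema keeps the schema's own order.
  def partitionSchemaOld(schema: Seq[Field], partitionColumnNames: Seq[String]): Seq[Field] =
    schema.filter(c => partitionColumnNames.contains(c.name))

  // Mirrors the added lines: iterating over partitionColumnNames keeps that order.
  def partitionSchemaNew(schema: Seq[Field], partitionColumnNames: Seq[String]): Seq[Field] =
    partitionColumnNames.flatMap(p => schema.filter(_.name == p))

  def main(args: Array[String]): Unit = {
    // Assumed example: the schema lists b before d, the partition spec lists d before b.
    val schema = Seq(Field("a", "int"), Field("b", "int"), Field("c", "int"), Field("d", "int"))
    val partitionColumnNames = Seq("d", "b")

    println(partitionSchemaOld(schema, partitionColumnNames).map(_.name)) // List(b, d)
    println(partitionSchemaNew(schema, partitionColumnNames).map(_.name)) // List(d, b)
  }
}
```

The two versions only differ when the schema order and the `partitionColumnNames` order disagree, which is why the problem is easy to miss.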



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71439/testReport)** for PR 16593 at commit [`7c09a7c`](https://github.com/apache/spark/commit/7c09a7ca1b948368cf67505e8bd19d0ae6e6142b).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71767 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71767/testReport)** for PR 16593 at commit [`4c50f49`](https://github.com/apache/spark/commit/4c50f4938a9da360827092db0a8ab50c6c793c2f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71594 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71594/testReport)** for PR 16593 at commit [`a656474`](https://github.com/apache/spark/commit/a656474521fa3976af6e954e8c2d00243b282634).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71634 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71634/testReport)** for PR 16593 at commit [`14aed85`](https://github.com/apache/spark/commit/14aed85b6b3b083b8a4fdb3a3cab65f1eebc8729).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71656/
    Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by windpiger <gi...@git.apache.org>.

Github user windpiger commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96244312
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala ---
    @@ -95,8 +95,15 @@ private[hive] case class MetastoreRelation(
         val (partCols, schema) = catalogTable.schema.map(toHiveColumn).partition { c =>
           catalogTable.partitionColumnNames.contains(c.getName)
         }
    +
    +    // keep the schema order with catalogTable.partitionColumnNames
    +    val reorderPartCols = catalogTable.partitionColumnNames.flatMap {
    --- End diff --
    
    The create-table path already reorders the partition columns:
    https://github.com/windpiger/spark/blob/4612a52e7e1adb7ce41fe8329ad67faa11285550/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala#L87
    
    so the reordering here is redundant and we can remove it.
    




[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71655 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71655/testReport)** for PR 16593 at commit [`7d38fae`](https://github.com/apache/spark/commit/7d38fae1dffabba0af247f6cdbdbd3ca642c80d3).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71657/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71634/testReport)** for PR 16593 at commit [`14aed85`](https://github.com/apache/spark/commit/14aed85b6b3b083b8a4fdb3a3cab65f1eebc8729).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    @windpiger Could you do me a favor and add a dedicated test case in this PR?
    - Create a partitioned Hive table
    - Create a partitioned data source table
    - Create a partitioned Hive table as SELECT
    - Create a partitioned data source table as SELECT
    
    I want to see whether all of them follow the same rule:
    - data columns + partition columns
    - the order of data columns is based on the user-specified order in either the schema (CT) or the query (CTAS)
    - the order of partition columns is based on the order of columns specified in the `PARTITIONED BY` clause
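
The rule being asked about can be sketched with plain Scala collections. This is only an illustration of the expected column order, not Spark code: `expectedColumnOrder` is a hypothetical helper, and it assumes exact (case-sensitive) name matching.

```scala
object ColumnOrderRule {
  // Hypothetical helper: data columns in user-specified order,
  // followed by partition columns in PARTITIONED BY order.
  def expectedColumnOrder(userColumns: Seq[String], partitionedBy: Seq[String]): Seq[String] = {
    val missing = partitionedBy.filterNot(userColumns.contains)
    require(missing.isEmpty, s"partition columns not found: ${missing.mkString(", ")}")
    val dataColumns = userColumns.filterNot(partitionedBy.contains)
    dataColumns ++ partitionedBy
  }

  def main(args: Array[String]): Unit = {
    // Same shape as a table declared with columns (a, b, c, d) and PARTITIONED BY (d, b).
    println(expectedColumnOrder(Seq("a", "b", "c", "d"), Seq("d", "b"))) // List(a, c, d, b)
  }
}
```

If all four creation paths follow the rule, each of them should report its columns in the order this helper returns for the same inputs.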




[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71701/testReport)** for PR 16593 at commit [`acca991`](https://github.com/apache/spark/commit/acca991d3d92116ce3a88918b3798d14d32849f8).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71767/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71657/testReport)** for PR 16593 at commit [`082abd2`](https://github.com/apache/spark/commit/082abd232688537d0a41d67277593ee93e825496).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    can you also update the test name?



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96783111
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -87,8 +101,8 @@ case class CreateHiveTableAsSelectCommand(
           }
         } else {
           try {
    -        sparkSession.sessionState.executePlan(InsertIntoTable(
    -          metastoreRelation, Map(), query, overwrite = true, ifNotExists = false)).toRdd
    +        sparkSession.sessionState.executePlan(InsertIntoTable(metastoreRelation,
    --- End diff --
    
    Here we only reorder the columns of the query plan by name and construct an `InsertIntoTable` plan; other rules will take care of `InsertIntoTable` and do the cast. But yes, we should add a test.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71602/testReport)** for PR 16593 at commit [`0b0e64b`](https://github.com/apache/spark/commit/0b0e64bc639e632ea4c4a373a64b3a8e47956747).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96609364
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -45,6 +46,24 @@ case class CreateHiveTableAsSelectCommand(
       override def innerChildren: Seq[LogicalPlan] = Seq(query)
     
       override def run(sparkSession: SparkSession): Seq[Row] = {
    +
    +    // relation should move partition columns to the last
    +    val (partOutputs, nonPartOutputs) = query.output.partition { a =>
    +        tableDesc.partitionColumnNames.contains(a.name)
    +    }
    +
    +    // the CTAS's SELECT partition-outputs order should be consistent with
    +    // tableDesc.partitionColumnNames
    +    val reorderedPartOutputs = tableDesc.partitionColumnNames.map {
    --- End diff --
    
    since we need to reorder the partition columns anyway, how about
    ```
    val partitionAttrs = tableDesc.partitionColumnNames.map { c =>
      query.output.find(_.name == c).getOrElse(throw ...)
    }
    val partitionSet = AttributeSet(partitionAttrs)
    val dataAttrs = query.output.filterNot(partitionSet.contains)
    val reorderOutputQuery = Project(dataAttrs ++ partitionAttrs, query)
    ```



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r97201665
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1411,7 +1421,10 @@ class HiveDDLSuite
           sql("CREATE TABLE t4(a int, b int, c int, d int) USING hive PARTITIONED BY (d, b)")
           assert(getTableColumns("t4") == Seq("a", "c", "d", "b"))
     
    -      // TODO: add test for creating partitioned hive serde table as select, once we support it.
    +      withSQLConf(("hive.exec.dynamic.partition.mode", "nonstrict")) {
    --- End diff --
    
    nit: `"hive.exec.dynamic.partition.mode" -> "nonstrict"`
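
The two spellings in this nit build the same value: in Scala, `a -> b` produces the tuple `(a, b)`. A small sketch, assuming the test helper accepts a varargs list of `(String, String)` pairs; `collectConf` is a hypothetical stand-in with that shape, not the real helper.

```scala
object ArrowPairs {
  // Hypothetical stand-in that takes configuration entries as varargs pairs.
  def collectConf(pairs: (String, String)*): Map[String, String] = pairs.toMap

  def main(args: Array[String]): Unit = {
    val viaTuple = collectConf(("hive.exec.dynamic.partition.mode", "nonstrict"))
    val viaArrow = collectConf("hive.exec.dynamic.partition.mode" -> "nonstrict")
    println(viaTuple == viaArrow) // true
  }
}
```

The arrow form is purely a readability preference; it avoids the doubled parentheses at the call site.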



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71602/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71594/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71602 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71602/testReport)** for PR 16593 at commit [`0b0e64b`](https://github.com/apache/spark/commit/0b0e64bc639e632ea4c4a373a64b3a8e47956747).



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96785107
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1361,6 +1355,38 @@ class HiveDDLSuite
         }
       }
     
    +  test("create hive serde table as select with DataFrameWriter.saveAsTable with partitionBy") {
    +    withTable("t", "t1") {
    +      withSQLConf("hive.exec.dynamic.partition.mode" -> "nonstrict") {
    +        Seq(10 -> "y").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t")
    +        checkAnswer(spark.table("t"), Row("y", 10) :: Nil)
    +        var table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        var partitionSchema = table.partitionSchema
    +        assert(partitionSchema.size == 1 && partitionSchema.fields(0).name == "i" &&
    +          partitionSchema.fields(0).dataType == IntegerType)
    +
    +        Seq(11 -> "z").toDF("i", "j").write.mode("overwrite").format("hive")
    +          .partitionBy("j").saveAsTable("t")
    +        checkAnswer(spark.table("t"), Row(11, "z") :: Nil)
    +        table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        partitionSchema = table.partitionSchema
    +        assert(partitionSchema.size == 1 && partitionSchema.fields(0).name == "j" &&
    +          partitionSchema.fields(0).dataType == StringType)
    +
    +        Seq((1, 2, 3)).toDF("i", "j", "k").write.mode("overwrite").format("hive")
    +          .partitionBy("k", "j").saveAsTable("t")
    +        checkAnswer(spark.table("t"), Row(1, 3, 2) :: Nil)
    +
    +        Seq((1, 2, 3)).toDF("i", "j", "k").write.mode("overwrite").format("hive")
    +          .partitionBy("j", "k").saveAsTable("t")
    +        checkAnswer(spark.table("t"), Row(1, 2, 3) :: Nil)
    +
    +        spark.sql("create table t1 as select * from t")
    --- End diff --
    
    You are not creating a partitioned table here; it should be `create table t1 partitioned by (i) as select 1 as i, 'a' as j`



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96788538
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1361,6 +1355,22 @@ class HiveDDLSuite
         }
       }
     
    +  test("create hive serde table as select") {
    --- End diff --
    
    `create partitioned hive serde table as select`



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96805975
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -87,8 +101,8 @@ case class CreateHiveTableAsSelectCommand(
           }
         } else {
           try {
    -        sparkSession.sessionState.executePlan(InsertIntoTable(
    -          metastoreRelation, Map(), query, overwrite = true, ifNotExists = false)).toRdd
    +        sparkSession.sessionState.executePlan(InsertIntoTable(metastoreRelation,
    --- End diff --
    
    Yeah. Agree



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by lins05 <gi...@git.apache.org>.

Github user lins05 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96348696
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -88,7 +108,9 @@ case class CreateHiveTableAsSelectCommand(
         } else {
           try {
             sparkSession.sessionState.executePlan(InsertIntoTable(
    -          metastoreRelation, Map(), query, overwrite = true, ifNotExists = false)).toRdd
    +        metastoreRelation, Map(), reorderOutputQuery, overwrite = true
    +          , ifNotExists = false))
    --- End diff --
    
    nit: The comma should be in the line above (after `overwrite = true`). Actually I think we can put all the args to `InsertIntoTable` in the same line.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71450/testReport)** for PR 16593 at commit [`4612a52`](https://github.com/apache/spark/commit/4612a52e7e1adb7ce41fe8329ad67faa11285550).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71626/testReport)** for PR 16593 at commit [`2da6c7e`](https://github.com/apache/spark/commit/2da6c7e91165d2c130f4cae82e2bfac9e54adf1c).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by lins05 <gi...@git.apache.org>.

Github user lins05 commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    I just found that "create table using hive" (without "select ... from", i.e. the non-CTAS form) is handled by `CreateTableCommand` ([source](https://github.com/apache/spark/blob/bcc510b021/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L411-L413)). But here the column reordering is only done in `CreateHiveTableAsSelectCommand`, which means the former would still suffer from the problem.
    
    What about introducing a new analyzer rule that reorders the columns when creating a partitioned hive table, so that both cases are covered?
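
The reordering discussed above can be sketched outside Spark as a small pure function. This is only an illustration under stated assumptions: columns are modelled as plain strings rather than Catalyst attributes, and `ReorderColumnsSketch` and `reorder` are hypothetical names, not part of the PR. The expected order for `PARTITIONED BY (d, b)` matches the `t4` assertion in `HiveDDLSuite` quoted elsewhere in this thread.

```scala
object ReorderColumnsSketch {
  // Hypothetical helper: put non-partition columns first (in their original
  // order), then the partition columns in the order they were declared.
  def reorder(columns: Seq[String], partCols: Seq[String]): Seq[String] = {
    val dataCols = columns.filterNot(c => partCols.contains(c))
    // Keep only declared partition columns that actually exist in the input.
    val partColsPresent = partCols.filter(p => columns.contains(p))
    dataCols ++ partColsPresent
  }
}
```

A single shared helper of this shape is one way both the CTAS and the non-CTAS path could apply the same ordering; whether that belongs in an analyzer rule or in each command is the design question raised in the comment.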



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71656/testReport)** for PR 16593 at commit [`2f04952`](https://github.com/apache/spark/commit/2f04952ff1c69720d56bf0b0cbcae21020f35d4f).



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96218915
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -45,6 +45,17 @@ case class CreateHiveTableAsSelectCommand(
       override def innerChildren: Seq[LogicalPlan] = Seq(query)
     
       override def run(sparkSession: SparkSession): Seq[Row] = {
    +
    +    val oriQueryOutput = query.output
    +    val notPartitionOutputs = oriQueryOutput
    +      .filterNot(p => tableDesc.partitionColumnNames.exists(_ == p.name))
    +    val partitionOutputs = tableDesc.partitionColumnNames.flatMap {
    --- End diff --
    
    nit:
    ```
    val (partOutputs, nonPartOutputs) = query.output.partition { a =>
      tableDesc.partitionColumnNames.contains(a.name)
    }
    ```
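
One point worth keeping in mind with the `partition` suggestion: `Seq.partition` keeps the relative order of the input, so the partition half comes back in query-output order, not in the order of `partitionColumnNames`. The sketch below models columns as plain strings (an assumption; the real code works on Catalyst attributes) and uses hypothetical names to show the difference for `partitionBy("k", "j")`, the case exercised in the test quoted in this thread.

```scala
object PartitionSplitSketch {
  // Split column names into (partition columns, other columns).
  // Both halves keep the order in which the columns appear in `output`.
  def splitByQueryOrder(
      output: Seq[String],
      partCols: Seq[String]): (Seq[String], Seq[String]) =
    output.partition(c => partCols.contains(c))

  // Partition columns in the order they were declared, looked up by name.
  def partitionColsInDeclaredOrder(
      output: Seq[String],
      partCols: Seq[String]): Seq[String] =
    partCols.flatMap(p => output.find(_ == p))
}
```

If the declared order must win, a by-name lookup like the second method would still be needed after the split.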



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71660/testReport)** for PR 16593 at commit [`4410de9`](https://github.com/apache/spark/commit/4410de978c15d8b54bad2427a1613610afce444b).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by windpiger <gi...@git.apache.org>.

Github user windpiger commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96243520
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
    @@ -183,9 +183,11 @@ case class CatalogTable(
     
       import CatalogTable._
     
    -  /** schema of this table's partition columns */
    -  def partitionSchema: StructType = StructType(schema.filter {
    -    c => partitionColumnNames.contains(c.name)
    +  /** schema of this table's partition columns
    +   * keep the schema order with partitionColumnNames
    +   */
    +  def partitionSchema: StructType = StructType(partitionColumnNames.flatMap {
    +    p => schema.filter(_.name == p)
    --- End diff --
    
    OK, I will create a new JIRA and send a PR.
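
The `partitionSchema` change in the diff above replaces a filter over the table schema with a lookup driven by `partitionColumnNames`. A minimal sketch of the difference, assuming a simplified `Field` case class in place of Spark's `StructField` (all names here are hypothetical):

```scala
object PartitionSchemaOrderSketch {
  // Simplified stand-in for a schema field; not Spark's StructField.
  final case class Field(name: String, dataType: String)

  // Behaviour of the removed lines: result follows the schema's order.
  def bySchemaOrder(schema: Seq[Field], partCols: Seq[String]): Seq[Field] =
    schema.filter(f => partCols.contains(f.name))

  // Behaviour of the added lines: result follows partitionColumnNames.
  def byDeclaredOrder(schema: Seq[Field], partCols: Seq[String]): Seq[Field] =
    partCols.flatMap(p => schema.filter(_.name == p))
}
```

The two only disagree when the schema does not already list the partition columns in declared order, which is presumably why the change was split out into its own JIRA.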



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71664 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71664/testReport)** for PR 16593 at commit [`21f113a`](https://github.com/apache/spark/commit/21f113a85ae2df46c93dd57384a01955f394188b).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71442/
    Test PASSed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r97201669
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1411,7 +1421,10 @@ class HiveDDLSuite
           sql("CREATE TABLE t4(a int, b int, c int, d int) USING hive PARTITIONED BY (d, b)")
           assert(getTableColumns("t4") == Seq("a", "c", "d", "b"))
     
    -      // TODO: add test for creating partitioned hive serde table as select, once we support it.
    +      withSQLConf(("hive.exec.dynamic.partition.mode", "nonstrict")) {
    +        sql("CREATE TABLE t5 USING hive PARTITIONED BY (d, b) AS SELECT 1 a, 1 b, 1 c, 1 d")
    --- End diff --
    
    please also test `DataFrameWriter`



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71664 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71664/testReport)** for PR 16593 at commit [`21f113a`](https://github.com/apache/spark/commit/21f113a85ae2df46c93dd57384a01955f394188b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96783342
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -45,6 +46,18 @@ case class CreateHiveTableAsSelectCommand(
       override def innerChildren: Seq[LogicalPlan] = Seq(query)
     
       override def run(sparkSession: SparkSession): Seq[Row] = {
    +    // the CTAS's SELECT partition-outputs order should be consistent with
    +    // tableDesc.partitionColumnNames
    --- End diff --
    
    `When creating a partitioned table, we should reorder the columns to put the partition columns at the end.`



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71450 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71450/testReport)** for PR 16593 at commit [`4612a52`](https://github.com/apache/spark/commit/4612a52e7e1adb7ce41fe8329ad67faa11285550).



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71655/testReport)** for PR 16593 at commit [`7d38fae`](https://github.com/apache/spark/commit/7d38fae1dffabba0af247f6cdbdbd3ca642c80d3).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71421 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71421/testReport)** for PR 16593 at commit [`6c31d01`](https://github.com/apache/spark/commit/6c31d017324b3c7f310103d2d4b5138bbef4b463).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71632/
    Test FAILed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71450/
    Test FAILed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71626/testReport)** for PR 16593 at commit [`2da6c7e`](https://github.com/apache/spark/commit/2da6c7e91165d2c130f4cae82e2bfac9e54adf1c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96609532
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -64,7 +83,7 @@ case class CreateHiveTableAsSelectCommand(
           val withSchema = if (withFormat.schema.isEmpty) {
             // Hive doesn't support specifying the column list for target table in CTAS
             // However we don't think SparkSQL should follow that.
    -        tableDesc.copy(schema = query.output.toStructType)
    +        tableDesc.copy(schema = reorderOutputQuery.output.toStructType)
    --- End diff --
    
    nit: `reorderOutputQuery.schema`



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71423 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71423/testReport)** for PR 16593 at commit [`7c09a7c`](https://github.com/apache/spark/commit/7c09a7ca1b948368cf67505e8bd19d0ae6e6142b).



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96610195
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1361,6 +1355,37 @@ class HiveDDLSuite
         }
       }
     
    +  test("create hive serde table with DataFrameWriter.saveAsTable with partitionBy") {
    --- End diff --
    
    nit: `create partitioned hive serde table as select`, and we should test both `DataFrameWriter.saveAsTable` and `CREATE TABLE ... PARTITIONED BY (...) AS SELECT`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71593/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71656/testReport)** for PR 16593 at commit [`2f04952`](https://github.com/apache/spark/commit/2f04952ff1c69720d56bf0b0cbcae21020f35d4f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by lins05 <gi...@git.apache.org>.

Github user lins05 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96774234
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -87,8 +101,8 @@ case class CreateHiveTableAsSelectCommand(
           }
         } else {
           try {
    -        sparkSession.sessionState.executePlan(InsertIntoTable(
    -          metastoreRelation, Map(), query, overwrite = true, ifNotExists = false)).toRdd
    +        sparkSession.sessionState.executePlan(InsertIntoTable(metastoreRelation,
    --- End diff --
    
    IIUC the partition syntax doesn't include column types, e.g. ```create table t2 using hive partitioned by (c1, c2) as select * from t1```. If one specifies `partitioned by (c1 string, c2 int)`, the parser would raise an error, because the parser grammar defines it like this:
    
    ```g4
            createTableHeader ...
            (PARTITIONED BY partitionColumnNames=identifierList)?
            ... #createTable
    
    identifierList
        : '(' identifierSeq ')'
        ;
    
    identifierSeq
        : identifier (',' identifier)*
        ;
    
    ```
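
A rough way to see the effect of the grammar above is a small stand-alone check, sketched here in plain Scala. It is not Spark's parser: `PartitionClauseSketch`, `parseIdentifierList` and the simplified `Identifier` pattern are hypothetical names introduced only for illustration, and the real `identifier` rule accepts more forms (for example quoted names).

```scala
object PartitionClauseSketch {
  // Assumption: a simplified stand-in for the grammar's `identifier` rule.
  private val Identifier = "[A-Za-z_][A-Za-z0-9_]*".r

  /** Returns the column names if `clause` has the shape "(c1, c2)", else None. */
  def parseIdentifierList(clause: String): Option[List[String]] = {
    val trimmed = clause.trim
    if (!trimmed.startsWith("(") || !trimmed.endsWith(")")) None
    else {
      // Strip the parentheses and split on commas, keeping empty items so they fail the check.
      val items = trimmed.substring(1, trimmed.length - 1).split(",", -1).map(_.trim).toList
      // Every item must be a bare identifier; "c1 string" is rejected because of the type.
      if (items.forall(item => Identifier.pattern.matcher(item).matches())) Some(items) else None
    }
  }
}
```

Under these assumptions, `parseIdentifierList("(c1, c2)")` yields the two names, while `parseIdentifierList("(c1 string, c2 int)")` yields `None`, mirroring the parse error described above.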




[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96218041
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
    @@ -183,9 +183,11 @@ case class CatalogTable(
     
       import CatalogTable._
     
    -  /** schema of this table's partition columns */
    -  def partitionSchema: StructType = StructType(schema.filter {
    -    c => partitionColumnNames.contains(c.name)
    +  /** schema of this table's partition columns
    +   * keep the schema order with partitionColumnNames
    --- End diff --
    
    nit: style
    ```
    /**
     * comments...
     */
    ```
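
The doc comment in the diff says the partition schema should follow the order of `partitionColumnNames`. The replacement code is not shown in this excerpt, so the following is only a sketch of the difference, in plain Scala with a hypothetical `Field` case class standing in for Spark's schema fields: filtering the schema keeps schema order, while looking up each partition column name keeps the declared partition order.

```scala
object PartitionSchemaOrderSketch {
  // Hypothetical minimal stand-in for a schema field; not Spark's StructField.
  final case class Field(name: String, dataType: String)

  /** Filtering the schema (the removed code's approach) keeps fields in schema order. */
  def bySchemaOrder(schema: Seq[Field], partitionColumnNames: Seq[String]): Seq[Field] =
    schema.filter(f => partitionColumnNames.contains(f.name))

  /** Looking up each name (an assumed alternative) keeps partitionColumnNames order. */
  def byPartitionOrder(schema: Seq[Field], partitionColumnNames: Seq[String]): Seq[Field] =
    partitionColumnNames.flatMap(name => schema.find(_.name == name))
}
```

For a schema `i, j, k` and partition columns `k, j`, the first function returns `j, k` and the second returns `k, j`.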



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71603/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71442 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71442/testReport)** for PR 16593 at commit [`046ead7`](https://github.com/apache/spark/commit/046ead7c87c76c0a17852f65f51f8a746ed638a9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by lins05 <gi...@git.apache.org>.

Github user lins05 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96348933
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1343,17 +1343,41 @@ class HiveDDLSuite
           sql("INSERT INTO t SELECT 2, 'b'")
           checkAnswer(spark.table("t"), Row(9, "x") :: Row(2, "b") :: Nil)
     
    -      val e = intercept[AnalysisException] {
    -        Seq(1 -> "a").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t2")
    -      }
    -      assert(e.message.contains("A Create Table As Select (CTAS) statement is not allowed " +
    -        "to create a partitioned table using Hive"))
    -
           val e2 = intercept[AnalysisException] {
             Seq(1 -> "a").toDF("i", "j").write.format("hive").bucketBy(4, "i").saveAsTable("t2")
           }
           assert(e2.message.contains("Creating bucketed Hive serde table is not supported yet"))
     
    +      try {
    +        spark.sql("set hive.exec.dynamic.partition.mode=nonstrict")
    +        Seq(10 -> "y").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t3")
    +        checkAnswer(spark.table("t3"), Row("y", 10) :: Nil)
    +        table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
    +        var partitionSchema = table.partitionSchema
    +        assert(partitionSchema.size == 1 && partitionSchema.fields(0).name == "i" &&
    +          partitionSchema.fields(0).dataType == IntegerType)
    +
    +        Seq(11 -> "z").toDF("i", "j").write.mode("overwrite").format("hive")
    +          .partitionBy("j").saveAsTable("t3")
    +        checkAnswer(spark.table("t3"), Row(11, "z") :: Nil)
    +        table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
    +        partitionSchema = table.partitionSchema
    +        assert(partitionSchema.size == 1 && partitionSchema.fields(0).name == "j" &&
    +          partitionSchema.fields(0).dataType == StringType)
    +
    +        Seq((1, 2, 3)).toDF("i", "j", "k").write.mode("overwrite").format("hive")
    +          .partitionBy("k", "j").saveAsTable("t3")
    +        table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
    +        checkAnswer(spark.table("t3"), Row(1, 3, 2) :: Nil)
    +
    +        Seq((1, 2, 3)).toDF("i", "j", "k").write.mode("overwrite").format("hive")
    +          .partitionBy("j", "k").saveAsTable("t3")
    +        table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
    +        checkAnswer(spark.table("t3"), Row(1, 2, 3) :: Nil)
    +      } finally {
    +        spark.sql("set hive.exec.dynamic.partition.mode=strict")
    +      }
    +
    --- End diff --
    
    I think this test case is a bit fat. Maybe we can split it into two or three smaller ones, e.g.:
    
    ```scala
      test("create hive serde table with DataFrameWriter.saveAsTable - basic") ...
      test("create hive serde table with DataFrameWriter.saveAsTable - overwrite and append") ...
      test("create hive serde table with DataFrameWriter.saveAsTable - partitioned") ...
    ```



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96700244
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -87,8 +101,8 @@ case class CreateHiveTableAsSelectCommand(
           }
         } else {
           try {
    -        sparkSession.sessionState.executePlan(InsertIntoTable(
    -          metastoreRelation, Map(), query, overwrite = true, ifNotExists = false)).toRdd
    +        sparkSession.sessionState.executePlan(InsertIntoTable(metastoreRelation,
    --- End diff --
    
    We changed the order, but are we able to handle the type difference? Are we doing casting? Do we have any test case to cover it?



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71445 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71445/testReport)** for PR 16593 at commit [`76f643a`](https://github.com/apache/spark/commit/76f643a507f39099a6539899597398c35341ad63).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    good catch! introducing a new analyzer rule SGTM



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    LGTM, merging to master!



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71632/testReport)** for PR 16593 at commit [`9270851`](https://github.com/apache/spark/commit/9270851f0b358c30a14f0f63eded25b68b38b102).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71784/
    Test PASSed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71421/
    Test FAILed.



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96219465
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1343,17 +1343,38 @@ class HiveDDLSuite
           sql("INSERT INTO t SELECT 2, 'b'")
           checkAnswer(spark.table("t"), Row(9, "x") :: Row(2, "b") :: Nil)
     
    -      val e = intercept[AnalysisException] {
    -        Seq(1 -> "a").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t2")
    -      }
    -      assert(e.message.contains("A Create Table As Select (CTAS) statement is not allowed " +
    -        "to create a partitioned table using Hive"))
    -
           val e2 = intercept[AnalysisException] {
             Seq(1 -> "a").toDF("i", "j").write.format("hive").bucketBy(4, "i").saveAsTable("t2")
           }
           assert(e2.message.contains("Creating bucketed Hive serde table is not supported yet"))
     
    +      spark.sql("set hive.exec.dynamic.partition.mode=nonstrict")
    --- End diff --
    
    Please restore the conf after the test; use try-finally.
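
The try-finally pattern requested here can be sketched without Spark. In the example below the configuration is modelled as a plain mutable map, and `ConfRestoreSketch` and `withConf` are hypothetical names introduced for illustration only. One design choice worth noting: the sketch restores whatever value was there before rather than a hard-coded default, so it also behaves correctly when the key was unset or already changed by the caller.

```scala
import scala.collection.mutable

object ConfRestoreSketch {
  /** Sets `key` to `value`, runs `body`, then restores the previous state of `key`. */
  def withConf[T](conf: mutable.Map[String, String], key: String, value: String)(body: => T): T = {
    val previous = conf.get(key) // None if the key was unset before the call
    conf(key) = value
    try {
      body
    } finally {
      // Runs whether `body` returns normally or throws.
      previous match {
        case Some(old) => conf(key) = old
        case None => conf.remove(key)
      }
    }
  }
}
```

A test body would then be wrapped as `withConf(conf, "hive.exec.dynamic.partition.mode", "nonstrict") { ... }`. Spark's SQL test utilities offer a similar helper, `withSQLConf`, which may be a better fit than a hand-written try-finally if it is available in the suite.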



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Merged build finished. Test FAILed.



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    **[Test build #71784 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71784/testReport)** for PR 16593 at commit [`7bdc265`](https://github.com/apache/spark/commit/7bdc265500cbfd6b4dc16ec6a6ce7c321e7dd3dc).



[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

Posted by lins05 <gi...@git.apache.org>.

Github user lins05 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16593#discussion_r96348791
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -45,6 +46,25 @@ case class CreateHiveTableAsSelectCommand(
       override def innerChildren: Seq[LogicalPlan] = Seq(query)
     
       override def run(sparkSession: SparkSession): Seq[Row] = {
    +
    +    // relation should move partition columns to the last
    +    val (partOutputs, nonPartOutputs) = query.output.partition {
    +      a =>
    +        tableDesc.partitionColumnNames.contains(a.name)
    +    }
    +
    +    // the CTAS's SELECT partition-outputs order should be consistent with
    +    // tableDesc.partitionColumnNames
    +    val reorderPartOutputs = tableDesc.partitionColumnNames.map {
    --- End diff --
    
    nit: `reorderPartOutputs` -> `reorderedPartOutputs`. The former sounds like a verb while the latter sounds like a noun.
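
The reordering that the diff's comments describe (partition columns moved to the end, in the order of `partitionColumnNames`) can be sketched in plain Scala. Column names stand in for the query's output attributes here, and `PartitionColumnsLastSketch` and `reorder` are hypothetical names; this is not the code from the PR.

```scala
object PartitionColumnsLastSketch {
  /**
   * Returns `output` with non-partition columns first (original order kept),
   * followed by partition columns in the order given by `partitionColumnNames`.
   */
  def reorder(output: Seq[String], partitionColumnNames: Seq[String]): Seq[String] = {
    // Split the output into partition and non-partition columns.
    val (partOutputs, nonPartOutputs) =
      output.partition(name => partitionColumnNames.contains(name))
    // Re-sequence the partition columns to follow the declared partition order.
    val reorderedPartOutputs =
      partitionColumnNames.flatMap(name => partOutputs.find(_ == name))
    nonPartOutputs ++ reorderedPartOutputs
  }
}
```

With output `i, j, k` and partition columns `k, j`, the result is `i, k, j`, which matches the column order the test in this PR expects after `partitionBy("k", "j")`.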



[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16593
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71423/
    Test FAILed.

