Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2017/05/21 18:29:18 UTC

[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/18050

    [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tables with IF NOT EXISTS

    ### What changes were proposed in this pull request?
    Currently, `INSERT OVERWRITE` fails on data source tables when `IF NOT EXISTS` is specified. For example, given the query:
    ```SQL
    INSERT OVERWRITE TABLE $tableName partition (b=2, c=3) IF NOT EXISTS SELECT 9, 10
    ```
    we will get the following error:
    ```
    unresolved operator 'InsertIntoTable Relation[a#425,d#426,b#427,c#428] parquet, Map(b -> Some(2), c -> Some(3)), true, true;;
    'InsertIntoTable Relation[a#425,d#426,b#427,c#428] parquet, Map(b -> Some(2), c -> Some(3)), true, true
    +- Project [cast(9#423 as int) AS a#429, cast(10#424 as int) AS d#430]
       +- Project [9 AS 9#423, 10 AS 10#424]
          +- OneRowRelation$
    ```
    
    This PR fixes the issue so that data source tables follow the behavior of Hive serde tables:
    > INSERT OVERWRITE will overwrite any existing data in the table or partition unless IF NOT EXISTS is provided for a partition
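
    The quoted rule can be sketched with a toy model. This is only an illustration, not Spark's implementation: the table is modeled as an in-memory map, and the names `InsertOverwriteSketch`, `insertOverwrite`, `PartitionSpec`, and `Table` are hypothetical. The flag name follows the `ifPartitionNotExists` suggestion from the review, and the values mirror the modified test case.

    ```scala
    // Toy model only: a partitioned table as a map from partition spec to rows.
    // None of these names are Spark APIs; they are assumptions for illustration.
    object InsertOverwriteSketch {
      type PartitionSpec = Map[String, Int]
      type Table = Map[PartitionSpec, Seq[(Int, Int)]]

      def insertOverwrite(
          table: Table,
          spec: PartitionSpec,
          rows: Seq[(Int, Int)],
          ifPartitionNotExists: Boolean): Table = {
        if (ifPartitionNotExists && table.contains(spec)) {
          table // the partition already exists: keep its data and skip the write
        } else {
          table.updated(spec, rows) // create the partition or overwrite its data
        }
      }

      def main(args: Array[String]): Unit = {
        val spec: PartitionSpec = Map("b" -> 2, "c" -> 3)
        val empty: Table = Map.empty
        val t1 = insertOverwrite(empty, spec, Seq((1, 4)), ifPartitionNotExists = false)
        val t2 = insertOverwrite(t1, spec, Seq((5, 6)), ifPartitionNotExists = false)
        val t3 = insertOverwrite(t2, spec, Seq((9, 10)), ifPartitionNotExists = true)
        assert(t2(spec) == Seq((5, 6))) // a plain overwrite replaces the rows
        assert(t3(spec) == Seq((5, 6))) // IF NOT EXISTS leaves the existing rows alone
      }
    }
    ```

    In this model, the last insert is a no-op because partition `(b=2, c=3)` already exists, which matches the final `checkAnswer` in the modified test.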
    
    
    ### How was this patch tested?
    Modified an existing test case.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark insertPartitionIfNotExists

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18050.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18050
    
----
commit 23b5ef5e1424dda7f3b19d7a987fe80f9faee533
Author: gatorsmile <ga...@gmail.com>
Date:   2017-05-21T18:18:25Z

    fix.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    **[Test build #77173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77173/testReport)** for PR 18050 at commit [`1bf07f4`](https://github.com/apache/spark/commit/1bf07f4cde48c4ea495c00e75643a2506d6e0f76).




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117681968
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    --- End diff --
    
    It's a little weird to test data source table insertion in `InsertIntoHiveTableSuite`...




[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117658942
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    +    val selQuery = s"select a, b, c, d from $tableName"
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 1, 4
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(1, 2, 3, 4))
     
    -        var e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'newPartition'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 5, 6
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
    +
    +    val e = intercept[AnalysisException] {
    +      sql(
    +        s"""
    +           |INSERT OVERWRITE TABLE $tableName
    +           |partition (b=2, c) IF NOT EXISTS
    +           |SELECT 7, 8, 3
    +          """.stripMargin)
    +    }
    +    assert(e.getMessage.contains(
    +      "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [c]"))
     
    -        e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'b'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    // If the partition already exists, the insert will overwrite the data
    +    // unless users specify IF NOT EXISTS
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3) IF NOT EXISTS
    +         |SELECT 9, 10
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
     
    -        // If the partition already exists, the insert will overwrite the data
    --- End diff --
    
    That is why I used `testPartitionedTable` for this test. It covers both data source and Hive serde tables.




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117683001
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    --- End diff --
    
    They are still under hive/?




[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    cc @cloud-fan 




[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    **[Test build #77153 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77153/testReport)** for PR 18050 at commit [`23b5ef5`](https://github.com/apache/spark/commit/23b5ef5e1424dda7f3b19d7a987fe80f9faee533).




[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77153/
    Test PASSed.




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117659055
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    +    val selQuery = s"select a, b, c, d from $tableName"
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 1, 4
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(1, 2, 3, 4))
     
    -        var e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'newPartition'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 5, 6
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
    +
    +    val e = intercept[AnalysisException] {
    +      sql(
    +        s"""
    +           |INSERT OVERWRITE TABLE $tableName
    +           |partition (b=2, c) IF NOT EXISTS
    +           |SELECT 7, 8, 3
    +          """.stripMargin)
    +    }
    +    assert(e.getMessage.contains(
    +      "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [c]"))
     
    -        e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'b'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    // If the partition already exists, the insert will overwrite the data
    +    // unless users specify IF NOT EXISTS
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3) IF NOT EXISTS
    +         |SELECT 9, 10
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
     
    -        // If the partition already exists, the insert will overwrite the data
    --- End diff --
    
    Ok. Looks good.




[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    LGTM




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117683698
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    --- End diff --
    
    We can move the test cases to a new suite in /core and let `InsertIntoHiveTableSuite` extend it.




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18050




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117653786
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    +    val selQuery = s"select a, b, c, d from $tableName"
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 1, 4
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(1, 2, 3, 4))
     
    -        var e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'newPartition'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 5, 6
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
    +
    +    val e = intercept[AnalysisException] {
    +      sql(
    +        s"""
    +           |INSERT OVERWRITE TABLE $tableName
    +           |partition (b=2, c) IF NOT EXISTS
    +           |SELECT 7, 8, 3
    +          """.stripMargin)
    +    }
    +    assert(e.getMessage.contains(
    +      "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [c]"))
     
    -        e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'b'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    // If the partition already exists, the insert will overwrite the data
    +    // unless users specify IF NOT EXISTS
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3) IF NOT EXISTS
    +         |SELECT 9, 10
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
     
    -        // If the partition already exists, the insert will overwrite the data
    --- End diff --
    
    Oh. This is for Hive tables.




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117682360
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    --- End diff --
    
    Anyway, we can fix it later.




[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77173/
    Test PASSed.




[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117669879
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
    @@ -410,17 +410,20 @@ case class Hint(name: String, parameters: Seq[String], child: LogicalPlan) exten
      *                  would have Map('a' -> Some('1'), 'b' -> None).
      * @param query the logical plan representing data to write to.
      * @param overwrite overwrite existing table or partitions.
    - * @param ifNotExists If true, only write if the table or partition does not exist.
    + * @param ifStaticPartitionNotExists If true, only write if the partition does not exist.
    --- End diff --
    
    this name is a little verbose, how about `ifPartitionNotExists`?


[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    thanks, merging to master/2.2!


[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117683751
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    --- End diff --
    
    Yeah, we can do it as a test-only PR. 


[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    **[Test build #77173 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77173/testReport)** for PR 18050 at commit [`1bf07f4`](https://github.com/apache/spark/commit/1bf07f4cde48c4ea495c00e75643a2506d6e0f76).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117653219
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    +    val selQuery = s"select a, b, c, d from $tableName"
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 1, 4
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(1, 2, 3, 4))
     
    -        var e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'newPartition'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 5, 6
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
    +
    +    val e = intercept[AnalysisException] {
    +      sql(
    +        s"""
    +           |INSERT OVERWRITE TABLE $tableName
    +           |partition (b=2, c) IF NOT EXISTS
    +           |SELECT 7, 8, 3
    +          """.stripMargin)
    +    }
    +    assert(e.getMessage.contains(
    +      "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [c]"))
     
    -        e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'b'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    // If the partition already exists, the insert will overwrite the data
    +    // unless users specify IF NOT EXISTS
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3) IF NOT EXISTS
    +         |SELECT 9, 10
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
     
    -        // If the partition already exists, the insert will overwrite the data
    --- End diff --
    
    It seems `IF NOT EXISTS` already worked with `INSERT OVERWRITE` for Hive serde tables before this change? It doesn't overwrite the existing data.
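
The behavior the quoted test asserts can be modeled without Spark. The sketch below is a hypothetical in-memory model, not Spark's implementation: `InsertOverwriteModel`, `PartitionedTable`, `insertOverwrite`, and `rows` are invented names, and the flag name `ifPartitionNotExists` simply follows the rename suggested in this thread. It assumes fully static partition specs and reuses the error message asserted in the test above.

```scala
// Hypothetical model of INSERT OVERWRITE ... PARTITION (...) [IF NOT EXISTS].
// This is NOT Spark code; it only illustrates the semantics discussed here.
object InsertOverwriteModel {
  // Partition spec: column -> Some(value) for a static partition column,
  // None for a dynamic one (same shape as Map(b -> Some(2), c -> None)).
  type PartitionSpec = Map[String, Option[String]]

  final class PartitionedTable {
    // Static partition values -> rows currently stored in that partition.
    private var partitions = Map.empty[Map[String, String], Seq[Seq[Int]]]

    def rows(part: Map[String, String]): Seq[Seq[Int]] =
      partitions.getOrElse(part, Seq.empty)

    def insertOverwrite(
        spec: PartitionSpec,
        data: Seq[Seq[Int]],
        ifPartitionNotExists: Boolean): Unit = {
      val dynamicCols = spec.toSeq.collect { case (col, None) => col }
      // IF NOT EXISTS is rejected as soon as any partition column is dynamic.
      if (ifPartitionNotExists && dynamicCols.nonEmpty) {
        throw new IllegalArgumentException(
          "Dynamic partitions do not support IF NOT EXISTS. " +
            s"Specified partitions with value: ${dynamicCols.mkString("[", ",", "]")}")
      }
      // Assumption: this sketch does not model dynamic partition inserts.
      require(dynamicCols.isEmpty, "this sketch models static partitions only")

      val staticPart = spec.toSeq.collect { case (col, Some(v)) => col -> v }.toMap
      // Skip the write only when IF NOT EXISTS is given and the partition exists;
      // otherwise replace whatever the partition currently holds.
      val skip = ifPartitionNotExists && partitions.contains(staticPart)
      if (!skip) partitions = partitions.updated(staticPart, data)
    }
  }
}
```

Replaying the steps of the test against this model: inserting `(1, 4)` and then `(5, 6)` into partition `(b=2, c=3)` leaves `(5, 6)`, and a later insert of `(9, 10)` with `ifPartitionNotExists = true` leaves the partition unchanged.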


[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117683231
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    --- End diff --
    
    If it is not too verbose, I'd like to have them separated, instead of mixing them together in one test suite.


[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117682400
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    --- End diff --
    
    Create a new suite `InsertIntoTableSuite` and move all these tests to it?


[GitHub] spark issue #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data source tab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18050
  
    **[Test build #77153 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77153/testReport)** for PR 18050 at commit [`23b5ef5`](https://github.com/apache/spark/commit/23b5ef5e1424dda7f3b19d7a987fe80f9faee533).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #18050: [SPARK-20831] [SQL] Fix INSERT OVERWRITE data sou...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18050#discussion_r117653850
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -166,72 +166,54 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
         sql("DROP TABLE tmp_table")
       }
     
    -  test("INSERT OVERWRITE - partition IF NOT EXISTS") {
    -    withTempDir { tmpDir =>
    -      val table = "table_with_partition"
    -      withTable(table) {
    -        val selQuery = s"select c1, p1, p2 from $table"
    -        sql(
    -          s"""
    -             |CREATE TABLE $table(c1 string)
    -             |PARTITIONED by (p1 string,p2 string)
    -             |location '${tmpDir.toURI.toString}'
    -           """.stripMargin)
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr", "a", "b"))
    -
    -        sql(
    -          s"""
    -             |INSERT OVERWRITE TABLE $table
    -             |partition (p1='a',p2='b')
    -             |SELECT 'blarr2'
    -           """.stripMargin)
    -        checkAnswer(
    -          sql(selQuery),
    -          Row("blarr2", "a", "b"))
    +  testPartitionedTable("INSERT OVERWRITE - partition IF NOT EXISTS") { tableName =>
    +    val selQuery = s"select a, b, c, d from $tableName"
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 1, 4
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(1, 2, 3, 4))
     
    -        var e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'newPartition'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3)
    +         |SELECT 5, 6
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
    +
    +    val e = intercept[AnalysisException] {
    +      sql(
    +        s"""
    +           |INSERT OVERWRITE TABLE $tableName
    +           |partition (b=2, c) IF NOT EXISTS
    +           |SELECT 7, 8, 3
    +          """.stripMargin)
    +    }
    +    assert(e.getMessage.contains(
    +      "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [c]"))
     
    -        e = intercept[AnalysisException] {
    -          sql(
    -            s"""
    -               |INSERT OVERWRITE TABLE $table
    -               |partition (p1='a',p2) IF NOT EXISTS
    -               |SELECT 'blarr3', 'b'
    -             """.stripMargin)
    -        }
    -        assert(e.getMessage.contains(
    -          "Dynamic partitions do not support IF NOT EXISTS. Specified partitions with value: [p2]"))
    +    // If the partition already exists, the insert will overwrite the data
    +    // unless users specify IF NOT EXISTS
    +    sql(
    +      s"""
    +         |INSERT OVERWRITE TABLE $tableName
    +         |partition (b=2, c=3) IF NOT EXISTS
    +         |SELECT 9, 10
    +        """.stripMargin)
    +    checkAnswer(sql(selQuery), Row(5, 2, 3, 6))
     
    -        // If the partition already exists, the insert will overwrite the data
    --- End diff --
    
    So it seems we should also add a test for `IF NOT EXISTS` with `INSERT OVERWRITE` on a data source table?

