Posted to reviews@spark.apache.org by windpiger <gi...@git.apache.org> on 2017/01/22 09:30:45 UTC

[GitHub] spark pull request #16672: [SPARK-19329][SQL]insert data to a not exist loca...

GitHub user windpiger opened a pull request:

    https://github.com/apache/spark/pull/16672

    [SPARK-19329][SQL]insert data to a not exist location datasource table should success

    ## What changes were proposed in this pull request?
    
    When we insert data into a data source table using `sqlText` and the table's location does not exist,
    an exception is thrown.
    
    example:
    
    ```
    spark.sql("create table t(a string, b int) using parquet")
    spark.sql("alter table t set location '/xx'")
    spark.sql("insert into table t select 'c', 1")
    ```
    
    Exception:
    ```
    com.google.common.util.concurrent.UncheckedExecutionException: org.apache.spark.sql.AnalysisException: Path does not exist: /xx;
    at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4814)
    at com.google.common.cache.LocalCache$LocalLoadingCache.apply(LocalCache.java:4830)
    at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:122)
    at org.apache.spark.sql.hive.HiveSessionCatalog.lookupRelation(HiveSessionCatalog.scala:69)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:456)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:465)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:463)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:60)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:463)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:453)
    ```
    
    
    
    ## How was this patch tested?
    A unit test was added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/windpiger/spark insertNotExistLocation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16672.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16672
    
----
commit 5ec7dd6987ff2cdbadda0eb45f6fdd8aacaf92fd
Author: windpiger <so...@outlook.com>
Date:   2017-01-22T09:19:16Z

    [SPARK-19329][SQL]insert data to a not exist location datasource table should success

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72746 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72746/testReport)** for PR 16672 at commit [`abc57dd`](https://github.com/apache/spark/commit/abc57ddedde78cfd8e94125416423cbcd4e56f71).




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a ta...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100703121
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
    @@ -754,6 +754,8 @@ case class AlterTableSetLocationCommand(
             // No partition spec is specified, so we set the location for the table itself
             catalog.alterTable(table.withNewStorage(locationUri = Some(location)))
         }
    +
    +    catalog.refreshTable(table.identifier)
    --- End diff --
    
    This is not the right fix. Please see the discussion in a separate PR: https://github.com/apache/spark/pull/16514#issuecomment-271232337
    
    Please focus on the scope of this PR. Thanks!




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72765 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72765/testReport)** for PR 16672 at commit [`542f86b`](https://github.com/apache/spark/commit/542f86ba6d1037153fb9752472f670c3e81fedbf).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #16672: [SPARK-19329][SQL]insert/read data to a not exist...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100684307
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1431,4 +1431,30 @@ class HiveDDLSuite
           }
         }
       }
    +
    +  test("insert data to a table which has altered the table location " +
    +    "to an not exist location should success") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""create table t(a string, b int)
    --- End diff --
    
    General style suggestion: please use upper case for SQL keywords. For example, this SQL statement can be improved to:
    ```
    CREATE TABLE t(a STRING, b INT)
    USING parquet
    OPTIONS(path "xyz")
    ```




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a data-sour...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72766 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72766/testReport)** for PR 16672 at commit [`2cbb9d6`](https://github.com/apache/spark/commit/2cbb9d68171bf66c2e1494c3f346b69b04e4b7de).




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72869 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72869/testReport)** for PR 16672 at commit [`0d947a5`](https://github.com/apache/spark/commit/0d947a55a80ecc63eb15092c29b2c44aeeb197e5).




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    @gatorsmile I'd like to take that. https://issues.apache.org/jira/browse/SPARK-19583 Thanks~




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71898/
    Test PASSed.




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r101388054
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,123 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +             |CREATE TABLE t(a string, b int)
    +             |USING parquet
    +             |OPTIONS(path "$dir")
    --- End diff --
    
    What if the path doesn't exist when the table is created? Will we succeed or fail?




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72746/
    Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71803/
    Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #71898 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71898/testReport)** for PR 16672 at commit [`abc57dd`](https://github.com/apache/spark/commit/abc57ddedde78cfd8e94125416423cbcd4e56f71).




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    We also need to add a test case for in-memory catalog. Maybe we can wait until https://github.com/apache/spark/pull/16592 is resolved?
    
    Actually, this fix is not right. We should create the directory when we set the location.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Could you move the test cases to `DDLSuite.scala`? This is not Hive-specific. Thanks!




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    OK, let me create a new PR for Hive serde tables and continue to finish this PR.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r101388447
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,123 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +             |CREATE TABLE t(a string, b int)
    +             |USING parquet
    +             |OPTIONS(path "$dir")
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = dir.getAbsolutePath.stripSuffix("/")
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        dir.delete
    +        val tableLocFile = new File(table.location.stripPrefix("file:"))
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        Utils.deleteRecursively(dir)
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT OVERWRITE TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        val newDir = dir.getAbsolutePath.stripSuffix("/") + "/x"
    --- End diff --
    
    nit: `new File(dir, "x")`
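
The nit above can be illustrated with a minimal, standalone sketch that uses only `java.io.File` and no Spark classes. The object name `ChildPathSketch` and its two helpers are hypothetical names introduced here for illustration; they are not part of the PR. The sketch contrasts the string concatenation from the diff with the two-argument `File` constructor, which lets the JDK pick the platform's separator:

```scala
import java.io.File
import java.nio.file.Files

object ChildPathSketch {
  // Builds the child path by string concatenation, as in the diff above.
  // This hard-codes "/" as the separator.
  def byConcat(dir: File): String =
    dir.getAbsolutePath.stripSuffix("/") + "/x"

  // Builds the child path with the two-argument File constructor,
  // which joins parent and child using the platform's separator.
  def byFile(dir: File): File = new File(dir, "x")

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("ddl-sketch").toFile
    try {
      println(byConcat(dir))
      println(byFile(dir).getAbsolutePath)
    } finally {
      dir.delete()
    }
  }
}
```

On a Unix-like file system both helpers should name the same path; the constructor form simply avoids the manual `stripSuffix` and separator handling.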




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/16672




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    To backport https://github.com/apache/spark/pull/17097, we need to backport multiple PRs. This is one of them.
    
    @windpiger Could you please submit a PR to backport it to Spark 2.1? 




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Could you please add test cases for the scenarios you explained above (a location that does not already exist)? Thanks!




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r101541205
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,123 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +             |CREATE TABLE t(a string, b int)
    +             |USING parquet
    +             |OPTIONS(path "$dir")
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = dir.getAbsolutePath.stripSuffix("/")
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        dir.delete
    +        val tableLocFile = new File(table.location.stripPrefix("file:"))
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        Utils.deleteRecursively(dir)
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT OVERWRITE TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        val newDir = dir.getAbsolutePath.stripSuffix("/") + "/x"
    --- End diff --
    
    OK, I will fix this when I do another PR. Thanks!




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100735110
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1431,4 +1432,133 @@ class HiveDDLSuite
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""CREATE TABLE t(a string, b int)
    +              |USING parquet
    +              |OPTIONS(path "file:${dir.getCanonicalPath}")
    +           """.stripMargin)
    +        var table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    --- End diff --
    
    Another general comment. Please avoid using `var`, if possible.
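
As a rough illustration of the `var` comment above: the rest of the test is not shown in this diff, so the sketch below assumes the `var` was there so that the metadata could be re-fetched after the table location changes. Everything in it is hypothetical and Spark-free: `TableMeta` stands in for the catalog's table metadata and `lookup` stands in for `getTableMetadata`. The idea is to bind each lookup to its own `val` instead of reassigning one variable:

```scala
object AvoidVarSketch {
  // Hypothetical stand-in for catalog table metadata; not Spark's CatalogTable.
  final case class TableMeta(name: String, location: String)

  // Hypothetical lookup standing in for catalog.getTableMetadata.
  def lookup(catalog: Map[String, TableMeta], name: String): TableMeta =
    catalog(name)

  def main(args: Array[String]): Unit = {
    val before = Map("t" -> TableMeta("t", "/tmp/a"))
    val tableBefore = lookup(before, "t")

    // Simulate a location change by building a new immutable map.
    val after = before.updated("t", tableBefore.copy(location = "/tmp/b"))
    val tableAfter = lookup(after, "t")

    // Each snapshot keeps its own name, so no reassignment is needed.
    assert(tableBefore.location == "/tmp/a")
    assert(tableAfter.location == "/tmp/b")
  }
}
```

Naming the two snapshots separately also makes the assertions easier to read, since it is clear which state of the table each one checks.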




[GitHub] spark pull request #16672: [SPARK-19329][SQL]insert data to a not exist loca...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r97230104
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1431,4 +1431,27 @@ class HiveDDLSuite
           }
         }
       }
    +
    +  test("insert data to a table which has altered the table location " +
    +    "to an not exist location should success") {
    +    withTable("t", "t1") {
    --- End diff --
    
    `t1`?




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    I'm wondering how Hive reads a table with a non-existing path.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #71808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71808/testReport)** for PR 16672 at commit [`2196648`](https://github.com/apache/spark/commit/21966484c9b44ca7d509ee017a10175b48300283).




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Thanks! Merging to master.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #71898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71898/testReport)** for PR 16672 at commit [`abc57dd`](https://github.com/apache/spark/commit/abc57ddedde78cfd8e94125416423cbcd4e56f71).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72869/
    Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert/read data to a not exist locati...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    : ) `success` is a noun and `exist` is a verb. 
    
    `insert/read data to a not exist location datasource table should success` -> `Reading from or writing to a data-source table with a non pre-existing location should succeed`




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72775 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72775/testReport)** for PR 16672 at commit [`c3e87d3`](https://github.com/apache/spark/commit/c3e87d356cf99e40e853cb3845fe4b93462c457b).




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a data-sour...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72765 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72765/testReport)** for PR 16672 at commit [`542f86b`](https://github.com/apache/spark/commit/542f86ba6d1037153fb9752472f670c3e81fedbf).




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72746 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72746/testReport)** for PR 16672 at commit [`abc57dd`](https://github.com/apache/spark/commit/abc57ddedde78cfd8e94125416423cbcd4e56f71).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    retest this please




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    @gatorsmile I have addressed some of the review comments. Could you continue reviewing this?




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71808/
    Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72814 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72814/testReport)** for PR 16672 at commit [`dee844c`](https://github.com/apache/spark/commit/dee844ce73defc68116913569733275a2f1d5529).




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build started.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    In Hive:
    1. Reading a table whose path does not exist throws no exception and returns 0 rows.
    2. Reading a table whose path the user has no permission to read throws a runtime exception:
    ```
    FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if hdfs:/tmp/noownerpermission is encrypted: org.apache.hadoop.security.AccessControlException: Permission denied: user=test, access=READ, inode="/tmp/noownerpermission":hadoop:hadoop:drwxr-x--x
    	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320)
    
    ```
     
    3. Writing to a non-existent path creates the path and inserts the data into it; everything works.
    
    4. Writing to a path the user has no permission to write throws an exception.
    
    5. Running `ALTER TABLE ... SET LOCATION` with a path the user has no permission on succeeds.
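
    Items 1 and 3 above can be sketched with plain JVM file APIs. This only illustrates the semantics and is not Hive's or Spark's implementation; `MissingLocationSketch`, `readRows`, `insertRows` and the single `data.txt` file layout are hypothetical:

    ```scala
    import java.nio.charset.StandardCharsets
    import java.nio.file.{Files, Path, StandardOpenOption}
    import scala.collection.JavaConverters._

    object MissingLocationSketch {
      // Item 1: a missing table location reads as an empty table instead of failing.
      def readRows(location: Path): Seq[String] = {
        val data = location.resolve("data.txt")
        if (Files.exists(data)) Files.readAllLines(data, StandardCharsets.UTF_8).asScala.toList
        else Seq.empty
      }

      // Item 3: a write creates the missing location first, then appends the rows.
      def insertRows(location: Path, rows: Seq[String]): Unit = {
        Files.createDirectories(location)
        Files.write(
          location.resolve("data.txt"),
          rows.asJava,
          StandardCharsets.UTF_8,
          StandardOpenOption.CREATE,
          StandardOpenOption.APPEND)
      }
    }
    ```

    Reading never creates the location; only the first insert does, which matches the order of checks in the tests added by this pull request.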





[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100963830
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,127 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a string, b int)
    +              |USING parquet
    +              |OPTIONS(path "file:${dir.getCanonicalPath}")
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        dir.delete
    +        val tableLocFile = new File(table.location.stripPrefix("file:"))
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        Utils.deleteRecursively(dir)
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT OVERWRITE TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        val newDir = dir.getAbsolutePath.stripSuffix("/") + "/x"
    +        val newDirFile = new File(newDir)
    +        spark.sql(s"ALTER TABLE t SET LOCATION '$newDir'")
    +        spark.sessionState.catalog.refreshTable(TableIdentifier("t"))
    +
    +        val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        assert(table1.location == newDir)
    +        assert(!newDirFile.exists)
    +
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        assert(newDirFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +      }
    +    }
    +  }
    +
    +  test("insert into a data source table with no existed partition location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a int, b int, c int, d int)
    +              |USING parquet
    +              |PARTITIONED BY(a, b)
    +              |LOCATION "file:${dir.getCanonicalPath}"
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        spark.sql("INSERT INTO TABLE t PARTITION(a=1, b=2) SELECT 3, 4")
    +        checkAnswer(spark.table("t"), Row(3, 4, 1, 2) :: Nil)
    +
    +        val partLoc = new File(s"${dir.getAbsolutePath}/a=1")
    +        Utils.deleteRecursively(partLoc)
    +        assert(!partLoc.exists())
    +        // insert overwrite into a partition which location has been deleted.
    +        spark.sql("INSERT OVERWRITE TABLE t PARTITION(a=1, b=2) SELECT 7, 8")
    +        assert(partLoc.exists())
    +        checkAnswer(spark.table("t"), Row(7, 8, 1, 2) :: Nil)
    +
    +        // TODO:insert into a partition after alter the partition location by alter command
    --- End diff --
    
    I found a bug in this situation and created a JIRA ticket:
    https://issues.apache.org/jira/browse/SPARK-19577
    
    Shall we just forbid this situation, or fix it?




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
     Merged build triggered.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72813/
    Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    LGTM




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    ping @gatorsmile 




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72813 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72813/testReport)** for PR 16672 at commit [`b238e8d`](https://github.com/apache/spark/commit/b238e8d34fc08b3f641c610dada582ca3ee2be2b).




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #71808 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71808/testReport)** for PR 16672 at commit [`2196648`](https://github.com/apache/spark/commit/21966484c9b44ca7d509ee017a10175b48300283).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build started.




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r101590857
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,123 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +             |CREATE TABLE t(a string, b int)
    +             |USING parquet
    +             |OPTIONS(path "$dir")
    --- End diff --
    
    What's the behavior of Hive?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16672: [SPARK-19329][SQL]insert/read data to a not exist locati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72759 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72759/testReport)** for PR 16672 at commit [`c3439ff`](https://github.com/apache/spark/commit/c3439ffecfcde7ecc06b6dd40e1d085c433eea94).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16672: [SPARK-19329][SQL]insert/read data to a not exist locati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72759/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    If we create the location path when user A runs ALTER TABLE, user B may later run the job without permission to write data to that location. Isn't that also unfriendly? Maybe it is more appropriate to throw a runtime exception at write time and not create the path during ALTER TABLE.
    @gatorsmile @cloud-fan @yhuai @hvanhovell 
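
    A minimal sketch of this proposal, assuming a simplified catalog entry rather than Spark's real classes: changing the location only updates metadata, and the path is created by whoever runs the write, so any permission or I/O problem surfaces there as a runtime exception. `DeferredLocationSketch`, `TableMeta`, `alterLocation` and `prepareWrite` are hypothetical names:

    ```scala
    import java.io.IOException
    import java.nio.file.{Files, Path}

    object DeferredLocationSketch {
      // Hypothetical stand-in for catalog metadata; only the location is modeled.
      final case class TableMeta(name: String, location: Path)

      // Changing the location only updates metadata; the path is neither created nor checked.
      def alterLocation(table: TableMeta, newLocation: Path): TableMeta =
        table.copy(location = newLocation)

      // The writer creates the path, so a failure is reported at write time.
      def prepareWrite(table: TableMeta): Path =
        try {
          Files.createDirectories(table.location)
        } catch {
          case e: IOException =>
            throw new RuntimeException(
              s"Cannot write to ${table.location} for table ${table.name}", e)
        }
    }
    ```

    With this split, the user who alters the table never needs write access to the new location; only the user who inserts does.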




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72775/




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72855/




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72775 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72775/testReport)** for PR 16672 at commit [`c3e87d3`](https://github.com/apache/spark/commit/c3e87d356cf99e40e853cb3845fe4b93462c457b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72765/




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    I tested this in Hive: ALTER TABLE SET LOCATION does not create the directory; the directory is created when we insert data.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72814/




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100861842
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,127 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a string, b int)
    +              |USING parquet
    +              |OPTIONS(path "file:${dir.getCanonicalPath}")
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        dir.delete
    +        val tableLocFile = new File(table.location.stripPrefix("file:"))
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        Utils.deleteRecursively(dir)
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT OVERWRITE TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        val newDir = dir.getAbsolutePath.stripSuffix("/") + "/x"
    +        val newDirFile = new File(newDir)
    +        spark.sql(s"ALTER TABLE t SET LOCATION '$newDir'")
    +        spark.sessionState.catalog.refreshTable(TableIdentifier("t"))
    +
    +        val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        assert(table1.location == newDir)
    +        assert(!newDirFile.exists)
    +
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        assert(newDirFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +      }
    +    }
    +  }
    +
    +  test("insert into a data source table with no existed partition location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a int, b int, c int, d int)
    +              |USING parquet
    +              |PARTITIONED BY(a, b)
    +              |LOCATION "file:${dir.getCanonicalPath}"
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        spark.sql("INSERT INTO TABLE t PARTITION(a=1, b=2) SELECT 3, 4")
    +        checkAnswer(spark.table("t"), Row(3, 4, 1, 2) :: Nil)
    +
    +        val partLoc = new File(s"${dir.getAbsolutePath}/a=1")
    +        Utils.deleteRecursively(partLoc)
    +        assert(!partLoc.exists())
    +        // insert overwrite into a partition which location has been deleted.
    +        spark.sql("INSERT OVERWRITE TABLE t PARTITION(a=1, b=2) SELECT 7, 8")
    +        assert(partLoc.exists())
    +        checkAnswer(spark.table("t"), Row(7, 8, 1, 2) :: Nil)
    +
    +        // TODO:insert into a partition after alter the partition location by alter command
    --- End diff --
    
    To other reviewers: ALTER TABLE SET LOCATION for partition is not allowed for tables defined using the datasource API




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    `SET LOCATION` can also be applied to individual partitions of the table. Are you planning to handle that in this PR as well? If it already works, please add test case(s) for it.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72766 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72766/testReport)** for PR 16672 at commit [`2cbb9d6`](https://github.com/apache/spark/commit/2cbb9d68171bf66c2e1494c3f346b69b04e4b7de).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72814 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72814/testReport)** for PR 16672 at commit [`dee844c`](https://github.com/apache/spark/commit/dee844ce73defc68116913569733275a2f1d5529).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    How about creating a separate PR for Hive serde tables? This PR can focus on the issues of data source tables. Thanks!




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert/read data to a not exist locati...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    More test cases for non-pre-existing locations? For example, INSERT without an ALTER LOCATION? You can simply drop the directory. This scenario is reasonable when the table is external. 




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #71803 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71803/testReport)** for PR 16672 at commit [`5ec7dd6`](https://github.com/apache/spark/commit/5ec7dd6987ff2cdbadda0eb45f6fdd8aacaf92fd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72869 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72869/testReport)** for PR 16672 at commit [`0d947a5`](https://github.com/apache/spark/commit/0d947a55a80ecc63eb15092c29b2c44aeeb197e5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72855/testReport)** for PR 16672 at commit [`0d947a5`](https://github.com/apache/spark/commit/0d947a55a80ecc63eb15092c29b2c44aeeb197e5).




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r101541033
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,123 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +             |CREATE TABLE t(a string, b int)
    +             |USING parquet
    +             |OPTIONS(path "$dir")
    --- End diff --
    
    Currently, it throws an exception saying that the path does not exist. Maybe we can check whether the path is a directory or a file: a directory is allowed not to exist, but a file must exist?
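    
    A rough sketch of such a check, under one loud assumption: a missing path cannot be classified as file or directory by inspection, so the caller has to say which one it expects. `PathCheck`, `validate`, and `expectDirectory` are hypothetical names for illustration, not part of Spark.
    
    ```Scala
    import java.nio.file.{Files, Path}
    
    object PathCheck {
      // Returns an error message, or None when the path is acceptable.
      // Assumption: expectDirectory comes from the caller, because a path
      // that does not exist yet carries no file/directory information.
      def validate(path: Path, expectDirectory: Boolean): Option[String] =
        if (Files.exists(path)) {
          if (expectDirectory && !Files.isDirectory(path)) Some(s"Not a directory: $path")
          else None
        } else if (expectDirectory) {
          None // a missing directory can be created at write time
        } else {
          Some(s"Path does not exist: $path")
        }
    }
    ```
    
    The open question in this design is where the expectation comes from; the thread does not settle that.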




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #71803 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71803/testReport)** for PR 16672 at commit [`5ec7dd6`](https://github.com/apache/spark/commit/5ec7dd6987ff2cdbadda0eb45f6fdd8aacaf92fd).




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100735007
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1431,4 +1432,133 @@ class HiveDDLSuite
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""CREATE TABLE t(a string, b int)
    +              |USING parquet
    +              |OPTIONS(path "file:${dir.getCanonicalPath}")
    +           """.stripMargin)
    --- End diff --
    
    A general comment about the style: we prefer the following indentation style.
    ```Scala
            sql(
              """
                |SELECT '1' AS part, key, value FROM VALUES
                |(1, "one"), (2, "two"), (3, null) AS data(key, value)
              """.stripMargin)
    ```




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72804/




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    I am not sure whether we should follow Hive in this case. The path might be wrong, or the user might have no permission to create such a directory. Thus, it might be more user-friendly if they get the directory-creation error when changing the location. cc @cloud-fan @yhuai @hvanhovell 
    
    This PR focuses on the write path. How about the read path? How does Hive behave when we try to select from a table whose location/directory has not been created? What is the behavior of Spark SQL?




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    retest this please




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert/read data to a not exist locati...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Yes, let me add these tests.




[GitHub] spark pull request #16672: [SPARK-19329][SQL]insert data to a not exist loca...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r97238455
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -240,7 +240,7 @@ class FindDataSourceTable(sparkSession: SparkSession) extends Rule[LogicalPlan]
                 // TODO: improve `InMemoryCatalog` and remove this limitation.
                 catalogTable = if (withHiveSupport) Some(table) else None)
     
    -        LogicalRelation(dataSource.resolveRelation(), catalogTable = Some(table))
    +        LogicalRelation(dataSource.resolveRelation(false), catalogTable = Some(table))
    --- End diff --
    
    Thanks~




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a ta...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100725345
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
    @@ -754,6 +754,8 @@ case class AlterTableSetLocationCommand(
             // No partition spec is specified, so we set the location for the table itself
             catalog.alterTable(table.withNewStorage(locationUri = Some(location)))
         }
    +
    +    catalog.refreshTable(table.identifier)
    --- End diff --
    
    Sorry, the test case hit this bug, so I fixed it here. I will work around the bug by clearing the cache instead.
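    
    For readers following along, a toy model of the kind of stale-cache problem being discussed: a lookup cache keeps serving the old location after the metadata changes unless the entry is invalidated. `RelationCache` and its methods are hypothetical and only illustrate the idea; this is not how Spark's catalog is implemented.
    
    ```Scala
    import scala.collection.mutable
    
    object RelationCache {
      // Source of truth: table name -> location (stands in for catalog metadata).
      private val locations = mutable.Map[String, String]()
      // Cache of resolved lookups: table name -> location seen at resolution time.
      private val cached = mutable.Map[String, String]()
    
      def create(table: String, location: String): Unit =
        locations.update(table, location)
    
      // The first lookup populates the cache; later lookups reuse the cached value.
      def lookup(table: String): String =
        cached.getOrElseUpdate(table, locations(table))
    
      // Changing the location without refreshing leaves a stale cache entry.
      def alterLocation(table: String, location: String, refresh: Boolean): Unit = {
        locations.update(table, location)
        if (refresh) cached.remove(table)
      }
    }
    ```
    
    In this model, altering with `refresh = false` still returns the old location from `lookup`, which mirrors why a refresh (or clearing the cache in the test) is needed after the location changes.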




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100882025
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,127 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a string, b int)
    +              |USING parquet
    +              |OPTIONS(path "file:${dir.getCanonicalPath}")
    --- End diff --
    
    - First, you do not need to add the `file:` prefix.
    - Second, you still need to adjust the indentation.
    
    ```Scala
            spark.sql(
              s"""
                 |CREATE TABLE t(a int, b int, c int, d int)
                 |USING parquet
                 |PARTITIONED BY(a, b)
                 |LOCATION '$dir'
               """.stripMargin)
            val expectedPath = dir.getAbsolutePath.stripSuffix("/")
    ```
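    
    One practical pitfall of a hand-written `file:` prefix can be sketched as follows. This is an illustration only, not necessarily the rationale behind the comment above; it assumes a Unix-like filesystem, and the directory name is hypothetical.
    
    ```Scala
    import java.io.File
    import java.net.URI
    
    object FilePrefixDemo {
      // Hypothetical directory, used only for illustration.
      val dir = new File("/tmp/spark-demo")
    
      // A hand-prefixed string is not an absolute path as far as java.io.File
      // is concerned, so an existence check on it can be misleading.
      val prefixed = new File(s"file:${dir.getPath}")
    
      // Going through java.net.URI recovers the local path.
      val fromUri = new File(new URI("file:/tmp/spark-demo"))
    }
    ```
    
    Here `FilePrefixDemo.prefixed.isAbsolute` is false, while `FilePrefixDemo.fromUri` points at the same path as `dir`.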




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100882612
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
    @@ -1816,4 +1816,127 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a string, b int)
    +              |USING parquet
    +              |OPTIONS(path "file:${dir.getCanonicalPath}")
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        dir.delete
    +        val tableLocFile = new File(table.location.stripPrefix("file:"))
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        Utils.deleteRecursively(dir)
    +        assert(!tableLocFile.exists)
    +        spark.sql("INSERT OVERWRITE TABLE t SELECT 'c', 1")
    +        assert(tableLocFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        val newDir = dir.getAbsolutePath.stripSuffix("/") + "/x"
    +        val newDirFile = new File(newDir)
    +        spark.sql(s"ALTER TABLE t SET LOCATION '$newDir'")
    +        spark.sessionState.catalog.refreshTable(TableIdentifier("t"))
    +
    +        val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        assert(table1.location == newDir)
    +        assert(!newDirFile.exists)
    +
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        assert(newDirFile.exists)
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +      }
    +    }
    +  }
    +
    +  test("insert into a data source table with no existed partition location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a int, b int, c int, d int)
    +              |USING parquet
    +              |PARTITIONED BY(a, b)
    +              |LOCATION "file:${dir.getCanonicalPath}"
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        spark.sql("INSERT INTO TABLE t PARTITION(a=1, b=2) SELECT 3, 4")
    +        checkAnswer(spark.table("t"), Row(3, 4, 1, 2) :: Nil)
    +
    +        val partLoc = new File(s"${dir.getAbsolutePath}/a=1")
    +        Utils.deleteRecursively(partLoc)
    +        assert(!partLoc.exists())
    +        // insert overwrite into a partition which location has been deleted.
    +        spark.sql("INSERT OVERWRITE TABLE t PARTITION(a=1, b=2) SELECT 7, 8")
    +        assert(partLoc.exists())
    +        checkAnswer(spark.table("t"), Row(7, 8, 1, 2) :: Nil)
    +
    +        // TODO:insert into a partition after alter the partition location by alter command
    +      }
    +    }
    +  }
    +
    +  test("read data from a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a string, b int)
    +              |USING parquet
    +              |OPTIONS(path "file:${dir.getAbsolutePath}")
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        dir.delete()
    +        checkAnswer(spark.table("t"), Nil)
    +
    +        val newDir = dir.getAbsolutePath.stripSuffix("/") + "/x"
    +        spark.sql(s"ALTER TABLE t SET LOCATION '$newDir'")
    +
    +        val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        assert(table1.location == newDir)
    +        assert(!new File(newDir).exists())
    +        checkAnswer(spark.table("t"), Nil)
    +      }
    +    }
    +  }
    +
    +  test("read data from a data source table with no existed partition location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""
    +              |CREATE TABLE t(a int, b int, c int, d int)
    +              |USING parquet
    +              |PARTITIONED BY(a, b)
    +              |LOCATION "file:${dir.getCanonicalPath}"
    +           """.stripMargin)
    +        val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    --- End diff --
    
    This is not being used.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    @gatorsmile could you give some suggestions? Thanks very much!




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert/read data to a not exist locati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72759 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72759/testReport)** for PR 16672 at commit [`c3439ff`](https://github.com/apache/spark/commit/c3439ffecfcde7ecc06b6dd40e1d085c433eea94).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72804 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72804/testReport)** for PR 16672 at commit [`334e89f`](https://github.com/apache/spark/commit/334e89fe7258ab6a6773d534bee469cda7cd6d0c).




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    The changes in this PR affect both the read and write paths. Please update the PR description and title. Thanks!




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    **[Test build #72813 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72813/testReport)** for PR 16672 at commit [`b238e8d`](https://github.com/apache/spark/commit/b238e8d34fc08b3f641c610dada582ca3ee2be2b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #16672: [SPARK-19329][SQL]insert/read data to a not exist...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100684204
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1431,4 +1431,30 @@ class HiveDDLSuite
           }
         }
       }
    +
    +  test("insert data to a table which has altered the table location " +
    +    "to an not exist location should success") {
    --- End diff --
    
    Test case names are not accurate after you add new test cases.  Actually, could you split the test cases? 
    
    





[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
     Merged build triggered.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by windpiger <gi...@git.apache.org>.
Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    OK, thanks, it's my pleasure~




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Actually, I found another issue in CTAS with a pre-existing location. Maybe you can take that too? https://issues.apache.org/jira/browse/SPARK-19583




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a table wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72766/
    Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert/read data to a not exist locati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert/read data to a not exist locati...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    The test case could also check the other INSERT mode, INSERT OVERWRITE. Could you also verify the behavior for Hive serde tables?




[GitHub] spark pull request #16672: [SPARK-19329][SQL]insert data to a not exist loca...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r97230092
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -240,7 +240,7 @@ class FindDataSourceTable(sparkSession: SparkSession) extends Rule[LogicalPlan]
                 // TODO: improve `InMemoryCatalog` and remove this limitation.
                 catalogTable = if (withHiveSupport) Some(table) else None)
     
    -        LogicalRelation(dataSource.resolveRelation(), catalogTable = Some(table))
    +        LogicalRelation(dataSource.resolveRelation(false), catalogTable = Some(table))
    --- End diff --
    
    Nit: `dataSource.resolveRelation(false)` -> `dataSource.resolveRelation(checkFilesExist = false)`
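    
    A minimal sketch of why the named form reads better at the call site. The `Source` class below is a hypothetical stand-in that only mimics the shape of the call; it is not Spark's `DataSource`, and its default value and return type are assumptions.
    
    ```Scala
    // Hypothetical stand-in; only the parameter name comes from the nit above.
    final case class Source(path: String) {
      def resolveRelation(checkFilesExist: Boolean = true): String =
        if (checkFilesExist) s"checked:$path" else s"unchecked:$path"
    }
    
    object NamedArgDemo {
      val source = Source("/xx")
    
      // Positional: the reader has to look up what `false` means.
      val positional: String = source.resolveRelation(false)
    
      // Named: the intent is visible at the call site.
      val named: String = source.resolveRelation(checkFilesExist = false)
    }
    ```
    
    Both calls behave identically; only the readability differs.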




[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16672#discussion_r100734735
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
    @@ -1431,4 +1432,133 @@ class HiveDDLSuite
           }
         }
       }
    +
    +  test("insert data to a data source table which has a not existed location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""CREATE TABLE t(a string, b int)
    +              |USING parquet
    +              |OPTIONS(path "file:${dir.getCanonicalPath}")
    +           """.stripMargin)
    +        var table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        dir.delete
    +        assert(!new File(table.location).exists())
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        Utils.deleteRecursively(dir)
    +        assert(!new File(table.location).exists())
    +        spark.sql("INSERT OVERWRITE TABLE t SELECT 'c', 1")
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +
    +        var newDir = dir.getAbsolutePath.stripSuffix("/") + "/x"
    +        spark.sql(s"ALTER TABLE t SET LOCATION '$newDir'")
    +        spark.sessionState.catalog.refreshTable(TableIdentifier("t"))
    +
    +        table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        assert(table.location == newDir)
    +        assert(!new File(newDir).exists())
    +
    +        spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
    +        checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
    +      }
    +    }
    +  }
    +
    +  test("insert into a data source table with no existed partition location should succeed") {
    +    withTable("t") {
    +      withTempDir { dir =>
    +        spark.sql(
    +          s"""CREATE TABLE t(a int, b int, c int, d int)
    +              |USING parquet
    +              |PARTITIONED BY(a, b)
    +              |LOCATION "file:${dir.getCanonicalPath}"
    +           """.stripMargin)
    +        var table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
    +        val expectedPath = s"file:${dir.getAbsolutePath.stripSuffix("/")}"
    +        assert(table.location.stripSuffix("/") == expectedPath)
    +
    +        spark.sql("INSERT INTO TABLE t PARTITION(a=1, b=2) SELECT 3, 4")
    +        checkAnswer(spark.table("t"), Row(3, 4, 1, 2) :: Nil)
    +
    +        val partLoc = new File(s"${dir.getAbsolutePath}/a=1")
    --- End diff --
    
    A general comment about the test cases: can you please check whether the directory exists after the insert? It can help others confirm that the path is correct.
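    
    The before/after check could look roughly like this. It is a sketch without Spark: `writeInto` is a hypothetical stand-in for the `INSERT` statement, and the file name is made up.
    
    ```Scala
    import java.io.File
    import java.nio.file.Files
    
    object DirCheckDemo {
      // Hypothetical stand-in for the INSERT: writing output recreates the directory.
      def writeInto(dir: File): Unit = {
        dir.mkdirs()
        Files.write(new File(dir, "part-00000").toPath, "c,1".getBytes("UTF-8"))
      }
    
      // Returns (existed before the write, exists after the write).
      def run(): (Boolean, Boolean) = {
        val dir = Files.createTempDirectory("tbl").toFile
        dir.delete()                 // the table location no longer exists
        val before = dir.exists
        writeInto(dir)               // in the real test this would be the INSERT
        val after = dir.exists
        // Clean up the temporary files.
        new File(dir, "part-00000").delete()
        dir.delete()
        (before, after)
      }
    }
    ```
    
    In the real test, the two values would be asserted directly around the `spark.sql("INSERT ...")` call.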




[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/16672
  
    It seems to me that following Hive is safer. Any other ideas?

