You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/03/29 14:16:39 UTC

[GitHub] [spark] AngersZhuuuu opened a new pull request #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty

AngersZhuuuu opened a new pull request #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066
 
 
   ### What changes were proposed in this pull request?
   When create MANAGED table, we always use default location, when validate the tableLocation is wrong?
   
   When create External Catalog, we also need to check table location to avoid rewrite data existed in location.
   
   
   ### Why are the changes needed?
   Avoid overwrite other data we needed 
   
   ### Does this PR introduce any user-facing change?
   When create external table,  also need to check location
   
   
   ### How was this patch tested?
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28066: [SPARK-31298][SQL] Create DataSource External Table path also need non-existent or empty

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #28066: [SPARK-31298][SQL] Create DataSource External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066#discussion_r399835785
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ##########
 @@ -328,15 +328,17 @@ class SessionCatalog(
 
   def validateTableLocation(table: CatalogTable): Unit = {
     // SPARK-19724: the default location of a managed table should be non-existent or empty.
-    if (table.tableType == CatalogTableType.MANAGED) {
-      val tableLocation =
-        new Path(table.storage.locationUri.getOrElse(defaultTablePath(table.identifier)))
-      val fs = tableLocation.getFileSystem(hadoopConf)
-
-      if (fs.exists(tableLocation) && fs.listStatus(tableLocation).nonEmpty) {
-        throw new AnalysisException(s"Can not create the managed table('${table.identifier}')" +
-          s". The associated location('${tableLocation.toString}') already exists.")
-      }
+    // SPARK-31298: when create external table also should be a non-existent or empty
 
 Review comment:
   This looks wrong, @AngersZhuuuu . `EXTERNAL TABLE` is used to mount an exiting table location.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on issue #28066: [SPARK-31298][SQL] Create DataSource External Table path also need non-existent or empty

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #28066: [SPARK-31298][SQL] Create DataSource External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066#issuecomment-605680625
 
 
   Please see the definition.
   - https://cwiki.apache.org/confluence/display/Hive/Managed+vs.+External+Tables
   
   I'll close this PR first. We can discussion more at this PR.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066#issuecomment-605643420
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066#issuecomment-605643420
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AngersZhuuuu commented on issue #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on issue #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066#issuecomment-605643295
 
 
   @dongjoon-hyun @wangyum 
   Maybe we should check this part first then do same change in Hive part, seems my pr https://github.com/apache/spark/pull/25290 's code is complex, here we can reuse some code.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on issue #28066: [SPARK-31298][SQL] Create DataSource External Table path also need non-existent or empty

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #28066: [SPARK-31298][SQL] Create DataSource External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066#issuecomment-605680831
 
 
   Although this PR is wrong, you may refocus SPARK-31298 on what is the problem you have. So, I still keep SPARK-31298 open. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun closed pull request #28066: [SPARK-31298][SQL] Create DataSource External Table path also need non-existent or empty

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #28066: [SPARK-31298][SQL] Create DataSource External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #28066: [SPARK-31298][SQL] Create External Table path also need non-existent or empty
URL: https://github.com/apache/spark/pull/28066#issuecomment-605643667
 
 
   Can one of the admins verify this patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org