You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/02/01 23:22:32 UTC

[GitHub] [spark] holdenk opened a new pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

holdenk opened a new pull request #31427:
URL: https://github.com/apache/spark/pull/31427


   ### What changes were proposed in this pull request?
   
   Delegate table name validation to the session catalog 
   
   
   ### Why are the changes needed?
   
   Queerying of tables with nested namespaces.
   
   ### Does this PR introduce _any_ user-facing change?
   
   SQL queries of nested namespace queries
   
   ### How was this patch tested?
   
   Unit tests updated.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-777141815


   Please don't get me wrong. This PR doesn't take any approval from anyone including contributors before merging. I'm also pretty sure most committees folks wait committer's approval. We look to want to avoid picking a specific length of time but I don't think it means a week - I initiated such discussion because my PRs were holding for 6+ months and these PRs also reviewed and approved by contributors who works on such area. This PR is a different case.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771248088


   **[Test build #134749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134749/testReport)** for PR 31427 at commit [`23964ff`](https://github.com/apache/spark/commit/23964ff1b4bdae26f4e5617322316197a388ae2c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] holdenk commented on a change in pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
holdenk commented on a change in pull request #31427:
URL: https://github.com/apache/spark/pull/31427#discussion_r570428856



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##########
@@ -2100,8 +2100,7 @@ class DataSourceV2SQLSuite
     val e1 = intercept[AnalysisException] {
       sql("DESCRIBE FUNCTION default.ns1.ns2.fun")
     }
-    assert(e1.message.contains(
-      "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun"))
+    assert(e1.message.contains("Unsupported function name 'default.ns1.ns2.fun'"))

Review comment:
       So the reason isn't that the namespace in session catalog must have exactly one name part anymore, so I think it's an OK change.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771248088


   **[Test build #134749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134749/testReport)** for PR 31427 at commit [`23964ff`](https://github.com/apache/spark/commit/23964ff1b4bdae26f4e5617322316197a388ae2c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771263775


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39335/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776580168


   I tend to agree that limiting the namespace only on session catalog isn't good and I really would like to see bright solution which addresses this, but the new solution shouldn't conflict with the existing V1 space (table, view, temp view) which has no concept of "custom" catalog.
   
   Btw, looks like we are still not clear about the "role" of session catalog. Is it for extending V1 space, or make V2 catalog simply used by default? I see both claims from previous discussion and existing implementation out there. If it's limited to V1, limiting the namespace is natural as there's no "custom" catalog. If not, we'd better elaborate the problem(s) we have by limiting the namespace, and discuss what would be the ideal/realistic approach for Spark to deal with.
   (I can roughly imagine that that's from an use case that table in other catalog should be accessible in session catalog.)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #31427:
URL: https://github.com/apache/spark/pull/31427#discussion_r568251053



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##########
@@ -2100,8 +2100,7 @@ class DataSourceV2SQLSuite
     val e1 = intercept[AnalysisException] {
       sql("DESCRIBE FUNCTION default.ns1.ns2.fun")
     }
-    assert(e1.message.contains(
-      "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun"))
+    assert(e1.message.contains("Unsupported function name 'default.ns1.ns2.fun'"))

Review comment:
       nit: Is it okay to remove the information about why a given name is unsupported?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776562258


   I'm not feeling good with this too, but not from the code diff, but from the process.
   
   I understand lazy consensus on code is applied by default ASF policy and Spark project doesn't redefine the policy (by BYLAWS), but nowadays everyone (not only from Spark project but also from majority of projects) is struggling to make sure the PR has at least one approval from committer. That has been considered as an implicit policy.
   
   I'm not claiming that we should construct BYLAWS to make sure binding +1 is minimal requirement on code change, but I'd say lazy consensus must be considered as a last resort (similar weight on veto), like after suffering from finding reviewers for months even trying to get traction from dev@ mailing list.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-777141815


   Please don't get me wrong. This PR doesn't take any approval from anyone including contributors before merging. I'm also pretty sure most committees folks wait committer's approval. We look to want to avoid picking a specific length of time but I don't think it means a week - I initiated such discussion because my PRs were holding for 6+ months and these PRs also reviewed by contributors who works on such area. This PR is a different case.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771359661


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134749/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31427:
URL: https://github.com/apache/spark/pull/31427#discussion_r573453436



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##########
@@ -2216,27 +2213,46 @@ class DataSourceV2SQLSuite
         sql("CREATE TABLE t USING json AS SELECT 1 AS i")
 
         val t = "spark_catalog.t"
+
         def verify(sql: String): Unit = {
           val e = intercept[AnalysisException](spark.sql(sql))
-          assert(e.message.contains(
-            s"The namespace in session catalog must have exactly one name part: $t"))
+          assert(e.message.contains(s"Table or view not found: $t"),

Review comment:
       The error message is confusing now because there is a table `t` and Spark still complains that `Table or view not found`. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #31427:
URL: https://github.com/apache/spark/pull/31427#discussion_r568251053



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##########
@@ -2100,8 +2100,7 @@ class DataSourceV2SQLSuite
     val e1 = intercept[AnalysisException] {
       sql("DESCRIBE FUNCTION default.ns1.ns2.fun")
     }
-    assert(e1.message.contains(
-      "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun"))
+    assert(e1.message.contains("Unsupported function name 'default.ns1.ns2.fun'"))

Review comment:
       nit: Is it okay to remove the information about why a given name is unsupported?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776562258


   I'm not feeling good with this too, but not from the code diff, but from the process.
   
   I understand lazy consensus on code is applied by default ASF policy (if I'm not mistaken) and Spark project doesn't redefine the policy (by BYLAWS), but nowadays everyone (not only from Spark project but also from majority of projects) is struggling to make sure the PR has at least one approval from committer. That has been considered as an implicit policy.
   
   I'm not claiming that we should construct BYLAWS to make sure binding +1 is minimal requirement on code change, but I'd say lazy consensus must be considered as a last resort (similar weight on veto), like after suffering from finding reviewers for months even trying to get traction from dev@ mailing list.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776457314


   Hey, I am still lost why we needed to change this. This is a regression in the error message, but I am not clear what's the gain here @dongjoon-hyun too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771270713


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39335/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771270740






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771270740


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39335/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771350989


   **[Test build #134749 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134749/testReport)** for PR 31427 at commit [`23964ff`](https://github.com/apache/spark/commit/23964ff1b4bdae26f4e5617322316197a388ae2c).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771248088






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] holdenk commented on a change in pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
holdenk commented on a change in pull request #31427:
URL: https://github.com/apache/spark/pull/31427#discussion_r570428856



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##########
@@ -2100,8 +2100,7 @@ class DataSourceV2SQLSuite
     val e1 = intercept[AnalysisException] {
       sql("DESCRIBE FUNCTION default.ns1.ns2.fun")
     }
-    assert(e1.message.contains(
-      "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun"))
+    assert(e1.message.contains("Unsupported function name 'default.ns1.ns2.fun'"))

Review comment:
       So the reason isn't that the namespace in session catalog must have exactly one name part anymore, so I think it's an OK change.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776580168


   I tend to agree that limiting the namespace only on session catalog isn't good and I really would like to see bright solution which addresses this, but the new solution shouldn't conflict with the existing V1 space (table, view, temp view) which has no concept of "custom" catalog.
   
   Btw, looks like we are still not clear about the "role" of session catalog. Is it for extending V1 space, or make V2 catalog simply used by default? I see both claims from previous discussion and existing implementation out there. If it's limited to V1, limiting the namespace is natural as there's no catalog. If not, we'd better elaborate the problem(s) we have by limiting the namespace, and discuss what would be the ideal/realistic approach for Spark to deal with.
   (I can roughly imagine that that's from an use case that table in other catalog should be accessible in session catalog.)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776449191


   @holdenk Can we make sure PRs are merged with at least one approval from committers? And also please enrich the PR description a bit more: I don't see where the delegation happens. This PR simply removes the name check in `SessionCatalogAndIdentifier`.
   
   I'm not going to revert it as the behavior change seems only to happen in the error message. But I think we should explain clearly how we handle invalid identifiers now. There are two different errors: `Unsupported function name ...` and `Table or view not found ...`, and I'm curious about what leads to this difference.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776580168


   I tend to agree that limiting the namespace only on session catalog isn't good and I really would like to see bright solution which addresses this, but the new solution shouldn't conflict with the existing V1 space (table, view, temp view) which has no concept of catalog.
   
   Btw, looks like we are still not clear about the "role" of session catalog. Is it for extending V1 space, or make V2 catalog simply used by default? I see both claims from previous discussion and existing implementation out there. If it's limited to V1, limiting the namespace is natural as there's no catalog. If not, we'd better elaborate the problem(s) we have by limiting the namespace, and discuss what would be the ideal/realistic approach for Spark to deal with.
   (I can roughly imagine that that's from an use case that table in other catalog should be accessible in session catalog.)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-777141815


   Please don't get me wrong. This PR doesn't take any approval from anyone including contributors before merging. I'm also pretty sure most committees folks wait committer's approval. We seem to want to avoid picking a specific length of time but I don't think it means a week - I initiated such discussion because my PRs were holding for 6+ months and these PRs also reviewed and approved by contributors who works on such area. This PR is a different case.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] asfgit closed pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #31427:
URL: https://github.com/apache/spark/pull/31427


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] holdenk commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
holdenk commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776772987


   So I'm pretty sure most committees folks don't wait months for reviews. In the previous discussion that we had on the dev@ list we avoided picking a specific length of time but this exceeds a week so I think it's well within reason. If we want to change the process that's totally reasonable, and we can bring that discussion to dev@, I don't have strong feelings one way or the other.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776580168


   I tend to agree that limiting the namespace only on session catalog isn't good and I really would like to see bright solution which addresses this, but the new solution shouldn't conflict with the existing V1 space (table, view, temp view) which has no concept of catalog.
   
   And looks like we are still not clear about the "role" of session catalog. Is it for extending V1 space, or make V2 catalog simply used by default? I see both claims from previous discussion and existing implementation out there. If it's limited to V1, limiting the namespace is natural as there's no catalog. If not, we'd better elaborate the problem(s) we have by limiting the namespace, and discuss what would be the ideal/realistic approach for Spark to deal with.
   (I can roughly imagine that that's from an use case that table in other catalog should be accessible in session catalog.)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776244069


   @maropu was your comment addressed properly? It would have been great if we had a review at least from the author or reviewer from the previous PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771270740






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #31427:
URL: https://github.com/apache/spark/pull/31427#discussion_r573248606



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##########
@@ -2100,8 +2100,7 @@ class DataSourceV2SQLSuite
     val e1 = intercept[AnalysisException] {
       sql("DESCRIBE FUNCTION default.ns1.ns2.fun")
     }
-    assert(e1.message.contains(
-      "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun"))
+    assert(e1.message.contains("Unsupported function name 'default.ns1.ns2.fun'"))

Review comment:
       Do you mean now we have the allowed case? Do we have a test case when it works if so?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-776367549


   Ah, yea. I didn't notice the reply.. I re-checked the code and the change looks reasonable. late lgtm. Thanks, @holdenk @HyukjinKwon 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771270740






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31427:
URL: https://github.com/apache/spark/pull/31427#issuecomment-771248088


   **[Test build #134749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134749/testReport)** for PR 31427 at commit [`23964ff`](https://github.com/apache/spark/commit/23964ff1b4bdae26f4e5617322316197a388ae2c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org