Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/02/22 16:54:49 UTC

[GitHub] [spark] tedyu opened a new pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

tedyu opened a new pull request #31613:
URL: https://github.com/apache/spark/pull/31613


   ### What changes were proposed in this pull request?
   exprId is appended to each reference for ambiguousReferences.
   
   ### Why are the changes needed?
   In the current AnalysisException for ambiguousReferences, the candidate references can render as identical strings, so the message does not tell them apart.
   ```
   org.apache.spark.sql.AnalysisException: Reference 'phone->'key'->1->'m'->2->>'b'' is ambiguous, could be: mycatalog.test.person.phone->'key'->1->'m'->2->>'b', mycatalog.test.person.phone->'key'->1->'m'->2->>'b'.; line 1 pos 8
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Verified via local build where exprId is observed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784585920


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135385/
   




[GitHub] [spark] viirya commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581515619



##########
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/join.sql.out
##########
@@ -3225,7 +3225,7 @@ select * from
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-Reference 'f1' is ambiguous, could be: j.f1, j.f1.; line 2 pos 63
+Reference 'f1' is ambiguous, could be: j.f1#x, j.f1#x.; line 2 pos 63

Review comment:
       Does https://github.com/apache/spark/pull/31613/files#r581342459 answer @cloud-fan's question?
   
   So with the attr ID added, how does it help in the case you show? I think that is the point we care about.
   
   BTW, what is `get_json_string`? Do you mean `get_json_object`?
   
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783718360


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135356/
   




[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784485877


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39965/
   




[GitHub] [spark] tedyu commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581566633



##########
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/join.sql.out
##########
@@ -3225,7 +3225,7 @@ select * from
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-Reference 'f1' is ambiguous, could be: j.f1, j.f1.; line 2 pos 63
+Reference 'f1' is ambiguous, could be: j.f1#x, j.f1#x.; line 2 pos 63

Review comment:
       Consider this snippet of the physical plan:
   ```
         +- BatchScan[id#6, address#7, phone->'key'->1->'m'->2->>'b'#10, phone->'key'->1->'m'->2->'b'#12] Cassandra Scan: test.person
   ```
   Each of the multiple JSON path expressions is accompanied by its ExprId.id, so it would be easier to match the reference (with exprId.id) given in the AnalysisException against the expression.
   
   As for get_json_string, it is a function interpreted by a Spark extension, which translates its arguments into a JSON path expression.
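
The matching step described above can be sketched as a lookup from exprId to expression text over one plan line. This is only an illustration: `PlanAttrLookup` and `outputAttrs` are hypothetical names, not part of Spark, and the parsing assumes the `name#id` rendering and the `, ` separator seen in the snippet (an expression whose text itself contains `, ` would be split incorrectly).

```scala
object PlanAttrLookup {
  // Matches "<expression text>#<numeric exprId>". Regex pattern matching is
  // anchored, and the greedy prefix means the last '#' separates the id.
  private val AttrPattern = """(.+)#(\d+)""".r

  // Extracts (exprId -> expression text) from the bracketed output list of a
  // single plan line. Hypothetical helper for illustration only.
  def outputAttrs(planLine: String): Map[Long, String] = {
    val start = planLine.indexOf('[')
    val end = planLine.lastIndexOf(']')
    if (start < 0 || end <= start) {
      Map.empty
    } else {
      planLine.substring(start + 1, end).split(", ").toSeq.collect {
        case AttrPattern(expr, id) => id.toLong -> expr
      }.toMap
    }
  }
}
```

With such a map, an id reported in the error message (for example `10`) could be looked up directly to recover the expression it refers to.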






[GitHub] [spark] tedyu edited a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu edited a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784425256


   @cloud-fan @imback82 
   I added the link to the Java test code in Yugabyte DB to the description.
   Since the test code uses a Spark extension, I am not sure how to trigger the same situation without one.
   
   I have updated relevant out files under sql/core/src/test/resources/sql-tests/results/postgreSQL which show the issue.




[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783562009


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39935/
   




[GitHub] [spark] imback82 commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
imback82 commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581282942



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala
##########
@@ -359,7 +359,18 @@ package object expressions  {
 
         case ambiguousReferences =>
           // More than one match.
-          val referenceNames = ambiguousReferences.map(_.qualifiedName).mkString(", ")
+          var referenceNames = ""
+          if (ambiguousReferences.map(_.qualifiedName).toSet.size == 1) {

Review comment:
       Can you add a test?
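
As a rough illustration of what such a test could exercise, here is a minimal, Spark-free sketch of the branch shown in the hunk. `Attr` and `AmbiguousRefs.referenceNames` are hypothetical stand-ins, not Catalyst classes. Because the hunk is truncated after the `if`, both the `#<exprId>` suffix (inferred from the `j.f1#x` golden-file output) and the fallback to plain qualified names are assumptions.

```scala
// Hypothetical stand-in for a resolved attribute; not Spark's real class.
final case class Attr(qualifier: Seq[String], name: String, exprId: Long) {
  def qualifiedName: String = (qualifier :+ name).mkString(".")
}

object AmbiguousRefs {
  // If every candidate renders to the same qualified name, append "#<exprId>"
  // so the candidates can be told apart; otherwise keep the plain names
  // (assumed fallback, since the original hunk is cut off).
  def referenceNames(candidates: Seq[Attr]): String = {
    val names = candidates.map(_.qualifiedName)
    if (names.toSet.size == 1) {
      candidates.map(a => s"${a.qualifiedName}#${a.exprId}").mkString(", ")
    } else {
      names.mkString(", ")
    }
  }
}
```

A test along these lines would assert that two candidates sharing the name `j.f1` are rendered with distinct suffixes, while candidates with different qualifiers are rendered unchanged.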






[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783687592


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39937/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581326404



##########
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/join.sql.out
##########
@@ -3225,7 +3225,7 @@ select * from
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-Reference 'f1' is ambiguous, could be: j.f1, j.f1.; line 2 pos 63
+Reference 'f1' is ambiguous, could be: j.f1#x, j.f1#x.; line 2 pos 63

Review comment:
       Adding the attr ID doesn't seem to help much. Users still don't know how to fix the query (is it un-fixable?).






[GitHub] [spark] imback82 commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
imback82 commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784411264


   cc @cloud-fan @viirya 




[GitHub] [spark] tedyu removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783611882


   @maropu 
   Can you take a look?
   
   Thanks




[GitHub] [spark] AmplabJenkins commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783687612


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39937/
   




[GitHub] [spark] tedyu commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581308665



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala
##########
@@ -359,7 +359,18 @@ package object expressions  {
 
         case ambiguousReferences =>
           // More than one match.
-          val referenceNames = ambiguousReferences.map(_.qualifiedName).mkString(", ")
+          var referenceNames = ""
+          if (ambiguousReferences.map(_.qualifiedName).toSet.size == 1) {

Review comment:
       I have updated relevant out files under sql/core/src/test/resources/sql-tests/results/postgreSQL






[GitHub] [spark] imback82 commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
imback82 commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784411056


   @tedyu Could you update the `Does this PR introduce any user-facing change?` section of the PR description with the output after this PR?




[GitHub] [spark] tedyu commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784425256


   @cloud-fan @imback82 
   I added the link to the Java test code in Yugabyte DB to the description.
   Since the test code uses a Spark extension, I am not sure how to trigger the same situation without one.




[GitHub] [spark] maropu commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581473182



##########
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/join.sql.out
##########
@@ -3225,7 +3225,7 @@ select * from
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-Reference 'f1' is ambiguous, could be: j.f1, j.f1.; line 2 pos 63
+Reference 'f1' is ambiguous, could be: j.f1#x, j.f1#x.; line 2 pos 63

Review comment:
       I have the same feeling as @cloud-fan. I think other databases show a message with the same granularity in this case, e.g.,
   ```
   postgres=# create table t1 (id int);
   CREATE TABLE
   postgres=# create table t2 (id int);
   CREATE TABLE
   postgres=# select * from t1, t2 where id = id;
   ERROR:  column reference "id" is ambiguous
   LINE 1: select * from t1, t2 where id = id;
   ```






[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783629512


   **[Test build #135352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135352/testReport)** for PR 31613 at commit [`254da04`](https://github.com/apache/spark/commit/254da048fc12861ed8455bc55ddec3a12161b998).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783609167








[GitHub] [spark] AmplabJenkins removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783687612


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39937/
   




[GitHub] [spark] tedyu commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783611882


   @maropu 
   Can you take a look?
   
   Thanks




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783646666


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135352/
   




[GitHub] [spark] cloud-fan commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784417427


   Can we use a simple example to demonstrate the change, instead of `phone->'key'->1->'m'->2->>'b'`?




[GitHub] [spark] tedyu commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784395396


   @imback82 
   Can you take a look?




[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783661051


   **[Test build #135356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135356/testReport)** for PR 31613 at commit [`43e2709`](https://github.com/apache/spark/commit/43e270948efce532ae8288589d8849482cbabc99).




[GitHub] [spark] AmplabJenkins commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783718360


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135356/
   




[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784449610


   **[Test build #135385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135385/testReport)** for PR 31613 at commit [`e51a428`](https://github.com/apache/spark/commit/e51a428ef0ee77c91f8cd07d02fcf08a03a29f69).




[GitHub] [spark] SparkQA removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783661051


   **[Test build #135356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135356/testReport)** for PR 31613 at commit [`43e2709`](https://github.com/apache/spark/commit/43e270948efce532ae8288589d8849482cbabc99).




[GitHub] [spark] imback82 commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
imback82 commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581287156



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala
##########
@@ -359,7 +359,18 @@ package object expressions  {
 
         case ambiguousReferences =>
           // More than one match.
-          val referenceNames = ambiguousReferences.map(_.qualifiedName).mkString(", ")
+          var referenceNames = ""
+          if (ambiguousReferences.map(_.qualifiedName).toSet.size == 1) {

Review comment:
       Also, `SQLQueryTestSuite` is failing, so you may want to check it.
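
The conditional in the diff hunk above can be sketched outside Spark roughly as follows. This is a minimal illustration, not the patch itself: `Attr`, `AmbiguityMessage`, and `referenceNames` are hypothetical stand-ins, and the `name#id` rendering is an assumption based on the expected output quoted elsewhere in this thread.

```scala
object AmbiguityMessage {
  // Hypothetical stand-in for a Catalyst attribute: a qualified name plus a numeric id.
  final case class Attr(qualifiedName: String, exprId: Long)

  // Build the "could be: ..." list. The id is appended only when every
  // candidate has the same qualified name, so names alone cannot tell them apart.
  def referenceNames(candidates: Seq[Attr]): String = {
    val names = candidates.map(_.qualifiedName)
    if (names.distinct.size == 1) {
      candidates.map(a => s"${a.qualifiedName}#${a.exprId}").mkString(", ")
    } else {
      names.mkString(", ")
    }
  }
}
```

With this shape, messages for candidates that already differ by qualifier (for example `x.b` and `y.b`) would stay unchanged, which limits how many expected-output files need updating.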






[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783671243


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39937/
   




[GitHub] [spark] tedyu closed pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu closed pull request #31613:
URL: https://github.com/apache/spark/pull/31613


   




[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783718202


   **[Test build #135356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135356/testReport)** for PR 31613 at commit [`43e2709`](https://github.com/apache/spark/commit/43e270948efce532ae8288589d8849482cbabc99).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784449610


   **[Test build #135385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135385/testReport)** for PR 31613 at commit [`e51a428`](https://github.com/apache/spark/commit/e51a428ef0ee77c91f8cd07d02fcf08a03a29f69).




[GitHub] [spark] AmplabJenkins commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784520918


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39965/
   




[GitHub] [spark] tedyu commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581566633



##########
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/join.sql.out
##########
@@ -3225,7 +3225,7 @@ select * from
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-Reference 'f1' is ambiguous, could be: j.f1, j.f1.; line 2 pos 63
+Reference 'f1' is ambiguous, could be: j.f1#x, j.f1#x.; line 2 pos 63

Review comment:
       Consider this snippet of the physical plan:
   ```
         +- BatchScan[id#6, address#7, phone->'key'->1->'m'->2->>'b'#10, phone->'key'->1->'m'->2->'b'#12] Cassandra Scan: test.person
   ```
   Multiple JSON path expressions can appear in a single scan. Including exprId.id in the message makes it easier to match the reference reported in the AnalysisException with the corresponding expression.
   
   As for get_json_string, it is a function interpreted by a Spark extension, which translates its arguments into a JSON path expression.






[GitHub] [spark] SparkQA removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783547839


   **[Test build #135352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135352/testReport)** for PR 31613 at commit [`254da04`](https://github.com/apache/spark/commit/254da048fc12861ed8455bc55ddec3a12161b998).




[GitHub] [spark] tedyu commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581479534



##########
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/join.sql.out
##########
@@ -3225,7 +3225,7 @@ select * from
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-Reference 'f1' is ambiguous, could be: j.f1, j.f1.; line 2 pos 63
+Reference 'f1' is ambiguous, could be: j.f1#x, j.f1#x.; line 2 pos 63

Review comment:
       The PostgreSQL example above doesn't apply to the JSON path case, because that case involves only one table.
   






[GitHub] [spark] tedyu commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783618313


   ```
   19:20:18.103 ERROR org.apache.spark.sql.SQLQueryTestSuite: Error using configs:
   [info] - udf/postgreSQL/udf-select_implicit.sql - Regular Python UDF *** FAILED *** (11 seconds, 22 milliseconds)
   [info]   udf/postgreSQL/udf-select_implicit.sql - Regular Python UDF
   [info]   Python: 3.7
   [info]   Expected "...guous, could be: x.b[, y.b].; line 3 pos 14", but got "...guous, could be: x.b[#ExprId(196786,0a40143b-4799-41c2-81db-91aee2983997), y.b#ExprId(196909,0a40143b-4799-41c2-81db-91aee2983997)].; line 3 pos 14" Result did not match for query #21
   ```
   Since the JVM id varies from test run to test run, I will show only the numeric id after the '#' sign.
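
As a rough sketch of why the full `ExprId(...)` rendering breaks the expected-output comparison while a bare numeric id can be made stable: the case class below only mirrors the two fields visible in the failing output, and `render` and `maskIds` are hypothetical helpers. Masking digits after `#` to `#x` is an assumption inferred from the `j.f1#x` form in the expected-output diff quoted in this thread.

```scala
import java.util.UUID

object ExprIdRendering {
  // Hypothetical stand-in: a numeric id plus a per-JVM UUID, as in the failing output.
  final case class ExprId(id: Long, jvmId: UUID)

  // Render only the numeric part; the per-JVM UUID is dropped.
  def render(name: String, exprId: ExprId): String = s"$name#${exprId.id}"

  // Mask numeric ids so that expected output does not depend on the run
  // (assumed normalization, see the note above).
  def maskIds(message: String): String = message.replaceAll("#\\d+", "#x")
}
```

The UUID part cannot be masked by a digits-only pattern, which is why rendering only the numeric id keeps the message comparable across runs.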




[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784504712


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39965/
   




[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784577991


   **[Test build #135385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135385/testReport)** for PR 31613 at commit [`e51a428`](https://github.com/apache/spark/commit/e51a428ef0ee77c91f8cd07d02fcf08a03a29f69).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] tedyu commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581342459



##########
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/join.sql.out
##########
@@ -3225,7 +3225,7 @@ select * from
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-Reference 'f1' is ambiguous, could be: j.f1, j.f1.; line 2 pos 63
+Reference 'f1' is ambiguous, could be: j.f1#x, j.f1#x.; line 2 pos 63

Review comment:
       In the JSON path expression case, one expression comes from the filter and the other from the projection.
   ```
         String query = "SELECT id, address, get_json_string(phone, '$.key[1].m[2].b') as key " +
                       "FROM mycatalog.test.person " +
                       "WHERE get_json_string(phone, '$.key[1].m[2].b') >= '100' order by id limit 2";
   ```
   get_json_string produces the json path expression.
   
   See innerResolve() of sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
   ```
   2021-02-21 04:04:00,467 (Time-limited test) [DEBUG - org.apache.spark.internal.Logging.logDebug(Logging.scala:61)] inner Resolving 'phone->'key'->1->'m'->2->>'b' to phone->'key'->1->'m'->2->>'b'#25
   ```
   I would welcome input from people who are familiar with the analyzer.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783609167








[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783582065


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39935/
   




[GitHub] [spark] tedyu commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
tedyu commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-785482922


   Thanks for the comments.
   
   Looks like the change itself leaves something to be desired.
   
   Closing.




[GitHub] [spark] AmplabJenkins commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783646666


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135352/
   




[GitHub] [spark] SparkQA commented on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-783547839


   **[Test build #135352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135352/testReport)** for PR 31613 at commit [`254da04`](https://github.com/apache/spark/commit/254da048fc12861ed8455bc55ddec3a12161b998).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784585920


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135385/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31613:
URL: https://github.com/apache/spark/pull/31613#issuecomment-784520918


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39965/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #31613: [SPARK-34476][SQL] Duplicate referenceNames are given for ambiguousReferences

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31613:
URL: https://github.com/apache/spark/pull/31613#discussion_r581633880



##########
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/join.sql.out
##########
@@ -3225,7 +3225,7 @@ select * from
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-Reference 'f1' is ambiguous, could be: j.f1, j.f1.; line 2 pos 63
+Reference 'f1' is ambiguous, could be: j.f1#x, j.f1#x.; line 2 pos 63

Review comment:
       It doesn't help because end-users can't specify attr id when referring to the column. If a table/relation has duplicated column names, I think the only way out is to get the column by position, e.g. `df.select(Column(df.logicalPlan.output(2)))`, and attr id doesn't matter.
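
The point about position versus name can be illustrated with a toy model that does not depend on Spark. Everything here (`Attr`, `byName`, `byPosition`) is hypothetical and only mimics the behaviour described in the comment above: lookup by name fails when names are duplicated, while lookup by position stays unambiguous and never consults the id.

```scala
object PositionalLookup {
  // Hypothetical output attribute: a column name plus an internal id.
  final case class Attr(name: String, exprId: Long)

  // Lookup by name: succeeds only when exactly one attribute matches.
  def byName(output: Seq[Attr], name: String): Either[String, Attr] =
    output.filter(_.name == name) match {
      case Seq(single) => Right(single)
      case Seq()       => Left(s"cannot resolve '$name'")
      case many =>
        Left(s"Reference '$name' is ambiguous, could be: ${many.map(_.name).mkString(", ")}.")
    }

  // Lookup by position: always unambiguous; None if the index is out of range.
  def byPosition(output: Seq[Attr], index: Int): Option[Attr] = output.lift(index)
}
```

In this model the id is never something the caller types, which matches the observation that printing it in the error message gives end users nothing they can act on.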



