Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/02/22 12:07:51 UTC

[GitHub] [spark] peter-toth opened a new pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing

peter-toth opened a new pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675
 
 
   ### What changes were proposed in this pull request?
   This PR fixes a bug in nested column aliasing by taking into account the data type of the referenced nested field to calculate the number of extracted columns. After this PR this query runs without issues:
   ```
   SELECT explodedvalue.*
   FROM VALUES array(named_struct('nested', named_struct('a', 1, 'b', 2))) AS (value)
   LATERAL VIEW explode(value) AS explodedvalue
   ```
   This is a regression from Spark 2.4.
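   
   As a rough illustration of the counting idea (a toy model, not the code in this PR): the types and the helper names `leafCount` and `worthAliasing` below are hypothetical and are not Catalyst's classes. The sketch counts the leaf fields inside each referenced nested field, rather than counting every reference as a single column, so a reference that already covers the whole structure is not mistaken for a pruning opportunity.
   ```scala
   // Toy stand-ins for data types; these are NOT Spark's own classes.
   sealed trait DataType
   case object IntType extends DataType
   final case class ArrayType(element: DataType) extends DataType
   final case class StructType(fields: List[(String, DataType)]) extends DataType

   object NestedFieldCount {
     // Number of leaf (non-struct) fields reachable inside a type.
     def leafCount(dt: DataType): Int = dt match {
       case StructType(fields) => fields.map { case (_, t) => leafCount(t) }.sum
       case ArrayType(element) => leafCount(element)
       case _                  => 1
     }

     // Assumption of this sketch: aliasing is only worthwhile when the
     // referenced nested fields cover strictly fewer leaves than the
     // whole attribute holds.
     def worthAliasing(attribute: DataType, referenced: Seq[DataType]): Boolean =
       referenced.map(leafCount).sum < leafCount(attribute)
   }
   ```
   In the query above, `explodedvalue.*` refers to `nested`, a struct with the two leaves `a` and `b`, and the whole `explodedvalue` attribute also holds exactly two leaves. Under this model `worthAliasing` returns false, whereas counting the reference as one column would wrongly suggest there is something to prune.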
   
   ### Why are the changes needed?
   To fix the nested column aliasing bug described above, which is a regression from Spark 2.4.
   
   ### Does this PR introduce any user-facing change?
   No.
   
   ### How was this patch tested?
   Added a new unit test.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#discussion_r383460910
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   @dongjoon-hyun hmm, does this test have anything to do with limit push down? There is no `LimitPushDown` in the optimizer of this suite: https://github.com/apache/spark/blob/5a51b9472f5dfbf99ef6f8d6c7151618157c446e/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala#L34-L39 and actually `limit` is closer to the relation in the original `query` than in `expected`, but I might be wrong.



[GitHub] [spark] SparkQA commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590325422
 
 
   **[Test build #118869 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118869/testReport)** for PR 27675 at commit [`e4c9009`](https://github.com/apache/spark/commit/e4c900979baffd4f04f3d953b3f0b7ec3c387b2f).



[GitHub] [spark] SparkQA commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-590008439
 
 
   **[Test build #118819 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118819/testReport)** for PR 27675 at commit [`5a51b94`](https://github.com/apache/spark/commit/5a51b9472f5dfbf99ef6f8d6c7151618157c446e).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589989062
 
 
   **[Test build #118819 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118819/testReport)** for PR 27675 at commit [`5a51b94`](https://github.com/apache/spark/commit/5a51b9472f5dfbf99ef6f8d6c7151618157c446e).



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-590008580
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589950445
 
 
   **[Test build #118817 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118817/testReport)** for PR 27675 at commit [`b09e19b`](https://github.com/apache/spark/commit/b09e19b6711b1d84774c4f6f22c51ce640ef2f72).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589950545
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23567/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#issuecomment-590313394
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589967940
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589967876
 
 
   **[Test build #118817 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118817/testReport)** for PR 27675 at commit [`b09e19b`](https://github.com/apache/spark/commit/b09e19b6711b1d84774c4f6f22c51ce640ef2f72).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-590008582
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118819/
   Test PASSed.



[GitHub] [spark] dongjoon-hyun closed pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675
 
 
   



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589950545
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23567/
   Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#discussion_r383473150
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   `[SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition` is about that.



[GitHub] [spark] peter-toth commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
peter-toth commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-591279917
 
 
   > @peter-toth @dongjoon-hyun Can we backport this to 2.4?
   
   @gatorsmile only Spark 3 is affected.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589967941
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118817/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590468776
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118869/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589989182
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23569/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589989176
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590448395
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#discussion_r383478996
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   Yes, it is. In this test case there is no point in pushing down the `project` over the `limit`.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590468770
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#issuecomment-590313404
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23617/
   Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#discussion_r383456121
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   But the test name is `Some nested column means the whole structure`?



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590448395
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590312805
 
 
   **[Test build #118868 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118868/testReport)** for PR 27675 at commit [`6a6ea0d`](https://github.com/apache/spark/commit/6a6ea0d3296e4f999cadd321fdb929935224e845).



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#discussion_r383459248
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   Yes. It might be misleading now, but it was `true` in the original context.
   - https://github.com/apache/spark/pull/23964



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590325969
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23618/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590325969
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23618/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589950543
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590325422
 
 
   **[Test build #118869 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118869/testReport)** for PR 27675 at commit [`e4c9009`](https://github.com/apache/spark/commit/e4c900979baffd4f04f3d953b3f0b7ec3c387b2f).



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589967940
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#discussion_r382934390
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   Since we need the whole structure, why did we expect the local relation to be column pruned?



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-590008582
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118819/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589989062
 
 
   **[Test build #118819 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118819/testReport)** for PR 27675 at commit [`5a51b94`](https://github.com/apache/spark/commit/5a51b9472f5dfbf99ef6f8d6c7151618157c446e).



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#issuecomment-590313404
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23617/
   Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-590053149
 
 
   Thank you, @peter-toth .



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#issuecomment-590313394
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589950543
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] gatorsmile commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
gatorsmile commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-591219213
 
 
   @peter-toth @dongjoon-hyun Could we backport this to 2.4? 



[GitHub] [spark] peter-toth commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
peter-toth commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-590221472
 
 
   Thanks for your review, I will try to address your comments today.



[GitHub] [spark] peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#discussion_r383257527
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
 ##########
 @@ -3393,6 +3393,16 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
       )
     }
   }
+
+  test("SPARK-30870: Fix nested column aliasing") {
 
 Review comment:
   I've changed the name of the test and the PR. 



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589967941
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118817/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590448414
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118868/
   Test PASSed.



[GitHub] [spark] maropu commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#discussion_r382969363
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
 ##########
 @@ -129,7 +129,9 @@ object NestedColumnAliasing {
         // If all nested fields of `attr` are used, we don't need to introduce new aliases.
         // By default, ColumnPruning rule uses `attr` already.
         if (nestedFieldToAlias.nonEmpty &&
-            nestedFieldToAlias.length < totalFieldNum(attr.dataType)) {
+            nestedFieldToAlias
+              .map { case (nestedField, _) => totalFieldNum(nestedField.dataType) }
+              .sum < totalFieldNum(attr.dataType)) {
 
 Review comment:
   Ur, I see. nice catch.
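
The effect of the changed condition can be illustrated with a minimal, self-contained sketch. It does not use Catalyst: `DataType`, `IntType`, `ArrayType` and `StructType` below are simplified stand-ins, the `totalFieldNum` here only approximates the real helper, and `shouldAliasOld` / `shouldAliasNew` are hypothetical names for the condition before and after the diff above. The example assumes an attribute of type `struct<nested: struct<a: int, b: int>>`, as in the query from the PR description.

```scala
// Simplified stand-ins for Spark's data types (hypothetical, not the Catalyst classes).
sealed trait DataType
case object IntType extends DataType
final case class ArrayType(element: DataType) extends DataType
final case class StructType(fields: List[(String, DataType)]) extends DataType

object FieldCounting {
  // Counts the leaf fields reachable from a type.
  def totalFieldNum(dt: DataType): Int = dt match {
    case StructType(fields) => fields.map { case (_, t) => totalFieldNum(t) }.sum
    case ArrayType(element) => totalFieldNum(element)
    case _                  => 1
  }

  // Old condition: compares the NUMBER of referenced nested fields
  // with the number of leaf fields of the attribute.
  def shouldAliasOld(referenced: Seq[DataType], attr: DataType): Boolean =
    referenced.nonEmpty && referenced.length < totalFieldNum(attr)

  // New condition: compares the number of LEAF fields covered by the
  // referenced nested fields with the number of leaf fields of the attribute.
  def shouldAliasNew(referenced: Seq[DataType], attr: DataType): Boolean =
    referenced.nonEmpty && referenced.map(totalFieldNum).sum < totalFieldNum(attr)

  // struct<a: int, b: int>
  val nestedType: DataType = StructType(List("a" -> IntType, "b" -> IntType))
  // struct<nested: struct<a: int, b: int>>
  val attrType: DataType = StructType(List("nested" -> nestedType))
}
```

With these definitions, referencing only `nested` (which already covers every leaf of the attribute) makes `shouldAliasOld` return `true`, because 1 < 2, while `shouldAliasNew` returns `false`, because 2 is not less than 2. Referencing only the leaf `nested.a` still yields `true` under both conditions, so pruning of a genuinely partial reference is unaffected in this sketch.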



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590468776
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118869/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#issuecomment-590312805
 
 
   **[Test build #118868 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118868/testReport)** for PR 27675 at commit [`6a6ea0d`](https://github.com/apache/spark/commit/6a6ea0d3296e4f999cadd321fdb929935224e845).



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#discussion_r382990794
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
 ##########
 @@ -3393,6 +3393,16 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
       )
     }
   }
+
+  test("SPARK-30870: Fix nested column aliasing") {
 
 Review comment:
   +1 for @maropu's comment. Please revise the PR title as well.



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589989182
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23569/
   Test PASSed.



[GitHub] [spark] peter-toth commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
peter-toth commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590720451
 
 
   Thanks for the review @dongjoon-hyun, @HyukjinKwon, @maropu, @viirya.



[GitHub] [spark] SparkQA commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590447475
 
 
   **[Test build #118868 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118868/testReport)** for PR 27675 at commit [`6a6ea0d`](https://github.com/apache/spark/commit/6a6ea0d3296e4f999cadd321fdb929935224e845).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589989176
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-590008580
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-591537475
 
 
   Yes. It's only for 3.0 in Apache Spark, @gatorsmile .



[GitHub] [spark] peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#discussion_r383269472
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
 ##########
 @@ -3393,6 +3393,16 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
       )
     }
   }
+
+  test("SPARK-30870: Fix nested column aliasing") {
 
 Review comment:
   Sorry, I've changed it again. Let me know if a different name would fit better.



[GitHub] [spark] maropu commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#discussion_r382969306
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   btw, can you add tests in this suite, too?



[GitHub] [spark] AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590325958
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#discussion_r383258900
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   I could add something similar to the SQL test to here as well:
   ```
     test("SPARK-30870: Don't alias a nested column if it means the whole attribute") {
       val valueStructType = StructType.fromDDL("field struct<a:int, b:int>")
       val r = LocalRelation('value.struct(valueStructType))
   
       val field = GetStructField('value, 0, Some("field"))
   
       val query = r
         .limit(5)
         .select(field)
         .analyze
   
       val optimized = Optimize.execute(query)
   
       comparePlans(optimized, query)
     }
   ```
   but it wouldn't be much different to this particular test (`Some nested column means the whole structure`). 
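
To illustrate why the optimized plan is expected to equal the query in the test above, here is a minimal, self-contained sketch of the idea stated in the PR description (use the data type of the referenced nested field to count the extracted columns). It is a toy model, not Spark's implementation: `DType`, `IntT`, `StructT`, `leafCount` and `shouldAlias` are hypothetical names introduced only for this example.

```scala
// Toy type tree (hypothetical; these are not Spark's DataType classes).
sealed trait DType
case object IntT extends DType
final case class StructT(fields: List[(String, DType)]) extends DType

object NestedAliasSketch {
  // Number of leaf (non-struct) fields reachable from a type.
  def leafCount(t: DType): Int = t match {
    case StructT(fields) => fields.map { case (_, ft) => leafCount(ft) }.sum
    case _               => 1
  }

  // Assumption of this sketch: aliasing only pays off when the referenced
  // nested fields cover strictly fewer leaves than the whole attribute.
  def shouldAlias(attr: DType, referenced: List[DType]): Boolean =
    referenced.map(leafCount).sum < leafCount(attr)
}
```

With `value: struct<field: struct<a:int, b:int>>`, referencing `value.field` covers both leaves of `value`, so `shouldAlias` is false and nothing would be pruned. If `value` had an additional top-level field, the same reference would cover fewer leaves than the attribute and `shouldAlias` would be true.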



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590325958
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#discussion_r383451101
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   Hi, @peter-toth and @maropu .
   The original code is correct; this is not about column pruning. This is about `limit` pushdown.



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590468770
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] gatorsmile edited a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
gatorsmile edited a comment on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-591219213
 
 
   @peter-toth @dongjoon-hyun Can we backport this to 2.4? 



[GitHub] [spark] peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute

Posted by GitBox <gi...@apache.org>.
peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Don't alias a nested column if it means the whole attribute
URL: https://github.com/apache/spark/pull/27675#discussion_r383258900
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   I could add something similar to the SQL test to here as well:
   ```
     test("SPARK-30870: Don't alias nested field if it means a whole attribute") {
       val valueStructType = StructType.fromDDL("field struct<a:int, b:int>")
       val r = LocalRelation('value.struct(valueStructType))
   
       val field = GetStructField('value, 0, Some("field"))
   
       val query = r
         .limit(5)
         .select(field)
         .analyze
   
       val optimized = Optimize.execute(query)
   
       comparePlans(optimized, query)
     }
   ```
   but it wouldn't be much different to this particular test (`Some nested column means the whole structure`). 



[GitHub] [spark] HyukjinKwon commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-590171352
 
 
   cc @viirya fyi



[GitHub] [spark] SparkQA removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#issuecomment-589950445
 
 
   **[Test build #118817 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118817/testReport)** for PR 27675 at commit [`b09e19b`](https://github.com/apache/spark/commit/b09e19b6711b1d84774c4f6f22c51ce640ef2f72).



[GitHub] [spark] peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
peter-toth commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#discussion_r382934390
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   Since we need the whole structure, why did we expect the local relation to be column pruned?



[GitHub] [spark] maropu commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#discussion_r382969272
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   Yea, it seems that's just a mistake. cc: @dongjoon-hyun  



[GitHub] [spark] dongjoon-hyun commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590489985
 
 
   cc @dbtsai 



[GitHub] [spark] AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590448414
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118868/



[GitHub] [spark] maropu commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #27675: [SPARK-30870][SQL] Fix nested column aliasing
URL: https://github.com/apache/spark/pull/27675#discussion_r382969291
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
 ##########
 @@ -3393,6 +3393,16 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
       )
     }
   }
+
+  test("SPARK-30870: Fix nested column aliasing") {
 
 Review comment:
   Can you make the test title clearer?



[GitHub] [spark] SparkQA commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#issuecomment-590467932
 
 
   **[Test build #118869 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118869/testReport)** for PR 27675 at commit [`e4c9009`](https://github.com/apache/spark/commit/e4c900979baffd4f04f3d953b3f0b7ec3c387b2f).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#discussion_r383473037
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   Sorry, I meant a pushdown over `limit`.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #27675: [SPARK-30870][SQL] Column pruning shouldn't alias a nested column if it means the whole structure
URL: https://github.com/apache/spark/pull/27675#discussion_r383474094
 
 

 ##########
 File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasingSuite.scala
 ##########
 @@ -215,12 +215,7 @@ class NestedColumnAliasingSuite extends SchemaPruningTest {
 
     val optimized = Optimize.execute(query)
 
-    val expected = nestedRelation
-      .select(GetStructField('a, 0, Some("b")))
-      .limit(5)
-      .analyze
-
-    comparePlans(optimized, expected)
+    comparePlans(optimized, query)
 
 Review comment:
   Hmm. I got it. So, this is the result of the bug fix, isn't it?
