Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/26 06:14:21 UTC

[GitHub] [spark] wangyum opened a new pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

wangyum opened a new pull request #30146:
URL: https://github.com/apache/spark/pull/30146


   ### What changes were proposed in this pull request?
   
   This PR adds support for dynamic pruning on data columns. It covers two cases:
   1. Pruning on an aggregate function to reduce shuffle data:
   ![image](https://user-images.githubusercontent.com/5399861/97138993-72968d80-1794-11eb-84c7-4ad59c2bc1e8.png)
   
   2. Pruning on a sort merge join to reduce shuffle data:
   ![image](https://user-images.githubusercontent.com/5399861/97139030-85a95d80-1794-11eb-9bb9-d0056d40f5d7.png)
   
   
   In addition, dynamic pruning on data columns supports pushing the dynamic filter down to the data source.
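
The idea behind both cases can be illustrated with a minimal sketch. It uses plain Scala collections rather than Spark plans, and it assumes the mechanism resembles dynamic partition pruning: the join keys of the small, already-filtered side are collected and used to filter the large side before it is shuffled. The names `Sale`, `Item`, `pruneBeforeShuffle`, and `joinAndAggregate` are hypothetical and do not come from this PR.

```scala
// Hypothetical illustration only: plain collections stand in for Spark plans.
object DynamicPruningSketch {
  final case class Sale(itemId: Int, amount: Double)   // large "fact" side
  final case class Item(itemId: Int, category: String) // small, already-filtered side

  /** Keeps only the fact rows whose join key appears on the filtered side. */
  def pruneBeforeShuffle(sales: Seq[Sale], items: Seq[Item]): Seq[Sale] = {
    val keys: Set[Int] = items.map(_.itemId).toSet
    sales.filter(s => keys.contains(s.itemId))
  }

  /** Inner-joins on itemId and sums the amount per category. */
  def joinAndAggregate(sales: Seq[Sale], items: Seq[Item]): Map[String, Double] = {
    val categoryOf: Map[Int, String] = items.map(i => i.itemId -> i.category).toMap
    sales
      .flatMap(s => categoryOf.get(s.itemId).map(c => c -> s.amount))
      .groupBy(_._1)
      .map { case (category, rows) => category -> rows.map(_._2).sum }
  }
}
```

Because the join is an inner join, `joinAndAggregate` returns the same result for the pruned and the unpruned input; the only difference is that fewer rows would have to be shuffled in the pruned case.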
   
   ### Why are the changes needed?
   
   Improve query performance.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Unit tests and performance tests:
   
   1. TPC-DS Performance Evaluation
   
   Query | Shuffle data size (original) | Shuffle data size (pruned) | Shuffle records (original) | Shuffle records (pruned) | Time with pruning disabled (s) | Time with pruning enabled (s)
   -- | -- | -- | -- | -- | -- | --
   q13 | 3.2 GiB | 93.1 MiB | 86,409,332 | 1,480,662 | 13 | 14
   q16 | 158.2 GiB | 270.7 MiB | 7,136,969,739 | 10,295,246 | 84 | 36
   q24a | 355.6 GiB | 39.7 GiB | 13,428,037,922 | 1,504,810,137 | 660 | 432
   q24b | 355.6 GiB | 39.7 GiB | 13,428,037,922 | 1,504,810,137 | 660 | 492
   q65 | 37.2 GiB | 37.2 GiB | 2,627,543,089 | 2,627,543,089 | 45 | 45
   q72 | 40.9 GiB | 1543.3 MiB | 1,418,327,817 | 47,248,271 | 276 | 38
   q80 | 8.3 GiB | 8.2 GiB | 295,853,928 | 292,353,065 | 37 | 36
   q85 | 12.8 GiB | 1508.2 MiB | 329,635,219 | 37,337,231 | 16 | 12
   q93 | 26.7 GiB | 435.9 MiB | 1,389,592,792 | 20,744,184 | 270 | 258
   q94 | 87.6 GiB | 1012.0 MiB | 3,598,433,079 | 34,538,210 | 58 | 27
   q95 | 640.5 GiB | 344.6 GiB | 872,596,314 | 793,654,526 | 414 | 402
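
To make the timing columns easier to compare, the following sketch computes the per-query speedup as the time with pruning disabled divided by the time with pruning enabled. The timings are copied from the table above; the object and function names are hypothetical. For example, q72 drops from 276 s to 38 s, while q13 is marginally slower (13 s versus 14 s).

```scala
object PruningSpeedup {
  // (query, seconds with pruning disabled, seconds with pruning enabled), copied from the table
  val timings: Seq[(String, Int, Int)] = Seq(
    ("q13", 13, 14), ("q16", 84, 36), ("q24a", 660, 432), ("q24b", 660, 492),
    ("q65", 45, 45), ("q72", 276, 38), ("q80", 37, 36), ("q85", 16, 12),
    ("q93", 270, 258), ("q94", 58, 27), ("q95", 414, 402)
  )

  /** Speedup factor: values above 1.0 mean the query ran faster with pruning enabled. */
  def speedup(disabledSeconds: Int, enabledSeconds: Int): Double =
    disabledSeconds.toDouble / enabledSeconds

  def main(args: Array[String]): Unit =
    timings.foreach { case (query, off, on) =>
      println(f"$query%-5s ${speedup(off, on)}%.2fx")
    }
}
```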
   
   2. Production Performance Evaluation
   
   
   
   
   Before this PR | After this PR
   -- | --
   ![image](https://user-images.githubusercontent.com/5399861/97138805-f8660900-1793-11eb-9c94-79ca49f2cb86.png) | ![image](https://user-images.githubusercontent.com/5399861/97138827-02880780-1794-11eb-8e91-082e2e6de80b.png)
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #30146:
URL: https://github.com/apache/spark/pull/30146#discussion_r512961316



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -245,6 +245,13 @@ object SQLConf {
     .stringConf
     .createOptional
 
+  val DYNAMIC_FILTER_PRUNING_ENABLED =
+    buildConf("spark.sql.optimizer.dynamicFilterPruning.enabled")

Review comment:
       `DYNAMIC_FILTER_PRUNING_ENABLED` -> `DYNAMIC_COLUMN_PRUNING_ENABLED`?






[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716360998


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34858/
   




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718030287


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34977/
   




[GitHub] [spark] SparkQA removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716329185


   **[Test build #130258 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130258/testReport)** for PR 30146 at commit [`f5eef11`](https://github.com/apache/spark/commit/f5eef11d1cc0b86842ebe1ac29ce30e121b51eea).




[GitHub] [spark] SparkQA removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717983578


   **[Test build #130374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130374/testReport)** for PR 30146 at commit [`9d6e97a`](https://github.com/apache/spark/commit/9d6e97a75ba0d1889d869f1ea4f970bf01140ac9).




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #30146:
URL: https://github.com/apache/spark/pull/30146#discussion_r512962567



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -282,6 +289,23 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
+  val DYNAMIC_DATA_PRUNING_ENABLED =

Review comment:
       This looks like the real config in this PR.
   `DYNAMIC_DATA_PRUNING_ENABLED` -> `DYNAMIC_COLUMN_PRUNING_ENABLED`?






[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716502197








[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717973045








[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716419237








[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716349277


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34858/
   




[GitHub] [spark] github-actions[bot] commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-774567540


   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718016063


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34976/
   




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718802642


   **[Test build #130403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130403/testReport)** for PR 30146 at commit [`9bdb860`](https://github.com/apache/spark/commit/9bdb86005112443c48cb4965bc82406129e23f66).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716350833


   **[Test build #130258 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130258/testReport)** for PR 30146 at commit [`f5eef11`](https://github.com/apache/spark/commit/f5eef11d1cc0b86842ebe1ac29ce30e121b51eea).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `  case class TableScanType(logicalRelation: LogicalRelation, isPartitionCol: Boolean)`




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717983578


   **[Test build #130374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130374/testReport)** for PR 30146 at commit [`9d6e97a`](https://github.com/apache/spark/commit/9d6e97a75ba0d1889d869f1ea4f970bf01140ac9).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716351017


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716502197








[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716361030








[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716619906


   **[Test build #130277 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130277/testReport)** for PR 30146 at commit [`60b2109`](https://github.com/apache/spark/commit/60b210976048616b1f39ee576246977297c00e9c).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718030318








[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716492708


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34877/
   




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717496117


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34941/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718016092








[GitHub] [spark] wangyum closed pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
wangyum closed pull request #30146:
URL: https://github.com/apache/spark/pull/30146


   




[GitHub] [spark] SparkQA removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718589836


   **[Test build #130403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130403/testReport)** for PR 30146 at commit [`9bdb860`](https://github.com/apache/spark/commit/9bdb86005112443c48cb4965bc82406129e23f66).




[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717511471








[GitHub] [spark] wangyum commented on a change in pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #30146:
URL: https://github.com/apache/spark/pull/30146#discussion_r513530157



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -282,6 +289,23 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
+  val DYNAMIC_DATA_PRUNING_ENABLED =

Review comment:
       This config is the switch for data column pruning. Maybe `DYNAMIC_DATA_COLUMN_PRUNING_ENABLED`?
   But this is inconsistent with `DYNAMIC_PARTITION_PRUNING_ENABLED`.






[GitHub] [spark] dongjoon-hyun commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717472024


   Retest this please.




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716352782


   **[Test build #130265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130265/testReport)** for PR 30146 at commit [`f5eef11`](https://github.com/apache/spark/commit/f5eef11d1cc0b86842ebe1ac29ce30e121b51eea).




[GitHub] [spark] SparkQA removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717968735


   **[Test build #130373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130373/testReport)** for PR 30146 at commit [`13e62d5`](https://github.com/apache/spark/commit/13e62d585bbca60ec317f1dea2371d980002ca77).




[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718677037








[GitHub] [spark] SparkQA removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717474027


   **[Test build #130338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130338/testReport)** for PR 30146 at commit [`60b2109`](https://github.com/apache/spark/commit/60b210976048616b1f39ee576246977297c00e9c).




[GitHub] [spark] wangyum commented on a change in pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #30146:
URL: https://github.com/apache/spark/pull/30146#discussion_r513526708



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -245,6 +245,13 @@ object SQLConf {
     .stringConf
     .createOptional
 
+  val DYNAMIC_FILTER_PRUNING_ENABLED =
+    buildConf("spark.sql.optimizer.dynamicFilterPruning.enabled")

Review comment:
       Yes. This PR introduces a new configuration, `spark.sql.optimizer.dynamicFilterPruning.enabled`, which is the master switch for dynamic filter pruning. `spark.sql.optimizer.dynamicPartitionPruning.enabled` is the switch for pruning on partition columns, and `spark.sql.optimizer.dynamicDataPruning.enabled` is the switch for pruning on data columns.
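
   To illustrate how the three switches might relate, here is a minimal, self-contained Scala sketch. It is not the code in this PR: the object name `DynamicPruningSwitches`, its helper methods, and the choice to treat an unset key as `true` are all assumptions made for illustration. It only models the idea that the master switch gates both the partition-column and data-column switches.

   ```scala
   // Hypothetical model of the switch hierarchy; not the actual SQLConf code.
   object DynamicPruningSwitches {
     val FilterPruningKey = "spark.sql.optimizer.dynamicFilterPruning.enabled"
     val PartitionPruningKey = "spark.sql.optimizer.dynamicPartitionPruning.enabled"
     val DataPruningKey = "spark.sql.optimizer.dynamicDataPruning.enabled"

     // Assumption: a key that is not set is treated as enabled.
     private def enabled(conf: Map[String, String], key: String): Boolean =
       conf.get(key).forall(_.toBoolean)

     // Partition-column pruning is active only if the master switch is also on.
     def partitionPruningActive(conf: Map[String, String]): Boolean =
       enabled(conf, FilterPruningKey) && enabled(conf, PartitionPruningKey)

     // Data-column pruning is active only if the master switch is also on.
     def dataPruningActive(conf: Map[String, String]): Boolean =
       enabled(conf, FilterPruningKey) && enabled(conf, DataPruningKey)
   }
   ```

   Under this model, setting the master switch to `false` disables both kinds of pruning, while turning off only the data-column switch leaves partition-column pruning unaffected.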






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716419237


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718059084


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718589836


   **[Test build #130403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130403/testReport)** for PR 30146 at commit [`9bdb860`](https://github.com/apache/spark/commit/9bdb86005112443c48cb4965bc82406129e23f66).




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718676973


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35007/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716351028


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130258/
   Test FAILed.




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #30146:
URL: https://github.com/apache/spark/pull/30146#discussion_r512962059



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -245,6 +245,13 @@ object SQLConf {
     .stringConf
     .createOptional
 
+  val DYNAMIC_FILTER_PRUNING_ENABLED =
+    buildConf("spark.sql.optimizer.dynamicFilterPruning.enabled")

Review comment:
       BTW, is this used in this PR?






[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718016092








[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716351017








[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716418995


   **[Test build #130265 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130265/testReport)** for PR 30146 at commit [`f5eef11`](https://github.com/apache/spark/commit/f5eef11d1cc0b86842ebe1ac29ce30e121b51eea).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `  case class TableScanType(logicalRelation: LogicalRelation, isPartitionCol: Boolean)`




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #30146:
URL: https://github.com/apache/spark/pull/30146#discussion_r512963049



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -282,6 +289,23 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
+  val DYNAMIC_DATA_PRUNING_ENABLED =
+    buildConf("spark.sql.optimizer.dynamicDataPruning.enabled")
+      .doc("When true, we will generate predicate for column when it's used as join key " +

Review comment:
       Should `column` be `data column` here, given that this PR isn't targeting `partition column`?






[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717606921








[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717996765


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34976/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716621660








[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716329185


   **[Test build #130258 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130258/testReport)** for PR 30146 at commit [`f5eef11`](https://github.com/apache/spark/commit/f5eef11d1cc0b86842ebe1ac29ce30e121b51eea).




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717474027


   **[Test build #130338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130338/testReport)** for PR 30146 at commit [`60b2109`](https://github.com/apache/spark/commit/60b210976048616b1f39ee576246977297c00e9c).




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717606226


   **[Test build #130338 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130338/testReport)** for PR 30146 at commit [`60b2109`](https://github.com/apache/spark/commit/60b210976048616b1f39ee576246977297c00e9c).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717511451


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34941/
   




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718014940


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34977/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717973045


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717968735


   **[Test build #130373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130373/testReport)** for PR 30146 at commit [`13e62d5`](https://github.com/apache/spark/commit/13e62d585bbca60ec317f1dea2371d980002ca77).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716361030








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717606921








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716419244


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130265/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716402308


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34865/
   




[GitHub] [spark] SparkQA removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716352782


   **[Test build #130265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130265/testReport)** for PR 30146 at commit [`f5eef11`](https://github.com/apache/spark/commit/f5eef11d1cc0b86842ebe1ac29ce30e121b51eea).




[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718030318








[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717973012


   **[Test build #130373 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130373/testReport)** for PR 30146 at commit [`13e62d5`](https://github.com/apache/spark/commit/13e62d585bbca60ec317f1dea2371d980002ca77).
    * This patch **fails to build**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AngersZhuuuu commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716351677


   retest this please




[GitHub] [spark] SparkQA removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716467612


   **[Test build #130277 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130277/testReport)** for PR 30146 at commit [`60b2109`](https://github.com/apache/spark/commit/60b210976048616b1f39ee576246977297c00e9c).




[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718059084








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718804027








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716402329








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718677037








[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716621660








[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716402329








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717973053


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130373/




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718647046


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35007/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-717511471








[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716467612


   **[Test build #130277 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130277/testReport)** for PR 30146 at commit [`60b2109`](https://github.com/apache/spark/commit/60b210976048616b1f39ee576246977297c00e9c).




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716502179


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34877/
   




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-716391455


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34865/
   




[GitHub] [spark] SparkQA commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718058851


   **[Test build #130374 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130374/testReport)** for PR 30146 at commit [`9d6e97a`](https://github.com/apache/spark/commit/9d6e97a75ba0d1889d869f1ea4f970bf01140ac9).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718804027








[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #30146:
URL: https://github.com/apache/spark/pull/30146#discussion_r512961527



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -245,6 +245,13 @@ object SQLConf {
     .stringConf
     .createOptional
 
+  val DYNAMIC_FILTER_PRUNING_ENABLED =
+    buildConf("spark.sql.optimizer.dynamicFilterPruning.enabled")
+      .doc("When true, we will generate predicate for partition column when it's used as join key")

Review comment:
       `partition column` -> `data column`?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30146: [SPARK-33241][SQL] Dynamic pruning on data column

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30146:
URL: https://github.com/apache/spark/pull/30146#issuecomment-718059093


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130374/

