Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/02/12 15:57:17 UTC

[GitHub] [spark] maryannxue opened a new pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default

maryannxue opened a new pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551
 
 
   ### What changes were proposed in this pull request?
   This PR adds a config that controls Dynamic Partition Pruning (DPP) subquery duplication and turns subquery duplication off by default because it can cause performance regressions.
   When planning a DPP filter, Spark tries to reuse the broadcast exchange relation if the corresponding join is a broadcast hash join (BHJ) with the filter relation on the build side. Otherwise, it either opts out or plans the filter as a non-reusable duplicated subquery, depending on the cost estimate. However, the cost estimate is inaccurate and accounts only for the table scan overhead, so adding a non-reusable duplicated-subquery DPP filter can sometimes cause a performance regression.
   This PR turns off the subquery duplication DPP filter by:
   1. adding a config `spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly` and setting it to `true` by default.
   2. removing the existing meaningless config `spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcast`, since we always want to reuse broadcast results if possible.
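
The decision described above can be sketched as a small, self-contained model. This is a simplified illustration and not Spark's planner code: the names `DppSketch`, `planDppFilter`, `canReuseBroadcast`, `estimatedBeneficial`, and the three plan outcomes are hypothetical, and the cost estimate is reduced to a single boolean.

```scala
object DppSketch {
  // Possible outcomes when planning a DPP filter (hypothetical names).
  sealed trait DppPlan
  case object ReuseBroadcast extends DppPlan    // reuse the broadcast exchange of the BHJ
  case object DuplicateSubquery extends DppPlan // run the filter side again as a separate subquery
  case object NoPruning extends DppPlan         // opt out of DPP for this join

  // canReuseBroadcast:   the join is a BHJ with the filter relation on the build side
  // reuseBroadcastOnly:  models spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly
  // estimatedBeneficial: stand-in for the (inaccurate) cost estimate
  def planDppFilter(
      canReuseBroadcast: Boolean,
      reuseBroadcastOnly: Boolean,
      estimatedBeneficial: Boolean): DppPlan =
    if (canReuseBroadcast) ReuseBroadcast
    else if (!reuseBroadcastOnly && estimatedBeneficial) DuplicateSubquery
    else NoPruning
}
```

With `reuseBroadcastOnly = true` (the new default), this model never returns `DuplicateSubquery`, so the inaccurate cost estimate can no longer introduce a duplicated subquery.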
   
   ### Why are the changes needed?
   This is to fix a potential performance regression caused by DPP.
   
   ### Does this PR introduce any user-facing change?
   No.
   
   ### How was this patch tested?
   Updated DynamicPartitionPruningSuite to test the new configuration.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585466260
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118317/
   Test PASSed.



[GitHub] [spark] cloud-fan closed pull request #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551
 
 
   



[GitHub] [spark] cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#discussion_r378352814
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
 ##########
 @@ -474,10 +480,11 @@ class DynamicPartitionPruningSuite
    */
   test("filtering ratio policy fallback") {
     withSQLConf(
-      SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST.key -> "false") {
+      SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST_ONLY.key -> "false") {
       Given("no stats and selective predicate")
       withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key -> "true",
-        SQLConf.DYNAMIC_PARTITION_PRUNING_USE_STATS.key -> "true") {
+        SQLConf.DYNAMIC_PARTITION_PRUNING_USE_STATS.key -> "true",
+        SQLConf.EXCHANGE_REUSE_ENABLED.key -> "false") {
 
 Review comment:
   shall we move it just below `SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST_ONLY.key -> "false"`?



[GitHub] [spark] cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#discussion_r378353807
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
 ##########
 @@ -769,17 +762,18 @@ class DynamicPartitionPruningSuite
 
       checkAnswer(df,
         Row(1030, 2, 10, 3) ::
-        Row(1040, 2, 50, 3) ::
-        Row(1050, 2, 50, 3) ::
-        Row(1060, 2, 50, 3) :: Nil
+          Row(1040, 2, 50, 3) ::
 
 Review comment:
   the previous indentation is right.



[GitHub] [spark] SparkQA commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585355989
 
 
   **[Test build #118317 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118317/testReport)** for PR 27551 at commit [`29f5ae8`](https://github.com/apache/spark/commit/29f5ae8cca510cd8ddbcd0cf146587f2fba05fab).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585356678
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23075/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585278544
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23068/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#discussion_r378353425
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
 ##########
 @@ -543,10 +552,11 @@ class DynamicPartitionPruningSuite
    */
   test("filtering ratio policy with stats when the broadcast pruning is disabled") {
     withSQLConf(
-      SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST.key -> "false") {
+      SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST_ONLY.key -> "false") {
       Given("disabling the use of stats in the DPP heuristic")
       withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key -> "true",
-        SQLConf.DYNAMIC_PARTITION_PRUNING_USE_STATS.key -> "false") {
+        SQLConf.DYNAMIC_PARTITION_PRUNING_USE_STATS.key -> "false",
+        SQLConf.EXCHANGE_REUSE_ENABLED.key -> "false") {
 
 Review comment:
   ditto



[GitHub] [spark] AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585278544
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23068/
   Test PASSed.



[GitHub] [spark] maryannxue commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
maryannxue commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585275793
 
 
   cc @cloud-fan @gatorsmile 



[GitHub] [spark] SparkQA commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585404530
 
 
   **[Test build #118310 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118310/testReport)** for PR 27551 at commit [`3514cf4`](https://github.com/apache/spark/commit/3514cf4cd79b7d744b5529f542ba637d2e276642).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#discussion_r378352814
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
 ##########
 @@ -474,10 +480,11 @@ class DynamicPartitionPruningSuite
    */
   test("filtering ratio policy fallback") {
     withSQLConf(
-      SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST.key -> "false") {
+      SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST_ONLY.key -> "false") {
       Given("no stats and selective predicate")
       withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key -> "true",
-        SQLConf.DYNAMIC_PARTITION_PRUNING_USE_STATS.key -> "true") {
+        SQLConf.DYNAMIC_PARTITION_PRUNING_USE_STATS.key -> "true",
+        SQLConf.EXCHANGE_REUSE_ENABLED.key -> "false") {
 
 Review comment:
   shall we move it to the outer `withSQLConf` and just below `SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST_ONLY.key -> "false"`?
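
A minimal sketch of the arrangement suggested above, with all four settings active inside the inner block. It does not use Spark: `WithConfSketch`, `withConf`, and `conf` are hypothetical stand-ins for the test suite's `withSQLConf` helper and the session configuration, and the key strings are assumed to correspond to the `SQLConf` constants shown in the diff.

```scala
import scala.collection.mutable

object WithConfSketch {
  // Hypothetical stand-in for a session's SQL configuration.
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Sets the given pairs while `body` runs, then restores the previous values.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(v)) => conf(k) = v
      case (k, None)    => conf.remove(k)
    }
  }

  // Key strings assumed to match the SQLConf constants in the diff.
  val ReuseBroadcastOnly = "spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly"
  val ExchangeReuse = "spark.sql.exchange.reuse"
  val DppEnabled = "spark.sql.optimizer.dynamicPartitionPruning.enabled"
  val DppUseStats = "spark.sql.optimizer.dynamicPartitionPruning.useStats"

  // Outer block: the settings that define the scenario, grouped as the review suggests.
  // Inner block: the settings that vary per Given clause.
  def snapshotInsideNestedBlocks(): Map[String, String] =
    withConf(ReuseBroadcastOnly -> "false", ExchangeReuse -> "false") {
      withConf(DppEnabled -> "true", DppUseStats -> "true") {
        conf.toMap
      }
    }
}
```

Grouping the exchange-reuse setting with `reuseBroadcastOnly` in the outer block keeps the scenario-wide settings in one place, and every inner block inherits them.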



[GitHub] [spark] AmplabJenkins commented on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585466260
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118317/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585466248
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585356678
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23075/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585356668
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585355989
 
 
   **[Test build #118317 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118317/testReport)** for PR 27551 at commit [`29f5ae8`](https://github.com/apache/spark/commit/29f5ae8cca510cd8ddbcd0cf146587f2fba05fab).



[GitHub] [spark] cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#discussion_r378353902
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
 ##########
 @@ -791,59 +785,54 @@ class DynamicPartitionPruningSuite
 
       checkAnswer(df,
         Row(1030, 2, 10, 3) ::
-        Row(1040, 2, 50, 3) ::
-        Row(1050, 2, 50, 3) ::
-        Row(1060, 2, 50, 3) :: Nil
+          Row(1040, 2, 50, 3) ::
 
 Review comment:
   ditto



[GitHub] [spark] SparkQA removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585277687
 
 
   **[Test build #118310 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118310/testReport)** for PR 27551 at commit [`3514cf4`](https://github.com/apache/spark/commit/3514cf4cd79b7d744b5529f542ba637d2e276642).



[GitHub] [spark] AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585405320
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118310/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585277687
 
 
   **[Test build #118310 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118310/testReport)** for PR 27551 at commit [`3514cf4`](https://github.com/apache/spark/commit/3514cf4cd79b7d744b5529f542ba637d2e276642).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585356668
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585278529
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585405312
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#discussion_r378353902
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
 ##########
 @@ -791,59 +785,54 @@ class DynamicPartitionPruningSuite
 
       checkAnswer(df,
         Row(1030, 2, 10, 3) ::
-        Row(1040, 2, 50, 3) ::
-        Row(1050, 2, 50, 3) ::
-        Row(1060, 2, 50, 3) :: Nil
+          Row(1040, 2, 50, 3) ::
 
 Review comment:
   ditto



[GitHub] [spark] cloud-fan commented on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585694109
 
 
   LGTM, merging to master/3.0!



[GitHub] [spark] AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585405320
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118310/



[GitHub] [spark] SparkQA commented on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585465591
 
 
   **[Test build #118317 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118317/testReport)** for PR 27551 at commit [`29f5ae8`](https://github.com/apache/spark/commit/29f5ae8cca510cd8ddbcd0cf146587f2fba05fab).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585405312
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27551: [SPARK-30528][SQL] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585466248
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27551: [SPARK-30528] Turn off DPP subquery duplication by default
URL: https://github.com/apache/spark/pull/27551#issuecomment-585278529
 
 
   Merged build finished. Test PASSed.
