Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/19 09:06:36 UTC

[GitHub] [spark] prakharjain09 opened a new pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

prakharjain09 opened a new pull request #30426:
URL: https://github.com/apache/spark/pull/30426


   ### What changes were proposed in this pull request?
   This PR tries to reduce the number of physical aggregation nodes by collapsing the PARTIAL and the FINAL aggregation nodes together when there is no Exchange between them. 
   
   Example - consider the following query:
   
       SELECT sum(t2.col1), max(t2.col2), t1.col1, t1.col2
       FROM t1, t2
       WHERE t1.col1 = t2.col1
       GROUP BY t1.col1, t1.col2
   
   Current plan:
   
         == Physical Plan ==
         *(5) HashAggregate(keys=[col1#7, col2#8], functions=[sum(cast(col1#18 as bigint)), max(col2#19)], output=[sum(col1)#140L, max(col2)#141, col1#7, col2#8])
         +- *(5) HashAggregate(keys=[col1#7, col2#8], functions=[partial_sum(cast(col1#18 as bigint)), partial_max(col2#19)], output=[col1#7, col2#8, sum#148L, max#149])
            +- *(5) SortMergeJoin [col1#7], [col1#18], Inner
               :- *(2) Sort [col1#7 ASC NULLS FIRST], false, 0
               :  +- Exchange hashpartitioning(col1#7, 5), true, [id=#644]
               :     +- *(1) Project [value#2 AS col1#7, (value#2 % 10) AS col2#8]
               :        +- *(1) SerializeFromObject [input[0, int, false] AS value#2]
               :           +- Scan[obj#1]
               +- *(4) Sort [col1#18 ASC NULLS FIRST], false, 0
                  +- Exchange hashpartitioning(col1#18, 5), true, [id=#653]
                     +- *(3) Project [value#13 AS col1#18, (value#13 % 10) AS col2#19]
                        +- *(3) SerializeFromObject [input[0, int, false] AS value#13]
                           +- Scan[obj#12]
   
   
   The above plan can be optimized to the following:
   
         == Physical Plan ==
         *(5) HashAggregate(keys=[col1#7, col2#8], functions=[sum(cast(col1#18 as bigint)), max(col2#19)], output=[sum(col1)#157L, max(col2)#158, col1#7, col2#8])
         +- *(5) SortMergeJoin [col1#7], [col1#18], Inner
            :- *(2) Sort [col1#7 ASC NULLS FIRST], false, 0
            :  +- Exchange hashpartitioning(col1#7, 5), true, [id=#727]
            :     +- *(1) Project [value#2 AS col1#7, (value#2 % 10) AS col2#8]
            :        +- *(1) SerializeFromObject [input[0, int, false] AS value#2]
            :           +- Scan[obj#1]
            +- *(4) Sort [col1#18 ASC NULLS FIRST], false, 0
               +- Exchange hashpartitioning(col1#18, 5), true, [id=#736]
                  +- *(3) Project [value#13 AS col1#18, (value#13 % 10) AS col2#19]
                     +- *(3) SerializeFromObject [input[0, int, false] AS value#13]
                        +- Scan[obj#12]
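
The rewrite shown in the two plans above can be illustrated with a toy model. The following Python sketch is not Spark's implementation: `Node`, `collapse_aggregates`, and `ops` are hypothetical names, the plan tree is reduced to operator names, and labelling the merged node with a "complete" mode is an assumption of the sketch. The check that both aggregates share the same grouping keys is also an added guard, not something stated in this PR. The sketch only demonstrates the condition described here: a final aggregate is merged with a partial aggregate when the partial aggregate is its direct child, so that no Exchange sits between them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A toy physical plan node (hypothetical, not a Spark class)."""
    op: str                          # e.g. "HashAggregate", "Exchange", "SortMergeJoin"
    mode: Optional[str] = None       # "partial", "final" or "complete" for aggregates
    keys: Tuple[str, ...] = ()       # grouping keys, only meaningful for aggregates
    children: List["Node"] = field(default_factory=list)


def collapse_aggregates(node: Node) -> Node:
    """Return a copy of the plan where a final aggregate sitting directly on
    top of a partial aggregate with the same keys is replaced by one node."""
    # Rewrite bottom-up so nested aggregate pairs are handled as well.
    children = [collapse_aggregates(c) for c in node.children]
    node = Node(node.op, node.mode, node.keys, children)

    if node.op == "HashAggregate" and node.mode == "final" and len(children) == 1:
        child = children[0]
        if (child.op == "HashAggregate" and child.mode == "partial"
                and child.keys == node.keys):
            # No Exchange in between: one aggregate over the child's input suffices.
            return Node("HashAggregate", "complete", node.keys, child.children)
    return node


def ops(node: Node) -> List[Tuple[str, Optional[str]]]:
    """Flatten a plan into a pre-order list of (op, mode) pairs for inspection."""
    out = [(node.op, node.mode)]
    for c in node.children:
        out.extend(ops(c))
    return out


keys = ("col1", "col2")

# Shape of the example above: both aggregates sit directly above the join.
join = Node("SortMergeJoin", children=[Node("Exchange"), Node("Exchange")])
adjacent = Node("HashAggregate", "final", keys,
                [Node("HashAggregate", "partial", keys, [join])])

# Counter-example: an Exchange separates the two aggregates, so nothing changes.
shuffled = Node("HashAggregate", "final", keys,
                [Node("Exchange", children=[
                    Node("HashAggregate", "partial", keys, [Node("Scan")])])])

print(ops(collapse_aggregates(adjacent)))
print(ops(collapse_aggregates(shuffled)))
```

In the first plan the two aggregate nodes become one, which mirrors the optimized plan above. In the second plan the Exchange between the partial and final aggregates blocks the rewrite, so both nodes are kept.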
   
   ### Why are the changes needed?
   This change removes the unnecessary aggregation node, which should help improve performance.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added unit tests.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730234260


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731363978


   **[Test build #131427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131427/testReport)** for PR 30426 at commit [`e7f326a`](https://github.com/apache/spark/commit/e7f326a64719d8dcc339427ce803d5e438d3de9a).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `  case class GetShufflePushMergerLocations(numMergersNeeded: Int, hostsToFilter: Set[String])`
     * `  case class RemoveShufflePushMergerLocation(host: String) extends ToBlockManagerMaster`
     * `abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes with NullIntolerant `
     * `case class LikeAll(child: Expression, patterns: Seq[UTF8String]) extends LikeAllBase `
     * `case class NotLikeAll(child: Expression, patterns: Seq[UTF8String]) extends LikeAllBase `
     * `case class ParseUrl(children: Seq[Expression], failOnError: Boolean = SQLConf.get.ansiEnabled)`




[GitHub] [spark] github-actions[bot] closed pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #30426:
URL: https://github.com/apache/spark/pull/30426


   




[GitHub] [spark] SparkQA removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731192899


   **[Test build #131427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131427/testReport)** for PR 30426 at commit [`e7f326a`](https://github.com/apache/spark/commit/e7f326a64719d8dcc339427ce803d5e438d3de9a).




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731365070








[GitHub] [spark] maropu commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730320034


   ok to test




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731178107


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731215673


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731365070








[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730414492








[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730413934


   **[Test build #131347 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131347/testReport)** for PR 30426 at commit [`2c68fe3`](https://github.com/apache/spark/commit/2c68fe3b2cf1b39306291b1043c9549545917c6f).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730624646








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730363531


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35950/
   Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730384017








[GitHub] [spark] abmodi commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
abmodi commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-738766897


   We have also seen this use case with customers who aggregate on columns that are close to primary keys.




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731229635








[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731215653


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36031/
   




[GitHub] [spark] SparkQA removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731176751


   **[Test build #131425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131425/testReport)** for PR 30426 at commit [`dfad4fc`](https://github.com/apache/spark/commit/dfad4fc093c632f10d788668e7b51c4cfada840e).




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730550628


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35964/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731215686


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/36031/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730323942


   **[Test build #131346 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131346/testReport)** for PR 30426 at commit [`5965fb9`](https://github.com/apache/spark/commit/5965fb994cd7bb1b1c21cc91a0fc04d3d3e76f45).
    * This patch **fails Scala style tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731215324


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36033/
   




[GitHub] [spark] prakharjain09 commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
prakharjain09 commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731980696


   @maropu @cloud-fan Gentle reminder - Please review the changes and provide your feedback.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731229635


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730550663








[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730346842


   **[Test build #131347 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131347/testReport)** for PR 30426 at commit [`2c68fe3`](https://github.com/apache/spark/commit/2c68fe3b2cf1b39306291b1043c9549545917c6f).




[GitHub] [spark] SparkQA removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730322618


   **[Test build #131346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131346/testReport)** for PR 30426 at commit [`5965fb9`](https://github.com/apache/spark/commit/5965fb994cd7bb1b1c21cc91a0fc04d3d3e76f45).




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730363501


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35950/
   




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731176751


   **[Test build #131425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131425/testReport)** for PR 30426 at commit [`dfad4fc`](https://github.com/apache/spark/commit/dfad4fc093c632f10d788668e7b51c4cfada840e).




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731192899


   **[Test build #131427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131427/testReport)** for PR 30426 at commit [`e7f326a`](https://github.com/apache/spark/commit/e7f326a64719d8dcc339427ce803d5e438d3de9a).




[GitHub] [spark] github-actions[bot] commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-799018724


   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731178083


   **[Test build #131425 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131425/testReport)** for PR 30426 at commit [`dfad4fc`](https://github.com/apache/spark/commit/dfad4fc093c632f10d788668e7b51c4cfada840e).
    * This patch **fails Scala style tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730363520


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730323975


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131346/
   Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730234260


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730510326


   **[Test build #131360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131360/testReport)** for PR 30426 at commit [`e9a25d9`](https://github.com/apache/spark/commit/e9a25d9ea02e479504a599a317903fc71479f52b).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730414492


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730624058


   **[Test build #131360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131360/testReport)** for PR 30426 at commit [`e9a25d9`](https://github.com/apache/spark/commit/e9a25d9ea02e479504a599a317903fc71479f52b).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730363520








[GitHub] [spark] SparkQA removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730510326


   **[Test build #131360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131360/testReport)** for PR 30426 at commit [`e9a25d9`](https://github.com/apache/spark/commit/e9a25d9ea02e479504a599a317903fc71479f52b).




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731215673








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731229650


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/36033/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730383992


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35951/
   




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730550663








[GitHub] [spark] maropu commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730346025


   I remember SPARK-12978 (#15945 and #10896); is this related to it? cc: @cloud-fan Btw, have you checked if this optimization could make some queries (e.g., TPCDS) faster? (I just want to know the actual performance numbers.)




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730234855








[GitHub] [spark] prakharjain09 commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
prakharjain09 commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-734668231


   > So, could you give us a concrete example of how much it will improve performance?
   
   @maropu We have seen customer queries where the aggregation happens on keys that are close to primary keys (that is, nearly unique). In those scenarios, it makes complete sense to remove the redundant aggregation operator, as it unnecessarily increases the execution time.
   




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731229614


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36033/
   




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731199831


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36031/
   




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730322618


   **[Test build #131346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131346/testReport)** for PR 30426 at commit [`5965fb9`](https://github.com/apache/spark/commit/5965fb994cd7bb1b1c21cc91a0fc04d3d3e76f45).




[GitHub] [spark] prakharjain09 commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
prakharjain09 commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730533709


   @maropu Thanks for pointing out the old PRs and JIRA. Yes, SPARK-12978 seems related to SPARK-33486.
   
   > Btw, have you checked if this optimization could make some queries (e.g., TPCDS) faster?
   
   I did an impact analysis on TPCDS at scale 100 and didn't find a noticeable improvement. In most TPCDS queries, the first HashAggregate (HA-1) reduces the number of rows significantly, so the second HashAggregate (HA-2) doesn't take much time after that.
   
   But we have seen some good improvements in some customer queries - Specifically when HA-1 doesn't reduce rows significantly. 
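
   A toy illustration of the point above, in plain Python rather than Spark code. It is only a sketch under stated assumptions: a single list stands in for one partition that already holds every row of each group (the "no Exchange between PARTIAL and FINAL" case), and the function names are hypothetical, not Spark APIs. It shows that when the grouping keys are nearly unique, the partial step emits as many rows as it reads, so the final step repeats the same work.

```python
from collections import defaultdict


def partial_aggregate(partition):
    """Per-partition sum by key; stands in for a PARTIAL aggregate (hypothetical helper)."""
    acc = defaultdict(int)
    for key, value in partition:
        acc[key] += value
    return list(acc.items())


def final_aggregate(partials):
    """Merge partial sums by key; stands in for a FINAL aggregate (hypothetical helper)."""
    acc = defaultdict(int)
    for key, value in partials:
        acc[key] += value
    return dict(acc)


def complete_aggregate(partition):
    """Single-pass sum by key; what a collapsed node would compute (hypothetical helper)."""
    acc = defaultdict(int)
    for key, value in partition:
        acc[key] += value
    return dict(acc)


def reduction_ratio(partition):
    """Rows emitted by the partial step divided by rows read."""
    return len(partial_aggregate(partition)) / len(partition)


# Low-cardinality keys: the partial step shrinks 1000 rows to 10.
low_cardinality = [(i % 10, 1) for i in range(1000)]
# Nearly unique keys: the partial step emits one row per input row.
near_unique = [(i, 1) for i in range(1000)]

for name, rows in [("low_cardinality", low_cardinality), ("near_unique", near_unique)]:
    # Both strategies must agree when the partition holds all rows of each group.
    assert final_aggregate(partial_aggregate(rows)) == complete_aggregate(rows)
    print(name, reduction_ratio(rows))
```

   In the first case the final step only sees 1% of the input, which matches the TPCDS observation; in the second it sees all of it, which is where collapsing the two nodes would be expected to help.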




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730624653


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131360/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730534217


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35964/
   




[GitHub] [spark] maropu commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-732517901


   > But we have seen some good improvements in some customer queries - Specifically when HA-1 doesn't reduce rows significantly.
   
   Yeah, I've checked TPCDS performance with this change again myself, but I couldn't find any improvement. So, could you give us a concrete example of how much it will improve performance? This change can make the rules more complicated, so I think we need to consider the tradeoff between complexity and performance improvements.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730624646


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730234855


   Can one of the admins verify this patch?




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730414504


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131347/
   Test FAILed.




[GitHub] [spark] prakharjain09 commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
prakharjain09 commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730345926


   cc - @maropu @cloud-fan @dongjoon-hyun 




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731178115


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131425/
   Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730323975








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730384017








[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730350839


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35950/
   




[GitHub] [spark] AmplabJenkins commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-731178107








[GitHub] [spark] SparkQA commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730368778


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35951/
   




[GitHub] [spark] SparkQA removed a comment on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730346842


   **[Test build #131347 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131347/testReport)** for PR 30426 at commit [`2c68fe3`](https://github.com/apache/spark/commit/2c68fe3b2cf1b39306291b1043c9549545917c6f).

