You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/02 07:37:44 UTC
[GitHub] [spark] beliefer opened a new pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
beliefer opened a new pull request #29626:
URL: https://github.com/apache/spark/pull/29626
### What changes were proposed in this pull request?
Spark SQL exists a bug show below:
```
spark.sql(
" SELECT COUNT(DISTINCT 2), COUNT(DISTINCT 2, 3)")
.show()
+-----------------+--------------------+
|count(DISTINCT 2)|count(DISTINCT 2, 3)|
+-----------------+--------------------+
| 1| 1|
+-----------------+--------------------+
spark.sql(
" SELECT COUNT(DISTINCT 2), COUNT(DISTINCT 3, 2)")
.show()
+-----------------+--------------------+
|count(DISTINCT 2)|count(DISTINCT 3, 2)|
+-----------------+--------------------+
| 1| 0|
+-----------------+--------------------+
```
The first query is correct, but the second query is not.
The root reason is the first not rewrited by `RewriteDistinctAggregates`.
### Why are the changes needed?
Fix a bug.
`SELECT COUNT(DISTINCT 2), COUNT(DISTINCT 3, 2)` should return 1, 1
### Does this PR introduce _any_ user-facing change?
Yes
### How was this patch tested?
New UT
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686084180
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685692737
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r486020540
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -292,13 +292,14 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
// Final aggregate
val operators = expressions.map { e =>
val af = e.aggregateFunction
- val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
- } else {
- None
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
+ val naf = if (af.children.forall(_.foldable)) {
+ val firstChild = evalWithinGroup(id, af.children.head, condition)
Review comment:
can we add some comments to explain why we are doing it?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686300125
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128219/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689495913
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690035403
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686084180
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685995731
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687223438
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685427053
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689925385
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688085420
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685456568
**[Test build #128201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128201/testReport)** for PR 29626 at commit [`9ad3469`](https://github.com/apache/spark/commit/9ad3469343a43afa7b33f997b6cf21523cf0a65e).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689266971
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685459438
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685425718
retest this please
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690036270
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128478/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686319737
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688088063
**[Test build #128345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128345/testReport)** for PR 29626 at commit [`fa7d578`](https://github.com/apache/spark/commit/fa7d578dfbedae50dd3bd794a1c94909d455e2a4).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685422998
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686485729
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689358829
**[Test build #128438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128438/testReport)** for PR 29626 at commit [`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690004634
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689495926
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128438/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689355568
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686965915
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685418113
**[Test build #128196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128196/testReport)** for PR 29626 at commit [`0f355ad`](https://github.com/apache/spark/commit/0f355ad454d0cf6ef6f1213400b7028244d60d95).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686299298
**[Test build #128219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128219/testReport)** for PR 29626 at commit [`52896fb`](https://github.com/apache/spark/commit/52896fbb534c6b6d70b4db9d980890af5f7ad6a8).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685423498
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687222135
**[Test build #128294 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128294/testReport)** for PR 29626 at commit [`43cb6a0`](https://github.com/apache/spark/commit/43cb6a09e18aff081bf6632635d4bcf54f9749ae).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690039026
retest this please
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686195766
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690036260
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689353373
**[Test build #128429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128429/testReport)** for PR 29626 at commit [`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690002547
**[Test build #128479 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128479/testReport)** for PR 29626 at commit [`16588f5`](https://github.com/apache/spark/commit/16588f56d1d50e73908d5431062012a8ed20d922).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685479157
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690186690
github action passed, merging to master, thanks!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686319171
**[Test build #128240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128240/testReport)** for PR 29626 at commit [`52896fb`](https://github.com/apache/spark/commit/52896fbb534c6b6d70b4db9d980890af5f7ad6a8).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689353934
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128429/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688085420
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689495913
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687223438
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689941405
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686195315
**[Test build #128219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128219/testReport)** for PR 29626 at commit [`52896fb`](https://github.com/apache/spark/commit/52896fbb534c6b6d70b4db9d980890af5f7ad6a8).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689266524
**[Test build #128429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128429/testReport)** for PR 29626 at commit [`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686195315
**[Test build #128219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128219/testReport)** for PR 29626 at commit [`52896fb`](https://github.com/apache/spark/commit/52896fbb534c6b6d70b4db9d980890af5f7ad6a8).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690007069
**[Test build #128489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128489/testReport)** for PR 29626 at commit [`16588f5`](https://github.com/apache/spark/commit/16588f56d1d50e73908d5431062012a8ed20d922).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687042706
**[Test build #128294 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128294/testReport)** for PR 29626 at commit [`43cb6a0`](https://github.com/apache/spark/commit/43cb6a09e18aff081bf6632635d4bcf54f9749ae).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685426202
**[Test build #128198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128198/testReport)** for PR 29626 at commit [`e15519e`](https://github.com/apache/spark/commit/e15519ed85826e40eb8d36965048d30ce8f8ba4e).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686195766
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688061913
**[Test build #128341 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128341/testReport)** for PR 29626 at commit [`64834f8`](https://github.com/apache/spark/commit/64834f82893957b2a5578f4e46610161350f0743).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689941405
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689266524
**[Test build #128429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128429/testReport)** for PR 29626 at commit [`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r484971289
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -293,12 +295,16 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val operators = expressions.map { e =>
val af = e.aggregateFunction
val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
+ if (distinctAggGroupLookup(e).contains(x)) {
Review comment:
I'm wondering if we can simplify the logic a bit. The goal is to only do the replacement for the first child if all the children are foldable. How about
```
val af = e.aggregateFunction
val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
val naf = if (af.children.forall(_.foldable)) {
val firstChild = evalWithinGroup(id, af.children.head, condition)
af.withNewChildren(firstChild +: af.children.drop(1)).asInstanceOf[AggregateFunction]
} else {
distinctAggChildAttrLookup.get(x).map(evalWithinGroup(id, _, condition))
}
```
Then we don't need to change a lot of code to create `distinctAggGroupLookup`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685415006
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685418113
**[Test build #128196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128196/testReport)** for PR 29626 at commit [`0f355ad`](https://github.com/apache/spark/commit/0f355ad454d0cf6ef6f1213400b7028244d60d95).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690826597
@cloud-fan Thanks
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685459446
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128198/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689941100
**[Test build #128479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128479/testReport)** for PR 29626 at commit [`16588f5`](https://github.com/apache/spark/commit/16588f56d1d50e73908d5431062012a8ed20d922).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690175700
**[Test build #128498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128498/testReport)** for PR 29626 at commit [`16588f5`](https://github.com/apache/spark/commit/16588f56d1d50e73908d5431062012a8ed20d922).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685479183
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128197/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686319171
**[Test build #128240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128240/testReport)** for PR 29626 at commit [`52896fb`](https://github.com/apache/spark/commit/52896fbb534c6b6d70b4db9d980890af5f7ad6a8).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690035413
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128489/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687042706
**[Test build #128294 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128294/testReport)** for PR 29626 at commit [`43cb6a0`](https://github.com/apache/spark/commit/43cb6a09e18aff081bf6632635d4bcf54f9749ae).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r484971289
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -293,12 +295,16 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val operators = expressions.map { e =>
val af = e.aggregateFunction
val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
+ if (distinctAggGroupLookup(e).contains(x)) {
Review comment:
I'm wondering if we can simplify the logic a bit. The goal is to only do the replacement for the first child if all the children are foldable. How about
```
val af = e.aggregateFunction
if (af.children.forall(_.foldable)) {
val firstChild = evalWithinGroup(id, af.children.head, condition)
af.withNewChildren(firstChild +: af.children.drop(1)).asInstanceOf[AggregateFunction]
} else {
// the original code
}
```
Then we don't need to change a lot of code to create `distinctAggGroupLookup`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687110082
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687110082
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686000897
**[Test build #128208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128208/testReport)** for PR 29626 at commit [`81eb9e4`](https://github.com/apache/spark/commit/81eb9e4bc0bad82a749cc3cacc18174b7831eafe).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686084183
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128208/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690002992
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690004634
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690039699
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r485303905
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -293,12 +295,16 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val operators = expressions.map { e =>
val af = e.aggregateFunction
val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
+ if (distinctAggGroupLookup(e).contains(x)) {
Review comment:
OK
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688272142
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685479157
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685432292
cc @cloud-fan and @linhongliu-db
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r484971289
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -293,12 +295,16 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val operators = expressions.map { e =>
val af = e.aggregateFunction
val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
+ if (distinctAggGroupLookup(e).contains(x)) {
Review comment:
I'm wondering if we can simplify the logic a bit. The goal is to only do the replacement for the first child if all the children are foldable. How about
```
val af = e.aggregateFunction
val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
val naf = if (af.children.forall(_.foldable)) {
val firstChild = evalWithinGroup(id, af.children.head, condition)
af.withNewChildren(firstChild +: af.children.drop(1)).asInstanceOf[AggregateFunction]
} else {
patchAggregateFunctionChildren(af) { x =>
distinctAggChildAttrLookup.get(x).map(evalWithinGroup(id, _, condition))
}
}
```
Then we don't need to change a lot of code to create `distinctAggGroupLookup`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686484035
**[Test build #128240 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128240/testReport)** for PR 29626 at commit [`52896fb`](https://github.com/apache/spark/commit/52896fbb534c6b6d70b4db9d980890af5f7ad6a8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685422205
**[Test build #128197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128197/testReport)** for PR 29626 at commit [`e15519e`](https://github.com/apache/spark/commit/e15519ed85826e40eb8d36965048d30ce8f8ba4e).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685423506
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128196/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688272142
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690042647
**[Test build #128498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128498/testReport)** for PR 29626 at commit [`16588f5`](https://github.com/apache/spark/commit/16588f56d1d50e73908d5431062012a8ed20d922).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688061913
**[Test build #128341 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128341/testReport)** for PR 29626 at commit [`64834f8`](https://github.com/apache/spark/commit/64834f82893957b2a5578f4e46610161350f0743).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690177064
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r484184121
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -216,20 +216,24 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val distinctAggs = aggExpressions.filter(_.isDistinct)
// Extract distinct aggregate expressions.
- val distinctAggGroups = aggExpressions.filter(_.isDistinct).groupBy { e =>
- val unfoldableChildren = e.aggregateFunction.children.filter(!_.foldable).toSet
- if (unfoldableChildren.nonEmpty) {
- // Only expand the unfoldable children
- unfoldableChildren
- } else {
- // If aggregateFunction's children are all foldable
- // we must expand at least one of the children (here we take the first child),
- // or If we don't, we will get the wrong result, for example:
- // count(distinct 1) will be explained to count(1) after the rewrite function.
- // Generally, the distinct aggregateFunction should not run
- // foldable TypeCheck for the first child.
- e.aggregateFunction.children.take(1).toSet
- }
+ val distinctAggGroupMap = aggExpressions.filter(_.isDistinct).map { e =>
+ val unfoldableChildren = e.aggregateFunction.children.filter(!_.foldable).toSet
+ if (unfoldableChildren.nonEmpty) {
+ // Only expand the unfoldable children
+ e -> unfoldableChildren
+ } else {
+ // If aggregateFunction's children are all foldable
+ // we must expand at least one of the children (here we take the first child),
+ // or If we don't, we will get the wrong result, for example:
+ // count(distinct 1) will be explained to count(1) after the rewrite function.
+ // Generally, the distinct aggregateFunction should not run
+ // foldable TypeCheck for the first child.
+ e -> e.aggregateFunction.children.take(1).toSet
+ }
+ }
+ val distinctAggGroupLookup = distinctAggGroupMap.toMap
+ val distinctAggGroups = distinctAggGroupMap.groupBy(_._2).map{ kv =>
Review comment:
nit: `mapValues`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685415006
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685692737
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686300116
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686000897
**[Test build #128208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128208/testReport)** for PR 29626 at commit [`81eb9e4`](https://github.com/apache/spark/commit/81eb9e4bc0bad82a749cc3cacc18174b7831eafe).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686965915
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688085433
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128341/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689353930
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r484363415
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -293,12 +297,16 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val operators = expressions.map { e =>
val af = e.aggregateFunction
val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).getOrElse(None)
Review comment:
Done
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r484971289
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -293,12 +295,16 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val operators = expressions.map { e =>
val af = e.aggregateFunction
val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
+ if (distinctAggGroupLookup(e).contains(x)) {
Review comment:
I'm wondering if we can simplify the logic a bit. The goal is to only do the replacement for the first child if all the children are foldable. How about
```
val af = e.aggregateFunction
val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
val naf = if (af.children.forall(_.foldable)) {
val firstChild = evalWithinGroup(id, af.children.head, condition)
af.withNewChildren(firstChild +: af.children.drop(1)).asInstanceOf[AggregateFunction]
} else {
// the original code
}
```
Then we don't need to change a lot of code to create `distinctAggGroupLookup`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686551259
cc @cloud-fan
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686965358
**[Test build #128285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128285/testReport)** for PR 29626 at commit [`8622411`](https://github.com/apache/spark/commit/8622411e78807a3008a44bb6b29ff18faa8293ac).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690177064
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685692746
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128201/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685995731
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688089036
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689924888
retest this please
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689358829
**[Test build #128438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128438/testReport)** for PR 29626 at commit [`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688089036
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690826597
@cloud-fan Thanks
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690003006
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128479/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685457604
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on a change in pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r481900922
##########
File path: sql/core/src/test/resources/sql-tests/results/group-by-filter.sql.out
##########
@@ -150,6 +150,38 @@ struct<count(DISTINCT id) FILTER (WHERE false):bigint>
0
+-- !query
+SELECT COUNT(DISTINCT id), COUNT(DISTINCT 2,3) FILTER (WHERE dept_id = 40) FROM emp
+-- !query schema
+struct<count(DISTINCT id):bigint,count(DISTINCT 2, 3) FILTER (WHERE (dept_id = 40)):bigint>
+-- !query output
+8 0
+
+
+-- !query
+SELECT COUNT(DISTINCT id), COUNT(DISTINCT 3,2) FILTER (WHERE dept_id = 40) FROM emp
+-- !query schema
+struct<count(DISTINCT id):bigint,count(DISTINCT 3, 2) FILTER (WHERE (dept_id = 40)):bigint>
+-- !query output
+8 0
+
+
+-- !query
+SELECT COUNT(DISTINCT id), COUNT(DISTINCT 2,3) FILTER (WHERE dept_id = 30) FROM emp
Review comment:
OK
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689353930
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689925385
**[Test build #128478 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128478/testReport)** for PR 29626 at commit [`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688062450
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690004119
retest this please
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688062450
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687043287
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686319737
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689355143
retest this please
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686485729
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687043287
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690039699
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685459183
**[Test build #128198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128198/testReport)** for PR 29626 at commit [`e15519e`](https://github.com/apache/spark/commit/e15519ed85826e40eb8d36965048d30ce8f8ba4e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `case class JDBCScan(`
* `case class JDBCScanBuilder(`
* `case class JDBCWriteBuilder(schema: StructType, options: JdbcOptionsInWrite) extends V1WriteBuilder`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690177072
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128498/
Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689355568
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690036260
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685426202
**[Test build #128198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128198/testReport)** for PR 29626 at commit [`e15519e`](https://github.com/apache/spark/commit/e15519ed85826e40eb8d36965048d30ce8f8ba4e).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685478258
**[Test build #128197 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128197/testReport)** for PR 29626 at commit [`e15519e`](https://github.com/apache/spark/commit/e15519ed85826e40eb8d36965048d30ce8f8ba4e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `case class JDBCScan(`
* `case class JDBCScanBuilder(`
* `case class JDBCWriteBuilder(schema: StructType, options: JdbcOptionsInWrite) extends V1WriteBuilder`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-687108506
**[Test build #128285 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128285/testReport)** for PR 29626 at commit [`8622411`](https://github.com/apache/spark/commit/8622411e78807a3008a44bb6b29ff18faa8293ac).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685457604
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690002992
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685422998
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686084085
**[Test build #128208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128208/testReport)** for PR 29626 at commit [`81eb9e4`](https://github.com/apache/spark/commit/81eb9e4bc0bad82a749cc3cacc18174b7831eafe).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686300116
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r484357029
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -216,20 +216,24 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val distinctAggs = aggExpressions.filter(_.isDistinct)
// Extract distinct aggregate expressions.
- val distinctAggGroups = aggExpressions.filter(_.isDistinct).groupBy { e =>
- val unfoldableChildren = e.aggregateFunction.children.filter(!_.foldable).toSet
- if (unfoldableChildren.nonEmpty) {
- // Only expand the unfoldable children
- unfoldableChildren
- } else {
- // If aggregateFunction's children are all foldable
- // we must expand at least one of the children (here we take the first child),
- // or If we don't, we will get the wrong result, for example:
- // count(distinct 1) will be explained to count(1) after the rewrite function.
- // Generally, the distinct aggregateFunction should not run
- // foldable TypeCheck for the first child.
- e.aggregateFunction.children.take(1).toSet
- }
+ val distinctAggGroupMap = aggExpressions.filter(_.isDistinct).map { e =>
+ val unfoldableChildren = e.aggregateFunction.children.filter(!_.foldable).toSet
+ if (unfoldableChildren.nonEmpty) {
+ // Only expand the unfoldable children
+ e -> unfoldableChildren
+ } else {
+ // If aggregateFunction's children are all foldable
+ // we must expand at least one of the children (here we take the first child),
+ // or If we don't, we will get the wrong result, for example:
+ // count(distinct 1) will be explained to count(1) after the rewrite function.
+ // Generally, the distinct aggregateFunction should not run
+ // foldable TypeCheck for the first child.
+ e -> e.aggregateFunction.children.take(1).toSet
+ }
+ }
+ val distinctAggGroupLookup = distinctAggGroupMap.toMap
+ val distinctAggGroups = distinctAggGroupMap.groupBy(_._2).map{ kv =>
Review comment:
OK
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685456568
**[Test build #128201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128201/testReport)** for PR 29626 at commit [`9ad3469`](https://github.com/apache/spark/commit/9ad3469343a43afa7b33f997b6cf21523cf0a65e).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688084973
**[Test build #128341 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128341/testReport)** for PR 29626 at commit [`64834f8`](https://github.com/apache/spark/commit/64834f82893957b2a5578f4e46610161350f0743).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689941100
**[Test build #128479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128479/testReport)** for PR 29626 at commit [`16588f5`](https://github.com/apache/spark/commit/16588f56d1d50e73908d5431062012a8ed20d922).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r484184462
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -293,12 +297,16 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
val operators = expressions.map { e =>
val af = e.aggregateFunction
val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).getOrElse(None)
Review comment:
nit: is `.getOrElse(None)` needed?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r481897176
##########
File path: sql/core/src/test/resources/sql-tests/results/group-by-filter.sql.out
##########
@@ -150,6 +150,38 @@ struct<count(DISTINCT id) FILTER (WHERE false):bigint>
0
+-- !query
+SELECT COUNT(DISTINCT id), COUNT(DISTINCT 2,3) FILTER (WHERE dept_id = 40) FROM emp
+-- !query schema
+struct<count(DISTINCT id):bigint,count(DISTINCT 2, 3) FILTER (WHERE (dept_id = 40)):bigint>
+-- !query output
+8 0
+
+
+-- !query
+SELECT COUNT(DISTINCT id), COUNT(DISTINCT 3,2) FILTER (WHERE dept_id = 40) FROM emp
+-- !query schema
+struct<count(DISTINCT id):bigint,count(DISTINCT 3, 2) FILTER (WHERE (dept_id = 40)):bigint>
+-- !query output
+8 0
+
+
+-- !query
+SELECT COUNT(DISTINCT id), COUNT(DISTINCT 2,3) FILTER (WHERE dept_id = 30) FROM emp
Review comment:
can we test `WHERE dept_id > 0`? to make sure that we get correct result even if there are more than one inputs after filter.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689495514
**[Test build #128438 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128438/testReport)** for PR 29626 at commit [`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685422205
**[Test build #128197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128197/testReport)** for PR 29626 at commit [`e15519e`](https://github.com/apache/spark/commit/e15519ed85826e40eb8d36965048d30ce8f8ba4e).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685692167
**[Test build #128201 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128201/testReport)** for PR 29626 at commit [`9ad3469`](https://github.com/apache/spark/commit/9ad3469343a43afa7b33f997b6cf21523cf0a65e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690035403
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690035170
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685459438
Merged build finished. Test FAILed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689925681
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688088063
**[Test build #128345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128345/testReport)** for PR 29626 at commit [`fa7d578`](https://github.com/apache/spark/commit/fa7d578dfbedae50dd3bd794a1c94909d455e2a4).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685427053
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686965358
**[Test build #128285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128285/testReport)** for PR 29626 at commit [`8622411`](https://github.com/apache/spark/commit/8622411e78807a3008a44bb6b29ff18faa8293ac).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan closed pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #29626:
URL: https://github.com/apache/spark/pull/29626
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-686315855
retest this please
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689266971
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-690042647
**[Test build #128498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128498/testReport)** for PR 29626 at commit [`16588f5`](https://github.com/apache/spark/commit/16588f56d1d50e73908d5431062012a8ed20d922).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685423445
**[Test build #128196 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128196/testReport)** for PR 29626 at commit [`0f355ad`](https://github.com/apache/spark/commit/0f355ad454d0cf6ef6f1213400b7028244d60d95).
* This patch **fails MiMa tests**.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-688271138
**[Test build #128345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128345/testReport)** for PR 29626 at commit [`fa7d578`](https://github.com/apache/spark/commit/fa7d578dfbedae50dd3bd794a1c94909d455e2a4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] beliefer commented on a change in pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
beliefer commented on a change in pull request #29626:
URL: https://github.com/apache/spark/pull/29626#discussion_r486028869
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##########
@@ -292,13 +292,14 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
// Final aggregate
val operators = expressions.map { e =>
val af = e.aggregateFunction
- val naf = patchAggregateFunctionChildren(af) { x =>
- val condition = if (e.filter.isDefined) {
- e.filter.map(distinctAggFilterAttrLookup.get(_)).get
- } else {
- None
+ val condition = e.filter.map(distinctAggFilterAttrLookup.get(_)).flatten
+ val naf = if (af.children.forall(_.foldable)) {
+ val firstChild = evalWithinGroup(id, af.children.head, condition)
Review comment:
OK
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689925681
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29626: [WIP][SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-685423498
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org