You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/23 11:41:42 UTC
[GitHub] [spark] deshanxiao opened a new pull request, #37628: [SPARK-40192][MINOR] Remove redundant groupby
deshanxiao opened a new pull request, #37628:
URL: https://github.com/apache/spark/pull/37628
### What changes were proposed in this pull request?
Remove redundant groupby invoking in code.
### Why are the changes needed?
For Code optimization.
`Dataset.agg()` has invoked the function `groupBy()`. We don't need to call `groupBy` again before executing `agg()`.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing UT
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] deshanxiao commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby
Posted by GitBox <gi...@apache.org>.
deshanxiao commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1225075925
cc @zhengruifeng @LuciferYang @cloud-fan
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby
Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1227285291
Merged to master
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] zhengruifeng commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby
Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1225121461
LGTM for the ML side
But for the SQL side, all changes are in test suites, I am not sure whether those `groupby` were added intendedly
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen closed pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby
Posted by GitBox <gi...@apache.org>.
srowen closed pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby
URL: https://github.com/apache/spark/pull/37628
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #37628: [SPARK-40192][MINOR] Remove redundant groupby
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1223968260
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] LuciferYang commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby
Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1225323986
> LGTM for the ML side
>
> But for the SQL side, all changes are in test suites, I am not sure whether those `groupby` were added intendedly
+1, Agree. Need to be check the `groupby` is not intentionally added
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] LuciferYang commented on pull request #37628: [SPARK-40192][MINOR] Remove redundant groupby
Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1224260354
The pr title should be `[SPARK-40192][SQL][ML] Remove redundant groupby`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] srowen commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby
Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1225725832
There is no functional difference, as .agg() is the same as .groupBy().agg(). I'm OK with it but it doesn't optimize anything
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org