You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/23 11:41:42 UTC

[GitHub] [spark] deshanxiao opened a new pull request, #37628: [SPARK-40192][MINOR] Remove redundant groupby

deshanxiao opened a new pull request, #37628:
URL: https://github.com/apache/spark/pull/37628

   ### What changes were proposed in this pull request?
   Remove redundant groupby invoking in code.
   
   ### Why are the changes needed?
   For Code optimization.
   `Dataset.agg()` has invoked the function `groupBy()`. We don't need to call `groupBy` again before executing `agg()`.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Existing UT
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] deshanxiao commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby

Posted by GitBox <gi...@apache.org>.
deshanxiao commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1225075925

   cc @zhengruifeng @LuciferYang @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1227285291

   Merged to master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1225121461

   LGTM for the ML side
   
   But for the SQL side, all changes are in test suites, I am not sure whether those `groupby` were added intendedly


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen closed pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby

Posted by GitBox <gi...@apache.org>.
srowen closed pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby
URL: https://github.com/apache/spark/pull/37628


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #37628: [SPARK-40192][MINOR] Remove redundant groupby

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1223968260

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1225323986

   > LGTM for the ML side
   > 
   > But for the SQL side, all changes are in test suites, I am not sure whether those `groupby` were added intendedly
   
   +1, Agree. Need to be check the `groupby` is not intentionally added
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #37628: [SPARK-40192][MINOR] Remove redundant groupby

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1224260354

   The pr title should be `[SPARK-40192][SQL][ML] Remove redundant groupby`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #37628: [SPARK-40192][SQL][ML] Remove redundant groupby

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37628:
URL: https://github.com/apache/spark/pull/37628#issuecomment-1225725832

   There is no functional difference, as .agg() is the same as .groupBy().agg(). I'm OK with it but it doesn't optimize anything


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org