You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/12/01 08:50:17 UTC

[GitHub] [spark] zhengruifeng commented on pull request #34504: [SPARK-37226][SQL] Filter push down through window if partitionSpec isEmpty

zhengruifeng commented on pull request #34504:
URL: https://github.com/apache/spark/pull/34504#issuecomment-983419947


   > > @zhengruifeng can you highlight the differences between your PR and this one?
   > 
   > IMHO, there are two main differences:
   > 
   > 1, a new node `RankLimit` is introduced, and it supports both the empty partitionSpec cases and non-empty partitionSpec cases. It could support `rank` and `dense_rank` as the rank function in the future;
   > 
   > 2, Normally,`TakeOrderedAndProjectExec` performs the top-K filtering in both mappers and the reducers, while `RankLimitExec` only filters rows in mappers.
   
   update on https://github.com/apache/spark/pull/34367/commits/877558e439663d1028028e9a332a5e4e6a18ad6c
   
   1, `RankLimit` now supports row_number/rank/dense_rank, empty and non-empty partitionSpec;
   2, two `RankLimitExec` nodes are inserted now, one on the map side and one on the reduce side; if there is no shuffle between the two `RankLimitExec` nodes, the filtering in the second node will be skiped;


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org