Posted to commits@doris.apache.org by "Reminiscent (via GitHub)" <gi...@apache.org> on 2023/04/13 10:32:40 UTC

[GitHub] [doris] Reminiscent opened a new issue, #18646: [Enhancement] filter on row_number() can be pushed down below the window

Reminiscent opened a new issue, #18646:
URL: https://github.com/apache/doris/issues/18646

   ### Search before asking
   
   - [X] I have searched the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Description
   
   A common use of the window function `row_number()` is to fetch only the first N rows of each partition, so users add a filter on the result of `row_number()`. For example: `select * from (select *, row_number() over(partition by a order by b desc) as row_num from t) v where row_num <= 100;` The filter `row_num <= 100` can be pushed down below the window operator, so that the window function processes no more than 100 rows per partition.
   
   ### Solution
   
   The details of the design will be added in the PR. The key points are:
   1. This is a rule-based rewrite, so the rule will be added to the rule-based optimization phase.
   2. [Pattern Match] I will start with the simple case and expand the scope based on user requirements.
       * For the window function, `row_number()` will be supported first. Other window functions such as `rank()` and `dense_rank()`, which may also work with this rule, will be considered in the future.
       * The filter conditions should be simple at first, such as `row_number < / <= / = / >= / > int constant`. More complex conditions will be considered in the future.
       * The window definition should be simple, for example containing only `partition by` and `order by`.
       * [May need more considerations] The child node of the window should be simple. For example, only a scan node is supported as the child of the window.
   3. [Rewrite Result] After the rewrite, the plan changes from `filter -> window -> sort -> scan` to `filter -> window -> topn -> scan`.
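
The rewrite above can be illustrated with a small Python sketch. It is not Doris code: the function names, the `(a, b)` row shape, and the bounded-heap pruning are assumptions made for illustration only. It compares numbering every row and then filtering against pruning each partition to at most `limit` rows before numbering, for the `row_num <= limit` case.

```python
import heapq
from collections import defaultdict

def filter_after_window(rows, limit):
    """Baseline plan: number every row of each partition, then filter."""
    parts = defaultdict(list)
    for a, b in rows:
        parts[a].append(b)
    out = []
    for a in sorted(parts):
        ordered = sorted(parts[a], reverse=True)  # order by b desc
        for row_num, b in enumerate(ordered, start=1):
            if row_num <= limit:  # filter applied above the window
                out.append((a, b, row_num))
    return out

def window_after_partition_topn(rows, limit):
    """Rewritten plan (hypothetical): keep at most `limit` rows per
    partition while scanning, then number only the surviving rows."""
    if limit <= 0:
        return []
    heaps = defaultdict(list)  # one bounded min-heap per partition
    for a, b in rows:
        heap = heaps[a]
        if len(heap) < limit:
            heapq.heappush(heap, b)
        elif b > heap[0]:
            heapq.heapreplace(heap, b)  # evict the smallest kept value
    out = []
    for a in sorted(heaps):
        ordered = sorted(heaps[a], reverse=True)
        for row_num, b in enumerate(ordered, start=1):
            out.append((a, b, row_num))  # the original filter is now always true
    return out

rows = [("x", 5), ("x", 9), ("x", 1), ("y", 3), ("y", 3), ("x", 7)]
print(filter_after_window(rows, 2) == window_after_partition_topn(rows, 2))
```

In this sketch the pruning is applied per partition rather than over the whole input, which is what keeps the two plans equivalent.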
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] xzj7019 commented on issue #18646: [Enhancement] filter on row_number() can be pushed down below the window

Posted by "xzj7019 (via GitHub)" <gi...@apache.org>.
xzj7019 commented on issue #18646:
URL: https://github.com/apache/doris/issues/18646#issuecomment-1512366225

   1. The `>` and `>=` comparisons are not candidates for this optimization.
   2. If the `partition by ... order by` pattern is to be handled, the window type needs to be specified as UNBOUNDED to CURRENT.
   3. It would be better to bring other similar window functions into scope: `count`, `percent_rank`, and `cume_dist`, which align with partition topn semantics, and `rank` and `dense_rank`, which align with partition topn-with-ties semantics.
   4. On the FE side, it seems necessary to put the filter expression into the window function descriptor and leave the rest to the BE.
   5. `order by rk limit n` seems similar to a filter condition on `rk`, but with lower priority.
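
The difference between plain topn and topn-with-ties mentioned in point 3 can be sketched in Python for a single partition. This is an illustration under assumptions, not Doris code: both function names are hypothetical, and the input is a list of `order by` key values sorted in descending order.

```python
def keep_for_row_number(values, n):
    """Rows that satisfy row_number() <= n: exactly the first n rows."""
    return sorted(values, reverse=True)[:n]

def keep_for_rank(values, n):
    """Rows that satisfy rank() <= n: rows tied with a kept row must
    also be kept, so more than n rows can survive."""
    ordered = sorted(values, reverse=True)
    kept = []
    rank = 0
    for position, value in enumerate(ordered, start=1):
        # rank() only advances when the ordering key changes
        if position == 1 or value != ordered[position - 2]:
            rank = position
        if rank > n:
            break
        kept.append(value)
    return kept

values = [10, 9, 9, 9, 5]
print(keep_for_row_number(values, 2))
print(keep_for_rank(values, 2))
```

Under these assumptions, a pruning step sized for `row_number()` would drop rows that `rank()` still needs, which is why the two groups of functions are treated separately.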




[GitHub] [doris] Reminiscent commented on issue #18646: [Enhancement] filter on row_number() can be pushed down below the window

Posted by "Reminiscent (via GitHub)" <gi...@apache.org>.
Reminiscent commented on issue #18646:
URL: https://github.com/apache/doris/issues/18646#issuecomment-1506732910

   /assign

