Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/01 09:24:11 UTC

[GitHub] [arrow-datafusion] ming535 commented on a diff in pull request #2638: Fix limit pushdown

ming535 commented on code in PR #2638:
URL: https://github.com/apache/arrow-datafusion/pull/2638#discussion_r886586708


##########
datafusion/core/src/optimizer/limit_push_down.rs:
##########
@@ -40,29 +40,67 @@ impl LimitPushDown {
     }
 }
 
+/// Ancestor indicates the current ancestor in the LogicalPlan tree
+/// when traversing down related to "limit push down".
+enum Ancestor {
+    /// Limit
+    FromLimit,
+    /// Offset
+    FromOffset,
+    /// Other nodes that don't affect the adjustment of "Limit"
+    NotRelevant,
+}
+
+///
+/// When doing limit push down with "offset" and "limit" during traversal,
+/// the "limit" should be adjusted.
+/// limit_push_down is a recursive function that tracks three important information
+/// to make the adjustment.
+///
+/// 1. ancestor: the kind of Ancestor.
+/// 2. ancestor_offset: ancestor's offset value
+/// 3. ancestor_limit: ancestor's limit value
+///
+/// (ancestor_offset, ancestor_limit) is updated in the following cases
+/// 1. Ancestor_Limit(n1) -> .. -> Current_Limit(n2)
+///    When the ancestor is a "Limit" and the current node is a "Limit",
+///    it is updated to (None, min(n1, n2))).
+/// 2. Ancestor_Limit(n1) -> .. -> Current_Offset(m1)
+///    it is updated to (m1, n1 + m1).
+/// 3. Ancestor_Offset(m1) -> .. -> Current_Offset(m2)

Review Comment:
   > I think that, without a subquery, two `offset`s should not be allowed.
   
   I think the SQL parser should forbid this (two `offset`s without a subquery). As for building the plan tree with the builder API, I think it is fine that it is not aware of this, as long as the semantics look right.
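
To make "the semantics look right" concrete, here is a minimal sketch of the row-level equivalences that the doc comment's cases rely on. It models a plan input as a plain slice and uses two hypothetical helpers, `offset_rows` and `limit_rows`, which are illustration only and not part of DataFusion or this PR. The diff above is cut off before the result of case 3, so the stacked-offset check only states what two nested offsets mean for the rows returned; it makes no claim about what the optimizer code computes.

```rust
/// Hypothetical helper (not DataFusion API): OFFSET m over an in-memory row source.
fn offset_rows(rows: &[i32], m: usize) -> Vec<i32> {
    rows.iter().copied().skip(m).collect()
}

/// Hypothetical helper (not DataFusion API): LIMIT n over an in-memory row source.
fn limit_rows(rows: &[i32], n: usize) -> Vec<i32> {
    rows.iter().copied().take(n).collect()
}

fn main() {
    let rows: Vec<i32> = (0..20).collect();
    let (m1, m2): (usize, usize) = (3, 4);
    let (n1, n2): (usize, usize) = (5, 8);

    // Case 1: Limit(n1) above Limit(n2) returns the same rows as Limit(min(n1, n2)).
    assert_eq!(
        limit_rows(&limit_rows(&rows, n2), n1),
        limit_rows(&rows, n1.min(n2))
    );

    // Case 2: Limit(n1) above Offset(m1). The input below the offset
    // only needs to produce n1 + m1 rows for the result to be unchanged.
    let unpushed = limit_rows(&offset_rows(&rows, m1), n1);
    let pushed = limit_rows(&offset_rows(&limit_rows(&rows, n1 + m1), m1), n1);
    assert_eq!(unpushed, pushed);

    // Two stacked offsets: Offset(m1) above Offset(m2) skips m2 rows first,
    // then m1 more, which matches a single Offset(m1 + m2).
    assert_eq!(
        offset_rows(&offset_rows(&rows, m2), m1),
        offset_rows(&rows, m1 + m2)
    );

    println!("all equivalences hold");
}
```

Under this model a plan built through the builder API with two stacked offsets still has a well-defined meaning, which is why rejecting the pattern can be left to the SQL parser.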



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.