Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/03/29 19:35:14 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5772: Change required input ordering API to allow any NULLS FIRST / LAST and ASC / DESC

alamb commented on code in PR #5772:
URL: https://github.com/apache/arrow-datafusion/pull/5772#discussion_r1152395346


##########
datafusion/core/src/physical_plan/windows/mod.rs:
##########
@@ -187,6 +188,30 @@ fn create_built_in_window_expr(
     })
 }
 
+pub(crate) fn calc_requirements(
+    partition_by_exprs: &[Arc<dyn PhysicalExpr>],
+    orderby_sort_exprs: &[PhysicalSortExpr],
+) -> Option<Vec<PhysicalSortRequirement>> {
+    let mut sort_reqs = vec![];
+    for partition_by in partition_by_exprs {
+        sort_reqs.push(PhysicalSortRequirement {
+            expr: partition_by.clone(),
+            options: None,

Review Comment:
   Is it correct that being able to express `options: None` for partitioning expressions is the key rationale for this PR (that is, the thing we couldn't do before)?
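
   To make the question concrete, here is a minimal sketch of what `options: None` would mean when a requirement is checked against an existing ordering. The types below are simplified stand-ins, not DataFusion's real ones: `expr` is a plain `String` instead of `Arc<dyn PhysicalExpr>`, `SortOptions` only mirrors the field names of Arrow's struct, and `compatible` is a hypothetical helper that does not appear in the PR.

```rust
/// Simplified stand-in for Arrow's `SortOptions` (same field names only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SortOptions {
    descending: bool,
    nulls_first: bool,
}

/// Toy version of a concrete ordering: the direction is always known.
struct SortExpr {
    expr: String,
    options: SortOptions,
}

/// Toy version of a requirement: `options: None` means "any direction
/// and any null placement is acceptable".
struct SortRequirement {
    expr: String,
    options: Option<SortOptions>,
}

/// Hypothetical helper: does `given` satisfy `req` for a single expression?
fn compatible(req: &SortRequirement, given: &SortExpr) -> bool {
    // The expressions must match; the options must match only when the
    // requirement actually specifies them.
    req.expr == given.expr && req.options.map_or(true, |o| o == given.options)
}
```

   Under these assumptions, a requirement on `a` with `options: None` accepts an input sorted by `a` in either direction, while a requirement with `Some(..)` accepts only the exact options. That is the flexibility a `PARTITION BY` column appears to need.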



##########
datafusion/core/src/physical_plan/windows/mod.rs:
##########
@@ -187,6 +188,30 @@ fn create_built_in_window_expr(
     })
 }
 
+pub(crate) fn calc_requirements(
+    partition_by_exprs: &[Arc<dyn PhysicalExpr>],
+    orderby_sort_exprs: &[PhysicalSortExpr],
+) -> Option<Vec<PhysicalSortRequirement>> {
+    let mut sort_reqs = vec![];
+    for partition_by in partition_by_exprs {
+        sort_reqs.push(PhysicalSortRequirement {
+            expr: partition_by.clone(),
+            options: None,
+        });
+    }
+    for PhysicalSortExpr { expr, options } in orderby_sort_exprs {
+        let contains = sort_reqs.iter().any(|e| expr.eq(&e.expr));
+        if !contains {
+            sort_reqs.push(PhysicalSortRequirement {
+                expr: expr.clone(),
+                options: Some(*options),
+            });
+        }
+    }
+    // Convert empty result to None. Otherwise wrap result inside Some()
+    (!sort_reqs.is_empty()).then_some(sort_reqs)
+}
+

Review Comment:
   I think computing `ROW_NUMBER()` over `PARTITION BY a ORDER BY a DESC` will result in an arbitrary assignment of row numbers (as @ozankabak and @mustafasrepo say above, the value of `a` is the same within each partition).
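
   A small sketch of why the numbering is arbitrary, using a toy in-memory model rather than DataFusion itself. The function `row_numbers` and the `(a, tag)` row shape are made up for illustration. Because every row in a partition has the same `a`, sorting by `a` cannot break ties, so the numbers simply follow whatever order the rows arrived in.

```rust
/// Toy model of `ROW_NUMBER() OVER (PARTITION BY a ORDER BY a DESC)`
/// over rows of the form `(a, tag)`.
fn row_numbers(rows: &[(i32, &'static str)]) -> Vec<(i32, &'static str, usize)> {
    let mut sorted = rows.to_vec();
    // Stable sort by `a` descending: rows with equal `a` keep their input order.
    sorted.sort_by(|l, r| r.0.cmp(&l.0));

    let mut out = Vec::with_capacity(sorted.len());
    let mut current: Option<i32> = None;
    let mut n = 0;
    for (a, tag) in sorted {
        // Restart the counter whenever a new partition begins.
        if current != Some(a) {
            current = Some(a);
            n = 0;
        }
        n += 1;
        out.push((a, tag, n));
    }
    out
}
```

   In this model, feeding `[(1, "x"), (1, "y")]` numbers `"x"` first, while feeding `[(1, "y"), (1, "x")]` numbers `"y"` first, even though both inputs hold the same set of rows.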
   
   
   



##########
datafusion/physical-expr/src/utils.rs:
##########
@@ -224,31 +245,102 @@ pub fn ordering_satisfy_concrete<F: FnOnce() -> EquivalenceProperties>(
     } else if required
         .iter()
         .zip(provided.iter())
-        .all(|(order1, order2)| order1.eq(order2))
+        .all(|(req, given)| req.eq(given))
     {
         true
     } else if let eq_classes @ [_, ..] = equal_properties().classes() {
-        let normalized_required_exprs = required
+        required
             .iter()
             .map(|e| {
                 normalize_sort_expr_with_equivalence_properties(e.clone(), eq_classes)
             })
-            .collect::<Vec<_>>();
-        let normalized_provided_exprs = provided
+            .zip(provided.iter().map(|e| {
+                normalize_sort_expr_with_equivalence_properties(e.clone(), eq_classes)
+            }))
+            .all(|(req, given)| req.eq(&given))
+    } else {
+        false
+    }
+}
+
+/// Checks whether the given [`PhysicalSortRequirement`]s are satisfied by the
+/// provided [`PhysicalSortExpr`]s.
+pub fn ordering_satisfy_requirement<F: FnOnce() -> EquivalenceProperties>(
+    provided: Option<&[PhysicalSortExpr]>,
+    required: Option<&[PhysicalSortRequirement]>,
+    equal_properties: F,
+) -> bool {
+    match (provided, required) {
+        (_, None) => true,
+        (None, Some(_)) => false,
+        (Some(provided), Some(required)) => {
+            ordering_satisfy_requirement_concrete(provided, required, equal_properties)
+        }
+    }
+}
+
+/// Checks whether the given [`PhysicalSortRequirement`]s are satisfied by the
+/// provided [`PhysicalSortExpr`]s.
+pub fn ordering_satisfy_requirement_concrete<F: FnOnce() -> EquivalenceProperties>(

Review Comment:
   Is this meant to be `pub`? It seems like callers should always use `ordering_satisfy_requirement`, right?
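
   For illustration, the split being suggested could look roughly like the sketch below. It is a toy version under stated assumptions: column names are plain `&str` values, the equivalence-properties argument is omitted, and the prefix rule in the private helper is a simplification rather than the logic in this PR. Only the `Option` handling mirrors the `match` shown in the diff.

```rust
mod ordering {
    /// Public entry point: handles the "no requirement" and "no ordering"
    /// cases, then delegates to the private helper.
    pub fn satisfies(provided: Option<&[&str]>, required: Option<&[&str]>) -> bool {
        match (provided, required) {
            // Nothing is required, so any input is acceptable.
            (_, None) => true,
            // Something is required but the input has no known ordering.
            (None, Some(_)) => false,
            (Some(p), Some(r)) => satisfies_concrete(p, r),
        }
    }

    /// Private helper (not `pub`): callers outside this module cannot skip
    /// the `Option` handling above.
    fn satisfies_concrete(provided: &[&str], required: &[&str]) -> bool {
        // Simplifying assumption: the requirement must be a prefix of the
        // provided ordering.
        required.len() <= provided.len()
            && required.iter().zip(provided).all(|(req, given)| req == given)
    }
}
```

   With this layout, code outside the module can call `ordering::satisfies(..)` but gets a compile error if it tries to call `ordering::satisfies_concrete(..)` directly.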


