You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/31 21:27:35 UTC

[GitHub] [arrow] drusso commented on a change in pull request #9364: ARROW-11431: [Rust][DataFusion] Support the HAVING clause.

drusso commented on a change in pull request #9364:
URL: https://github.com/apache/arrow/pull/9364#discussion_r567486688



##########
File path: rust/datafusion/src/sql/planner.rs
##########
@@ -421,16 +415,85 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
         // The SELECT expressions, with wildcards expanded.
         let select_exprs = self.prepare_select_exprs(&plan, &select.projection)?;
 
+        // Optionally the HAVING expression.
+        let having_expr_opt = if let Some(having_expr) = &select.having {
+            let having_expr = self.sql_expr_to_logical_expr(having_expr)?;
+
+            // This step "dereferences" any aliases in the HAVING clause.
+            //
+            // This is how we support queries with HAVING expressions that
+            // refer to aliased columns.
+            //
+            // For example:
+            //
+            //   SELECT c1 AS m FROM t HAVING m > 10;
+            //   SELECT c1, MAX(c2) AS m FROM t GROUP BY c1 HAVING m > 10;
+            //
+            // are rewritten as, respectively:
+            //
+            //   SELECT c1 AS m FROM t HAVING c1 > 10;
+            //   SELECT c1, MAX(c2) AS m FROM t GROUP BY c1 HAVING MAX(c2) > 10;
+            //
+            let having_expr =
+                resolve_aliases_to_exprs(&having_expr, &extract_aliases(&select_exprs))?;
+
+            Some(having_expr)
+        } else {
+            None
+        };
+
+        // The outer expressions we will search through for
+        // aggregates. Aggregates may be sourced from the SELECT...
+        let mut aggr_expr_haystack = select_exprs.clone();
+
+        // ... or from the HAVING.
+        if let Some(having_expr) = &having_expr_opt {
+            aggr_expr_haystack.push(having_expr.clone());
+        }
+
         // All of the aggregate expressions (deduplicated).
-        let aggr_exprs = find_aggregate_exprs(&select_exprs);
+        let aggr_exprs = find_aggregate_exprs(&aggr_expr_haystack);
 
-        let (plan, select_exprs_post_aggr) =
+        let (plan, select_exprs_post_aggr, having_expr_post_aggr_opt) =
             if !select.group_by.is_empty() || !aggr_exprs.is_empty() {
-                self.aggregate(&plan, &select_exprs, &select.group_by, &aggr_exprs)?
+                self.aggregate(
+                    &plan,
+                    &select_exprs,
+                    &having_expr_opt,
+                    &select.group_by,
+                    &aggr_exprs,
+                )?
             } else {
-                (plan, select_exprs)
+                if let Some(having_expr) = &having_expr_opt {
+                    let available_columns = select_exprs
+                        .iter()
+                        .map(|expr| expr_as_column_expr(expr, &plan))
+                        .collect::<Result<Vec<Expr>>>()?;
+
+                    // Ensure the HAVING expression is using only columns
+                    // provided by the SELECT.
+                    if !can_columns_satisfy_exprs(
+                        &available_columns,
+                        &vec![having_expr.clone()],

Review comment:
       I didn't end up making this change since the callee is expecting a `&Vec`. I gather we might generally prefer slices rather than vectors (please correct me if I am mistaken!), and since there's a handful of functions in ./utils.rs that can be updated, I can follow up with another PR to address them all in one shot. 
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org