You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/06 10:45:06 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #3727: Consolidate and better tests for expression re-rewriting / aliasing

alamb commented on code in PR #3727:
URL: https://github.com/apache/arrow-datafusion/pull/3727#discussion_r988878264


##########
datafusion/optimizer/src/utils.rs:
##########
@@ -315,13 +316,63 @@ pub fn alias_cols(cols: &[Column]) -> Vec<Expr> {
         .collect()
 }
 
+/// Rewrites `expr` using `rewriter`, ensuring that the output has the
+/// same name as `expr` prior to rewrite, adding an alias if necessary.
+///
+/// This is important when optimzing plans to ensure the the output
+/// schema of plan nodes don't change after optimization
+pub fn rewrite_preserving_name<R>(expr: Expr, rewriter: &mut R) -> Result<Expr>
+where
+    R: ExprRewriter<Expr>,
+{
+    let original_name = name_for_alias(&expr)?;
+    let expr = expr.rewrite(rewriter)?;
+    add_alias_if_changed(original_name, expr)
+}
+
+/// Return the name to use for the specific Expr, recursing into
+/// `Expr::Sort` as appropriate
+fn name_for_alias(expr: &Expr) -> Result<String> {
+    match expr {
+        Expr::Sort { expr, .. } => name_for_alias(expr),

Review Comment:
   Basically because calling `Expr::name()` on a `Expr::Sort` will throw an error:
   
   https://github.com/apache/arrow-datafusion/blob/7c5c2e5/datafusion/expr/src/expr.rs#L1133-L1135
   
   I am not super thrilled in general about how this works -- I wonder if I should support calling `Expr::name()` on Expr::Sort` 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org