You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "ygf11 (via GitHub)" <gi...@apache.org> on 2023/02/22 08:26:53 UTC

[GitHub] [arrow-datafusion] ygf11 commented on a diff in pull request #5345: Refactor DecorrelateWhereExists and add back Distinct if needs

ygf11 commented on code in PR #5345:
URL: https://github.com/apache/arrow-datafusion/pull/5345#discussion_r1113983497


##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1773,6 +1773,22 @@ impl Aggregate {
             _ => plan_err!("Could not coerce into Aggregate!"),
         }
     }
+
+    pub fn is_distinct(&self) -> datafusion_common::Result<bool> {
+        let group_expr_size = self.group_expr.len();
+        if !self.aggr_expr.is_empty() || group_expr_size != self.schema.fields().len() {
+            return Ok(false);
+        }
+
+        let expected_group_exprs = expand_wildcard(&self.schema, self.input.as_ref())?;
+        let expected_expr_set = expected_group_exprs.iter().collect::<HashSet<&Expr>>();
+        let group_expr_set = self.group_expr.iter().collect::<HashSet<&Expr>>();
+        Ok(group_expr_set
+            .intersection(&expected_expr_set)
+            .collect::<HashSet<_>>()
+            .len()
+            == group_expr_size)
+    }

Review Comment:
   A bool flag may be better.
   
   ```rust
   pub strct Aggregate {
       /// The incoming logical plan
       pub input: Arc<LogicalPlan>,
       /// Grouping expressions
       pub group_expr: Vec<Expr>,
       ...
       is_distinct: bool,
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org