You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "ozankabak (via GitHub)" <gi...@apache.org> on 2023/04/07 21:26:51 UTC

[GitHub] [arrow-datafusion] ozankabak commented on a diff in pull request #5908: Count agg support multiple expressions

ozankabak commented on code in PR #5908:
URL: https://github.com/apache/arrow-datafusion/pull/5908#discussion_r1160969184


##########
datafusion/expr/src/type_coercion/aggregates.rs:
##########
@@ -263,6 +263,14 @@ fn check_arg_count(
                 )));
             }
         }
+        TypeSignature::VariadicAny => {
+            if input_types.is_empty() {
+                return Err(DataFusionError::Plan(format!(
+                    "The function {:?} expects at least one arguments",

Review Comment:
   ```suggestion
                       "The function {:?} expects at least one argument",
   ```



##########
datafusion/expr/src/type_coercion/functions.rs:
##########
@@ -78,6 +78,9 @@ fn get_valid_types(
                 .map(|_| current_types[0].clone())
                 .collect()]
         }
+        TypeSignature::VariadicAny => vec![(0..current_types.len())
+            .map(|i| current_types[i].clone())
+            .collect()],

Review Comment:
   Would the more simple `vec![current_types.iter().map(|c| c.clone()).collect()]` work too?



##########
datafusion/physical-expr/src/aggregate/count.rs:
##########
@@ -53,11 +55,43 @@ impl Count {
     ) -> Self {
         Self {
             name: name.into(),
-            expr,
+            exprs: vec![expr],
             data_type,
             nullable: true,
         }
     }
+
+    pub fn new_with_multiple_exprs(
+        exprs: Vec<Arc<dyn PhysicalExpr>>,
+        name: impl Into<String>,
+        data_type: DataType,
+    ) -> Self {
+        Self {
+            name: name.into(),
+            exprs,
+            data_type,
+            nullable: true,
+        }
+    }
+}
+
+/// count null values for multiple columns
+/// for each row if one column value is null, then null_count + 1
+fn null_count_for_multiple_cols(values: &[ArrayRef]) -> usize {

Review Comment:
   I have a feeling this can be done with a simpler logic but we can always refactor later.



##########
datafusion/expr/src/signature.rs:
##########
@@ -46,6 +46,8 @@ pub enum TypeSignature {
     // A function such as `array` is `VariadicEqual`
     // The first argument decides the type used for coercion
     VariadicEqual,
+    /// arbitrary number of arguments of an arbitrary type

Review Comment:
   This means to say all arguments can have different types right? I suggest:
   ```suggestion
       /// arbitrary number of arguments with arbitrary types
   ```



##########
datafusion/core/tests/sql/errors.rs:
##########
@@ -58,7 +58,7 @@ async fn test_aggregation_with_bad_arguments() -> Result<()> {
     assert_eq!(
         err,
         DataFusionError::Plan(
-            "The function Count expects 1 arguments, but 0 were provided".to_string()
+            "The function Count expects at least one arguments".to_string()

Review Comment:
   ```suggestion
               "The function Count expects at least one argument".to_string()
   ```



##########
datafusion/expr/src/signature.rs:
##########
@@ -89,6 +91,13 @@ impl Signature {
             volatility,
         }
     }
+    /// variadic_any - Creates a variadic signature that represents an arbitrary number of arguments of the any type.

Review Comment:
   ```suggestion
       /// variadic_any - Creates a variadic signature that represents an arbitrary number of arguments of any type.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org