Posted to github@arrow.apache.org by "jaylmiller (via GitHub)" <gi...@apache.org> on 2023/02/23 20:55:12 UTC

[GitHub] [arrow-datafusion] jaylmiller opened a new issue, #5378: UDF with zero params broken (doesn't receive null array as input)

jaylmiller opened a new issue, #5378:
URL: https://github.com/apache/arrow-datafusion/issues/5378

   **Describe the bug**
   For a UDF with zero args, the docs say that the implementation function receives a single input array consisting of null values, so that the implementation knows how long the output array needs to be. In practice, the implementation does not receive that null array.
   
   **To Reproduce**
   ```rust
   use std::sync::Arc;
   
   use arrow::{
       array::{ArrayRef, UInt64Array},
       record_batch::RecordBatch,
   };
   use arrow_schema::DataType;
   use datafusion::{
       datasource::MemTable, error::Result, physical_plan::functions::make_scalar_function,
       prelude::SessionContext,
   };
   use datafusion_expr::{create_udf, Volatility};
   
   #[tokio::main]
   async fn main() -> Result<()> {
       let udf_impl = |args: &[ArrayRef]| {
           // according to datafusion docs, this should have a single arg
           // consisting of a null array
           assert_eq!(args.len(), 1);
           // use len of input array to know how big the output array needs to be
           let size = args[0].len();
           let output = (0..size).map(|_| 1).collect::<UInt64Array>();
           Ok(Arc::new(output) as ArrayRef)
       };
   
       let udf = create_udf(
           "no_args",
           vec![],
           Arc::new(DataType::UInt64),
           Volatility::Immutable,
           make_scalar_function(udf_impl),
       );
       let ctx = SessionContext::new();
       ctx.register_udf(udf);
       let batch = RecordBatch::try_from_iter(vec![(
           "col",
           Arc::new((0..5).collect::<UInt64Array>()) as ArrayRef,
       )])?;
       let table = MemTable::try_new(batch.schema(), vec![vec![batch]])?;
       ctx.register_table("test", Arc::new(table))?;
       ctx.sql("select no_args() from test").await?;
       // `select no_args()` without a FROM clause behaves the same way
       Ok(())
   }
   ```
   
   **Expected behavior**
   What the docs say the behavior will be:
   > ...with the exception of zero param function, where a singular element vec
   > will be passed. In that case the single element is a null array to indicate
   > the batch's row count (so that the generative zero-argument function can know
   > the result array size).
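   
   To make the documented contract concrete, here is a minimal std-only sketch of it. It is not DataFusion code: `Column`, `null_placeholder`, and `invoke_zero_arg` are hypothetical stand-ins (a `Vec<Option<u64>>` plays the role of an Arrow array), and the only thing taken from the docs is the idea that a zero-parameter function still gets one all-null array whose length is the batch's row count.
   
   ```rust
   /// Hypothetical stand-in for an Arrow array: one optional value per row.
   type Column = Vec<Option<u64>>;
   
   /// Builds the all-null placeholder that, per the quoted docs, carries
   /// only the batch's row count.
   fn null_placeholder(num_rows: usize) -> Column {
       vec![None; num_rows]
   }
   
   /// A zero-argument function that emits the constant 1 for every row.
   /// It reads the row count from the single placeholder column.
   fn no_args(args: &[Column]) -> Result<Column, String> {
       match args {
           // Exactly one placeholder column: use its length as the output size.
           [placeholder] => Ok(vec![Some(1); placeholder.len()]),
           // Anything else violates the documented contract.
           _ => Err(format!("expected 1 placeholder column, got {}", args.len())),
       }
   }
   
   /// Hypothetical caller side: passes the placeholder to a zero-parameter
   /// function instead of an empty argument list.
   fn invoke_zero_arg(
       f: impl Fn(&[Column]) -> Result<Column, String>,
       num_rows: usize,
   ) -> Result<Column, String> {
       f(&[null_placeholder(num_rows)])
   }
   ```
   
   In this model, `invoke_zero_arg(no_args, 5)` yields five rows, while calling `no_args(&[])` directly returns an error, which mirrors the failing `assert_eq!(args.len(), 1)` in the reproduction above.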
   
   **Additional context**
   I can look into fixing this one.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] avantgardnerio closed issue #5378: UDF with zero params broken (doesn't receive null array as input)

Posted by "avantgardnerio (via GitHub)" <gi...@apache.org>.
avantgardnerio closed issue #5378: UDF with zero params broken (doesn't receive null array as input)
URL: https://github.com/apache/arrow-datafusion/issues/5378

