Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/06/01 17:35:38 UTC

[GitHub] [arrow-rs] alamb commented on issue #1047: Add `Scalar` / `Datum` support to compute kernels

alamb commented on issue #1047:
URL: https://github.com/apache/arrow-rs/issues/1047#issuecomment-1572509590

   > Taking a step back I had a potentially controversial thought, why not just treat a single element array as a scalar array?
   
   For what it is worth, I think this is what DuckDB does (at least this is how I interpret this slide from the [22 - DuckDB Internals (CMU Advanced Databases / Spring 2023)](https://www.youtube.com/watch?v=bZOvAKGkzpQ) lecture):
   
   
   <img width="589" alt="Screenshot 2023-06-01 at 1 30 17 PM" src="https://github.com/apache/arrow-rs/assets/490673/575dbc83-f32c-4d23-9b85-afea579a2576">
   
   I think the biggest potential downside of the "use a single row to mean a scalar value" approach is that it could be confusing as an API. The two input array sizes have to match *except* when one of them (would it always be the right argument?) has a single row, in which case the kernel would take a special fast path 🤔 But it wouldn't be clear from the call signature whether there is a special fast path or not.
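   
   To make the concern concrete, here is a minimal sketch of such a kernel over plain slices. `add_implicit` is a hypothetical name, not an arrow-rs API, and `i64` slices stand in for arrays. Nothing in the signature tells the caller which path will run:
   
   ```rust
   // Hypothetical kernel (not an arrow-rs API): a length-1 input on either
   // side is silently treated as a scalar and broadcast over the other input.
   fn add_implicit(left: &[i64], right: &[i64]) -> Result<Vec<i64>, String> {
       match (left.len(), right.len()) {
           // Ordinary element-wise path: the lengths match.
           (l, r) if l == r => Ok(left.iter().zip(right).map(|(a, b)| a + b).collect()),
           // "Scalar" fast paths, selected only by inspecting lengths at runtime.
           (_, 1) => Ok(left.iter().map(|a| a + right[0]).collect()),
           (1, _) => Ok(right.iter().map(|b| left[0] + b).collect()),
           // Any other combination is a length mismatch.
           (l, r) => Err(format!("length mismatch: {} vs {}", l, r)),
       }
   }
   ```
   
   With this sketch, `add_implicit(&[1, 2, 3], &[10])` broadcasts while `add_implicit(&[1, 2, 3], &[10, 20])` is an error, and a one-row array that just happens to have length 1 cannot be told apart from an intended scalar.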
   
   
   
   ```
   // Either a full array or a single scalar value, stated explicitly in the type.
   enum Datum<'a> {
       Array(&'a dyn Array),
       Scalar(&'a dyn Scalar),
   }
   ```
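   
   As a rough sketch of how a kernel might consume such a `Datum`, assuming stand-in `Array` and `Scalar` traits over `i64` (these are illustrative only and are NOT the arrow-rs definitions; `Int64Scalar` and `add` are hypothetical names as well):
   
   ```rust
   // Stand-in traits for illustration only; NOT the arrow-rs definitions.
   trait Array { fn values(&self) -> &[i64]; }
   trait Scalar { fn value(&self) -> i64; }
   
   struct Int64Array(Vec<i64>);
   struct Int64Scalar(i64);
   
   impl Array for Int64Array { fn values(&self) -> &[i64] { &self.0 } }
   impl Scalar for Int64Scalar { fn value(&self) -> i64 { self.0 } }
   
   enum Datum<'a> {
       Array(&'a dyn Array),
       Scalar(&'a dyn Scalar),
   }
   
   // Hypothetical kernel: the caller states up front which inputs are scalars,
   // so the fast path is chosen by the variant rather than by the length.
   fn add(left: Datum<'_>, right: Datum<'_>) -> Result<Vec<i64>, String> {
       match (left, right) {
           // Array + array: lengths must match, with no single-row exception.
           (Datum::Array(l), Datum::Array(r)) => {
               let (l, r) = (l.values(), r.values());
               if l.len() != r.len() {
                   return Err(format!("length mismatch: {} vs {}", l.len(), r.len()));
               }
               Ok(l.iter().zip(r).map(|(a, b)| a + b).collect())
           }
           // Array + scalar (either order): explicit broadcast fast path.
           (Datum::Array(a), Datum::Scalar(s)) => {
               let s = s.value();
               Ok(a.values().iter().map(|v| v + s).collect())
           }
           (Datum::Scalar(s), Datum::Array(a)) => {
               let s = s.value();
               Ok(a.values().iter().map(|v| s + v).collect())
           }
           // Scalar + scalar: a single result value.
           (Datum::Scalar(l), Datum::Scalar(r)) => Ok(vec![l.value() + r.value()]),
       }
   }
   ```
   
   Here `add(Datum::Array(&a), Datum::Scalar(&s))` takes the broadcast path by construction, while passing a one-row array as `Datum::Array` against a three-row array is still a length mismatch.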
   
   
   

