Posted to jira@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/09/07 20:34:00 UTC

[jira] [Resolved] (ARROW-9836) [Rust] [DataFusion] Improve API for usage of UDFs

     [ https://issues.apache.org/jira/browse/ARROW-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Grove resolved ARROW-9836.
-------------------------------
    Fix Version/s: 2.0.0
       Resolution: Fixed

Issue resolved by pull request 8032
[https://github.com/apache/arrow/pull/8032]

> [Rust] [DataFusion] Improve API for usage of UDFs
> -------------------------------------------------
>
>                 Key: ARROW-9836
>                 URL: https://issues.apache.org/jira/browse/ARROW-9836
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust, Rust - DataFusion
>            Reporter: Jorge
>            Assignee: Jorge
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.0
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> TL;DR; currently, users call UDFs through
>   
>  df.select(scalar_functions("sqrt", vec![col("a")], DataType::Float64))
>   
>  Proposal:
>   
>  let f = df.registry();
>  df.select(f.udf("sqrt", vec![col("a")])?)
>   
>  so that they do not have to remember the UDF's return type when using it.
>   
>  In the future, this API will also allow a UDF to be declared as part of planning, as in Spark, instead of having to be registered in the registry before use (we just need to check whether the UDF is already registered before doing so).
>  See the complete proposal here: [https://docs.google.com/document/d/1Kzz642ScizeKXmVE1bBlbLvR663BKQaGqVIyy9cAscY/edit?usp=sharing]
>  
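
The idea in the description above, that a registry can supply the return type so the caller does not have to, can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: `DataType`, `Expr`, `Registry`, `register` and the error type are simplified, hypothetical stand-ins, not DataFusion's actual types or signatures. Only the names `udf` and `col` are taken from the proposal.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; not DataFusion's real definitions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Float64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    ScalarFunction {
        name: String,
        args: Vec<Expr>,
        return_type: DataType,
    },
}

// Build a column reference, mirroring the `col("a")` call in the proposal.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

// Hypothetical registry that records each UDF's return type at registration.
#[derive(Default)]
struct Registry {
    return_types: HashMap<String, DataType>,
}

impl Registry {
    // Record the return type once, when the UDF is registered.
    fn register(&mut self, name: &str, return_type: DataType) {
        self.return_types.insert(name.to_string(), return_type);
    }

    // Build a call expression. The return type is looked up rather than
    // passed by the caller; an unregistered name yields an error.
    fn udf(&self, name: &str, args: Vec<Expr>) -> Result<Expr, String> {
        let return_type = self
            .return_types
            .get(name)
            .cloned()
            .ok_or_else(|| format!("UDF '{}' is not registered", name))?;
        Ok(Expr::ScalarFunction {
            name: name.to_string(),
            args,
            return_type,
        })
    }
}

fn main() -> Result<(), String> {
    let mut f = Registry::default();
    f.register("sqrt", DataType::Float64);

    // The caller names the function and its arguments only.
    let expr = f.udf("sqrt", vec![col("a")])?;
    println!("{:?}", expr);

    // Looking up an unknown UDF fails instead of guessing a type.
    assert!(f.udf("missing", vec![]).is_err());
    Ok(())
}
```

Returning a `Result` from `udf` matches the `?` in the proposed call: in this sketch, the lookup is the only step that can fail.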



--
This message was sent by Atlassian Jira
(v8.3.4#803005)