Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/25 07:03:49 UTC

[GitHub] [arrow-datafusion] gaojun2048 opened a new issue #1882: UDF/UDAF plugin

gaojun2048 opened a new issue #1882:
URL: https://github.com/apache/arrow-datafusion/issues/1882


   Currently, UDFs and UDAFs cannot be used in Ballista because Ballista does not know how to serialize and deserialize them.
   We are using Trino. Following Trino's approach, UDFs could be implemented as plugins packaged as Rust dynamic libraries. Ballista and DataFusion would then only need to know the UDF plugin interface, and could work without knowing any specific UDF implementation.
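
   A minimal sketch of the idea, not the interface from the linked PR: the host only sees a plugin trait and looks UDFs up by name, so a serialized plan would only need to carry the function name. The names `ScalarUdfPlugin`, `PluginRegistry`, and `AddOne` are hypothetical, the `f64` signature stands in for real Arrow arrays, and loading the implementation from a dynamic library is left out.

```rust
use std::collections::HashMap;

/// Hypothetical plugin interface: the only thing the host knows about a UDF.
pub trait ScalarUdfPlugin {
    /// Name used to refer to the UDF, e.g. inside a serialized plan.
    fn name(&self) -> &str;
    /// Evaluate the UDF; plain f64 values stand in for Arrow arrays here.
    fn invoke(&self, args: &[f64]) -> Result<f64, String>;
}

/// Host-side registry that resolves UDFs by name.
pub struct PluginRegistry {
    udfs: HashMap<String, Box<dyn ScalarUdfPlugin>>,
}

impl PluginRegistry {
    pub fn new() -> Self {
        PluginRegistry { udfs: HashMap::new() }
    }

    /// Register a plugin under the name it reports.
    pub fn register(&mut self, udf: Box<dyn ScalarUdfPlugin>) {
        self.udfs.insert(udf.name().to_string(), udf);
    }

    /// Look a UDF up by name and call it, without knowing its implementation.
    pub fn call(&self, name: &str, args: &[f64]) -> Result<f64, String> {
        match self.udfs.get(name) {
            Some(udf) => udf.invoke(args),
            None => Err(format!("unknown UDF: {}", name)),
        }
    }
}

/// Example plugin; in the proposal this would live in a separate library.
pub struct AddOne;

impl ScalarUdfPlugin for AddOne {
    fn name(&self) -> &str {
        "add_one"
    }

    fn invoke(&self, args: &[f64]) -> Result<f64, String> {
        match args {
            [x] => Ok(x + 1.0),
            _ => Err(format!("add_one expects 1 argument, got {}", args.len())),
        }
    }
}
```

   With this shape, a scheduler and an executor that both have the plugin registered could exchange just the string `"add_one"` and resolve it locally.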


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Igosuki commented on issue #1882: UDF/UDAF plugin

Posted by GitBox <gi...@apache.org>.
Igosuki commented on issue #1882:
URL: https://github.com/apache/arrow-datafusion/issues/1882#issuecomment-1053548340


   Hi, really cool stuff. I use dtolnay/inventory in my project, but it has a known issue: there is no guarantee that LLVM won't mangle symbols from statically compiled code. The issue got some attention and the core team reacted in https://github.com/rust-lang/rust/issues/47384 but it is not solved yet.
   Can we guarantee here that statically compiled plugins won't end up forgotten in binaries?
   
   Secondly, I think it would be cool to implement your interface in datafusion python so that people can use a Python function as a UDAF, as is done in PySpark: https://spark.apache.org/docs/2.4.0/sql-pyspark-pandas-with-arrow.html#pandas-udfs-aka-vectorized-udfs



[GitHub] [arrow-datafusion] gaojun2048 commented on issue #1882: UDF/UDAF plugin

Posted by GitBox <gi...@apache.org>.
gaojun2048 commented on issue #1882:
URL: https://github.com/apache/arrow-datafusion/issues/1882#issuecomment-1053574320


   In the PR https://github.com/apache/arrow-datafusion/pull/1881 I followed the articles https://adventures.michaelfbryan.com/posts/plugins-in-rust/ and https://michael-f-bryan.github.io/rust-ffi-guide/dynamic_loading.html to design the UDF plugin. This approach requires the plugin's crate-type to be cdylib. In preliminary testing, the problem of statically compiled plugins being forgotten in binaries did not occur.
   
   Sorry, I'm not very familiar with Python. If you like, you can help implement the datafusion python code.
   
   Thank you!
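
   To make the cdylib requirement concrete, here is a rough sketch of what a plugin crate might export. It assumes the crate's Cargo.toml sets `crate-type = ["cdylib"]`; the symbol names `plugin_abi_version` and `udf_add_one` are hypothetical and are not taken from the PR. The host would resolve these symbols by name at runtime, for instance through a dynamic-loading crate such as `libloading`; that step is not shown.

```rust
// Assumed Cargo.toml of the plugin crate (not shown in this thread):
//   [lib]
//   crate-type = ["cdylib"]

/// Hypothetical version symbol so the host can reject incompatible plugins.
/// `#[no_mangle]` keeps the symbol name stable so it can be looked up by name.
#[no_mangle]
pub extern "C" fn plugin_abi_version() -> u32 {
    1
}

/// Hypothetical scalar UDF exported with the C ABI.
/// A real plugin would exchange Arrow data rather than a single f64.
#[no_mangle]
pub extern "C" fn udf_add_one(x: f64) -> f64 {
    x + 1.0
}
```

   Because the plugin is its own dynamic library and is loaded explicitly by symbol name, it does not rely on the linker keeping otherwise unreferenced statics in a statically linked binary.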



[GitHub] [arrow-datafusion] gaojun2048 edited a comment on issue #1882: UDF/UDAF plugin

Posted by GitBox <gi...@apache.org>.
gaojun2048 edited a comment on issue #1882:
URL: https://github.com/apache/arrow-datafusion/issues/1882#issuecomment-1053574320


   In the PR https://github.com/apache/arrow-datafusion/pull/1881 I followed the articles https://adventures.michaelfbryan.com/posts/plugins-in-rust/ and https://michael-f-bryan.github.io/rust-ffi-guide/dynamic_loading.html to design the UDF plugin. This approach requires the plugin's crate-type to be cdylib. In preliminary testing, the problem of statically compiled plugins being forgotten in binaries did not occur.
   
   Sorry, I'm not very familiar with Python. If you like, you can help implement the datafusion python code.
   
   Thank you!

