Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/01 02:35:48 UTC

[GitHub] [arrow-datafusion] gaojun2048 opened a new pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

gaojun2048 opened a new pull request #2131:
URL: https://github.com/apache/arrow-datafusion/pull/2131


   A sub-PR of https://github.com/apache/arrow-datafusion/pull/1881
   https://github.com/apache/arrow-datafusion/pull/1881 includes the plugin, plugin loading, and serialization and deserialization. However, the community keeps changing the serialization and deserialization implementation for LogicalPlan and PhysicalPlan, so I am pushing this separate PR. It includes only the plugin and the plugin loader; it does not include serialization and deserialization for UDF/UDAF.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] gaojun2048 commented on pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

Posted by GitBox <gi...@apache.org>.
gaojun2048 commented on pull request #2131:
URL: https://github.com/apache/arrow-datafusion/pull/2131#issuecomment-1085497102


   > 
   
   Good idea, I will add this feature in the future.





[GitHub] [arrow-datafusion] jiangzhx commented on pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

Posted by GitBox <gi...@apache.org>.
jiangzhx commented on pull request #2131:
URL: https://github.com/apache/arrow-datafusion/pull/2131#issuecomment-1086533235


   > > Currently, I believe the "plugin_dir" is a local dir. I think it would be better to support distributed file systems (HDFS/object store) so that both the Executors and the Scheduler can load the plugin files from a single place.
   > 
   > Alternatively, users could package up dependencies in a Docker container and deploy that way. This could be more efficient in the case where multiple executors are running on the same node, since the image will be downloaded once and cached. It also provides better version control: all executors will be guaranteed to be running the same code (assuming a specific version of the image is deployed).
   > 
   > I would be interested to hear more about the use case of loading dependencies from object store though. What would be the motivation of this approach?
   
   Maybe in the future we can support creating custom UDFs and UDAFs the way Hive does.
   
   ```sql
   CREATE FUNCTION myfunc AS 'myclass' USING JAR 'hdfs:///path/to/jar';
   ```





[GitHub] [arrow-datafusion] gaojun2048 commented on pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

Posted by GitBox <gi...@apache.org>.
gaojun2048 commented on pull request #2131:
URL: https://github.com/apache/arrow-datafusion/pull/2131#issuecomment-1085902625


   @thinkharderdev 





[GitHub] [arrow-datafusion] andygrove commented on pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #2131:
URL: https://github.com/apache/arrow-datafusion/pull/2131#issuecomment-1086424955


   > Currently, I believe the "plugin_dir" is a local dir. I think it would be better to support distributed file systems (HDFS/object store) so that both the Executors and the Scheduler can load the plugin files from a single place.
   
   Alternatively, users could package up dependencies in a Docker container and deploy that way. This could be more efficient in the case where multiple executors are running on the same node, since the image will be downloaded once and cached. It also provides better version control: all executors will be guaranteed to be running the same code (assuming a specific version of the image is deployed).
   
   I would be interested to hear more about the use case of loading dependencies from object store though. What would be the motivation of this approach? 





[GitHub] [arrow-datafusion] mingmwang commented on pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

Posted by GitBox <gi...@apache.org>.
mingmwang commented on pull request #2131:
URL: https://github.com/apache/arrow-datafusion/pull/2131#issuecomment-1085493136


   Currently, I believe the "plugin_dir" is a local dir. I think it would be better to support distributed file systems (HDFS/object store) so that both the Executors and the Scheduler can load the plugin files from a single place.
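
For illustration only, and not code from this PR: a minimal sketch of what scanning a local plugin directory might look like. It assumes plugins are shipped as platform shared libraries, which the thread does not state. The helper name `list_plugin_files` is hypothetical, and the sketch only finds candidate files; it does not load them.

```rust
use std::ffi::OsStr;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of Ballista): returns the shared-library
/// files found directly inside `plugin_dir`, sorted by path.
/// Assumption: a plugin is a file with the platform's shared-library extension.
fn list_plugin_files(plugin_dir: &Path) -> io::Result<Vec<PathBuf>> {
    // DLL_EXTENSION is "so" on Linux, "dylib" on macOS, "dll" on Windows.
    let wanted = OsStr::new(std::env::consts::DLL_EXTENSION);
    let mut found = Vec::new();
    for entry in fs::read_dir(plugin_dir)? {
        let path = entry?.path();
        // Keep regular files with the expected extension; skip everything else.
        if path.is_file() && path.extension() == Some(wanted) {
            found.push(path);
        }
    }
    found.sort();
    Ok(found)
}
```

Because the scan relies on `std::fs::read_dir`, it only sees files on the local file system of the process that runs it. Under that assumption, every Scheduler and Executor host would need its own copy of the plugin files, which is the limitation the comment above points at.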

