You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2024/03/10 09:58:08 UTC

[I] Make it easier to register configuration extension options [arrow-datafusion]

alamb opened a new issue, #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529

   Also, inspired by simplification you've done in `SessionContext`, maybe we should add companion object to `FunctionFactory` so it can register configuration as well. At the moment its a bit of work to register configuration:
   
   https://github.com/milenkovicm/torchfusion/blob/634c2a4e39b81e1969671db8faea50b96bc43c7f/src/lib.rs#L101
   
   ```rust
   let mut session_config = SessionConfig::new().with_information_schema(true);
   
   session_config
         .options_mut()
         .extensions
         // register torch factory configuration
         .insert(TorchConfig::default());
   ```
   
   it would be much better if config could be registered when `ctx.with_function_factory(...)` called ... but that could be a follow up PR
   
   _Originally posted by @milenkovicm in https://github.com/apache/arrow-datafusion/issues/9482#issuecomment-1986983413_
               


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "milenkovicm (via GitHub)" <gi...@apache.org>.
milenkovicm commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1987178757

   @alamb, having it as the part of `FunctionFactory` would be ideal, imho. Something like:
   
   ```rust
   #[async_trait]
   pub trait FunctionFactory: Sync + Send {
   
       type Config : Default + datafusion_common::config::ExtensionOptions + datafusion_common::config::ConfigExtension;
       
       async fn create(
           &self,
           state: &SessionConfig,
           statement: CreateFunction,
       ) -> Result<RegisterFunction>;
   }
   ```
   
   wdyt ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "milenkovicm (via GitHub)" <gi...@apache.org>.
milenkovicm commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1989331411

   Then let's do it like that and close this issue. Could you please push PR?
   
   Thanks a lot @alamb 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #9529: Make it easier to register configuration extension options
URL: https://github.com/apache/arrow-datafusion/issues/9529


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "milenkovicm (via GitHub)" <gi...@apache.org>.
milenkovicm commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1987334615

   `ConfigExtension` is not object safe, which makes my initial idea a bit harder to implement 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "milenkovicm (via GitHub)" <gi...@apache.org>.
milenkovicm commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1989279808

   That would definitely work better than any of previous changes I've done to function factory. 
   
   I'd propose to change nothing at the moment, leave interfaces as they are if this is not common use case. Would it make sense to add it from your perspective? Is configuration execution commonly used?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "milenkovicm (via GitHub)" <gi...@apache.org>.
milenkovicm commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-2016561946

   Hey @alamb , I thought you pushed 
   
   ```rust
   SessionConfig.with_extension_options(...)
   ```
   
   but I don't see it in the code, Am I looking at the wrong place? I can push change if you agree. 
   Thanks, regards 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1987168988

   @milenkovicm  Is there a reason you don't want to put the configuration directly on the `FunctionFactory` (e.g. on the `CustomFunctionFactory` in https://github.com/apache/arrow-datafusion/pull/9482)?
   
   If not, perhaps we could add an API like `SessionConfig.with_extension_options`:
   
   ```rust
   let mut session_config = SessionConfig::new()
     .with_information_schema(true)
     // register torch factory configuration
     .with_extension_options(TorchConfig::default())
   ```
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1989255781

   > Use case is not to register function factory it's more to configure it once it's registered, something like:
   
   Ah, that makes sense
   
   So in that case what do you think about this
   
   ```rust
   let mut session_config = SessionConfig::new()
     .with_information_schema(true)
     // register torch factory configuration
     .with_extension_options(TorchConfig::default())
   ```
   
   That just does the gymnastics like
   
   ```rust
   self.state.write().
          .session_config
         .options_mut()
         .extensions
         // register torch factory configuration
         .insert(TorchConfig::default());
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-2016778073

   > but I don't see it in the code, Am I looking at the wrong place? I can push change if you agree.
   
   
   I don't think I did (maybe I got confused above). It still seems like a good change to me to make it eaiser to configure SessionContext
   
   A PR would be most appreciated


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1989192977

   > @alamb, having it as the part of FunctionFactory would be ideal, imho. Something like:
   
   
   What about something like this (no changes to `FunctionFactory`, but literally put it on the struct directly)
   
   ```rust
   /// Configuration is stored specifically for this function factory
   #[derive(Debug, Default)]
   struct CustomFunctionFactory {
     config: CustomConfigObject
   }
   ```
   
   What I am missing is the usecase for dynamically passing the configuration. It would make sense if there was DDL like
   
   ```sql
   CREATE FUNCTION FACTORY ...
   ```
   
   Where you had to connect the configuration but didn't know it beforehand. But if you have to make a `CustomFunctionFactory` instance anyway, why not just give it the config directly 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "milenkovicm (via GitHub)" <gi...@apache.org>.
milenkovicm commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1989208556

   Use case is not to register function factory it's more to configure it once it's registered, something like:
   
   ```rust
   ctx.sql("SET torchfusion.device = cpu").await?.show().await?;
   ```
   
   It can be done now, but there are few steps which should be done. IMHO,it makes more sense to keep it like that than make `FunctionFactory` more complicated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "milenkovicm (via GitHub)" <gi...@apache.org>.
milenkovicm commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-2016785788

   Apologies for confusion @alamb. Will push PR later 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1989306658

   > Would it make sense to add it from your perspective? 
   
   I think so
   
   > Is configuration extension commonly used?
   
   I am not sure 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Make it easier to register configuration extension options [arrow-datafusion]

Posted by "milenkovicm (via GitHub)" <gi...@apache.org>.
milenkovicm commented on issue #9529:
URL: https://github.com/apache/arrow-datafusion/issues/9529#issuecomment-1988451635

   ```rust
   // does not work
   //
   // the trait `ConfigExtension` cannot be made into an object
   // `ConfigExtension` cannot be made into an object
   // for a trait to be "object safe" it needs to allow building a vtable to 
   // allow the call to be resolvable dynamically; for more information visit
   //
   fn configuration(&self) -> Option<&dyn ConfigExtension> {
       todo!()
   }
   ```
   ```rust
   //
   // does work but I don't like it 
   // 
   // user can  return whatever they want, registration process 
   // should try to register extension ... 
   // this would involve creating documentation to 
   // explain what we want, which makes bad API IMHO 
   fn configuration(&self) -> Option<dyn Any> {
       None
   }
   ```
   ```rust
   // we can let user do it on init
   // user should know what they are doing 
   //
   // i dont find this api particullary useful at this point
   fn init(&self, state: &mut SessionState) {
       state.options().extensions.insert(extension)
   }
   ```
   ```rust
   /// config extension to register with this function factory 
   fn config_extension<T:ConfigExtension>() -> Option<T> where Self: Sized {
       None
   }
   ```
   ```rust
   pub fn with_function_factory(
       mut self,
       function_factory: Arc<dyn FunctionFactory>,
   ) -> Self {
       // the size for values of type `dyn FunctionFactory` cannot be known at compilation time
       // the trait `Sized` is not implemented for `dyn FunctionFactory
       let c = <dyn FunctionFactory>::config_extension();
   
       self.function_factory = Some(function_factory);
       self
   }
   ```
   ```rust
   // it would force user to define `ConfigExtension`
   pub fn with_function_factory<T: FunctionFactory + 'static, C : ConfigExtension>(
       mut self,
       function_factory: Arc<T>,
   ) -> Self {
       let c : Option<C> = T::config_extension();
       self.function_factory = Some(function_factory);
       
       self
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org