Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/09/29 16:06:21 UTC

[GitHub] [arrow-datafusion] alamb commented on issue #7697: date_bin cannot use a string origin parameter when the expression has a time zone.

alamb commented on issue #7697:
URL: https://github.com/apache/arrow-datafusion/issues/7697#issuecomment-1741131533

   @mhilton and I just had a chat; here are some notes:
   
   The way the code currently works is that `date_bin` lists several signatures it can handle, such as
   
   https://github.com/apache/arrow-datafusion/blob/d19e9d684bbe1fd820674d48a96795bfbea9db7d/datafusion/expr/src/built_in_function.rs#L1041-L1046
   
   If the user passes arguments that don't match the required types (e.g. a String), the DataFusion coercion logic kicks in and adds casts to the specific arguments so that they conform to one of the available signatures.
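   
   To make that mechanism concrete, here is a minimal, self-contained sketch of signature-driven coercion. It is a toy model and not DataFusion's code: `Ty`, `Expr`, `can_coerce` and `coerce_args` are hypothetical names, and the single rule "strings coerce to timestamps" is an assumption chosen for illustration.

```rust
// Toy model of signature-driven coercion; none of these names are DataFusion's.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Utf8,
    Interval,
    Timestamp,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    // An argument with a known type (literal or column).
    Value(Ty),
    // A cast inserted by the coercion step.
    Cast(Box<Expr>, Ty),
}

// Hypothetical rule: a type coerces to itself, and strings coerce to timestamps.
fn can_coerce(from: &Ty, to: &Ty) -> bool {
    from == to || matches!((from, to), (Ty::Utf8, Ty::Timestamp))
}

// Pick the first signature every argument can reach, then wrap the
// arguments whose type differs from the signature in a cast.
fn coerce_args(args: &[Ty], signatures: &[Vec<Ty>]) -> Option<Vec<Expr>> {
    signatures
        .iter()
        .find(|sig| {
            sig.len() == args.len()
                && args.iter().zip(sig.iter()).all(|(a, t)| can_coerce(a, t))
        })
        .map(|sig| {
            args.iter()
                .zip(sig.iter())
                .map(|(a, t)| {
                    if a == t {
                        Expr::Value(a.clone())
                    } else {
                        Expr::Cast(Box::new(Expr::Value(a.clone())), t.clone())
                    }
                })
                .collect()
        })
}

fn main() {
    // One signature: (stride, source, origin).
    let signatures = vec![vec![Ty::Interval, Ty::Timestamp, Ty::Timestamp]];
    // The string origin does not match, so it gets wrapped in a cast.
    let coerced = coerce_args(&[Ty::Interval, Ty::Timestamp, Ty::Utf8], &signatures);
    println!("{:?}", coerced);
}
```

   In this sketch a call is rejected (`None`) when no listed signature is reachable for every argument.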
   
   The current convention is that `Timestamp(Second, Some("+TZ"))` effectively means "any `Timestamp` type with a timezone", and the implementation of the function (in this case `date_bin`) must then handle any possible timezone that comes in.
   
   The problem with the current convention is that the coercion rules for `Timestamp(Second, Some("+TZ"))`
   https://github.com/apache/arrow-datafusion/blob/d19e9d684bbe1fd820674d48a96795bfbea9db7d/datafusion/expr/src/type_coercion/functions.rs#L222-L226
   
   only support casting from another timestamp, not from Strings, the way the rules for `Timestamp(Second, None)` do:
   https://github.com/apache/arrow-datafusion/blob/d19e9d684bbe1fd820674d48a96795bfbea9db7d/datafusion/expr/src/type_coercion/functions.rs#L209-L216
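   
   The asymmetry can be reproduced with a small toy model. This is a simplification based only on the description above, not a copy of the linked code: `Ty`, `TZ_WILDCARD`, `can_coerce` and `date_bin_accepts` are hypothetical names, the time unit and the stride argument are omitted, and the exact set of source types each target accepts is an assumption.

```rust
// Toy model of the asymmetry described above; names are illustrative, not DataFusion's.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Utf8,
    // Timestamp with an optional time zone; the time unit is omitted for brevity.
    Timestamp(Option<String>),
}

// Placeholder meaning "any time zone" in a signature (the "+TZ" convention).
const TZ_WILDCARD: &str = "+TZ";

fn can_coerce(from: &Ty, to: &Ty) -> bool {
    match to {
        // Zone-less target: accepts zone-less timestamps and strings.
        Ty::Timestamp(None) => matches!(from, Ty::Timestamp(None) | Ty::Utf8),
        // Wildcard-zone target: accepts only timestamps that already carry a zone.
        Ty::Timestamp(Some(tz)) if tz.as_str() == TZ_WILDCARD => {
            matches!(from, Ty::Timestamp(Some(_)))
        }
        _ => from == to,
    }
}

// Models date_bin(stride, source, origin) with only `source` and `origin`.
fn date_bin_accepts(source: &Ty, origin: &Ty) -> bool {
    let signatures = [
        [Ty::Timestamp(None), Ty::Timestamp(None)],
        [
            Ty::Timestamp(Some(TZ_WILDCARD.to_string())),
            Ty::Timestamp(Some(TZ_WILDCARD.to_string())),
        ],
    ];
    signatures
        .iter()
        .any(|sig| can_coerce(source, &sig[0]) && can_coerce(origin, &sig[1]))
}

fn main() {
    let with_tz = Ty::Timestamp(Some("+01:00".to_string()));
    // A string origin works while the source has no time zone...
    println!("{}", date_bin_accepts(&Ty::Timestamp(None), &Ty::Utf8));
    // ...but no signature is reachable once the source carries one.
    println!("{}", date_bin_accepts(&with_tz, &Ty::Utf8));
}
```

   Under these assumptions the second call fails for the reason given above: the string origin cannot be coerced to the wildcard-zone type, and the zoned source cannot be coerced to the zone-less one.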
   
   
   Thus the solution @mhilton and I brainstormed is to change the signature to `Timestamp(Second, Some("+00:00"))` (i.e. only accept UTC timestamps) and teach the coercion logic to cast all arguments to that type. Then the implementation of `date_bin` only needs to handle one timezone (UTC).
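   
   A rough sketch of what that proposal could look like, again as a toy model and not an actual patch: `Ty`, `utc`, `coerce_to_utc` and `coerce_date_bin` are hypothetical names, and the choice to accept strings and zoned timestamps (leaving zone-less inputs to a separate signature) is an assumption.

```rust
// Toy model of the proposed rule; names are illustrative, not DataFusion's.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Utf8,
    Boolean,
    // Timestamp with an optional time zone; the time unit is omitted for brevity.
    Timestamp(Option<String>),
}

const UTC: &str = "+00:00";

// The single zoned type the signature would list under the proposal.
fn utc() -> Ty {
    Ty::Timestamp(Some(UTC.to_string()))
}

// Returns the cast target if `from` may be coerced to the UTC signature type.
// Assumption: strings and any zoned timestamp qualify; everything else does not.
fn coerce_to_utc(from: &Ty) -> Option<Ty> {
    match from {
        Ty::Utf8 | Ty::Timestamp(Some(_)) => Some(utc()),
        _ => None,
    }
}

// Coerce `source` and `origin`; both end up as UTC or the call is rejected.
fn coerce_date_bin(source: &Ty, origin: &Ty) -> Option<(Ty, Ty)> {
    Some((coerce_to_utc(source)?, coerce_to_utc(origin)?))
}

fn main() {
    let source = Ty::Timestamp(Some("+01:00".to_string()));
    // Zoned source with a string origin: both are cast to the UTC type.
    println!("{:?}", coerce_date_bin(&source, &Ty::Utf8));
    // A type with no route to a timestamp is still rejected.
    println!("{:?}", coerce_date_bin(&source, &Ty::Boolean));
}
```

   With a single concrete target type, the function body would only ever see UTC inputs, which is the simplification the proposal is after.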
   
   cc @wiedld 
   

