You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/03 21:18:17 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue #251: Implement Postgres compatible `now()` function

alamb opened a new issue #251:
URL: https://github.com/apache/arrow-datafusion/issues/251


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   I want to write queries that have predicates on relative timeranges like "the last 1 hour" or "the last 2 days"
   
   **Describe the solution you'd like**
   I would like a new function `now()` that takes no arguments and returns a `DataType::Timestamp(TimeUnit::Nanoseconds, NULL)` of the current timestamp 
   
   It should follow the postgres semantics described in https://www.postgresql.org/docs/current/functions-datetime.html
   
   ```
   now ( ) → timestamp with time zone
   Current date and time (start of current transaction); see Section 9.9.4
   ```
   
   Bonus points for adding SQL standard functions https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT such as `CURRENT_TIME` and `CURRENT_TIMESTAMP`.
   
   **Describe alternatives you've considered**
   N/A
   
   **Additional context**
   I imagine this is largely an exercise in creating a new built in function `now()` and calling `chrono::Utc::now()` to get the current time:
   
   https://docs.rs/chrono/0.4.19/chrono/offset/struct.Utc.html#method.now
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] jorgecarleitao commented on issue #251: Implement Postgres compatible `now()` function

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on issue #251:
URL: https://github.com/apache/arrow-datafusion/issues/251#issuecomment-833766100


   My two cents:
   
   I do not think that this issue is a "good first issue". Atm, functions are:
   
   1. stateless
   2. do not know what is the expected resulting size of the array (e.g. creating a function that filters currently works)
   
   Solving this issue requires both:
   
   * `now` is stateful because its value is stored after its first evaluation, like @returnString says
   * `now` requires knowing the output size of the array, which must be passed to when the record batch is passing through the stream
   
   This requires a (imo large) change in our internal workings.
   
   Note that another class of functions that is unlocked by this work are functions that return non-deterministic numbers (e.g. random numbers).
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #251: Implement Postgres compatible `now()` function

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #251:
URL: https://github.com/apache/arrow-datafusion/issues/251#issuecomment-833759034


   Hi @msathis  ! Thanks for giving this one a try. 
   
   One thing you might try is  just commenting out the `arg_types.is_empty()` check. The comments don't sound right to me (as a function always returns 1 row....) so maybe that check is now outdated. It may be that if no `Array` batch is passed to the implementation it doesn't know how many rows to output 🤔  
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb closed issue #251: Implement Postgres compatible `now()` function

Posted by GitBox <gi...@apache.org>.
alamb closed issue #251:
URL: https://github.com/apache/arrow-datafusion/issues/251


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #251: Implement Postgres compatible `now()` function

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #251:
URL: https://github.com/apache/arrow-datafusion/issues/251#issuecomment-833784346


   @jorgecarleitao  -- all excellent points. Now that we have considered this more, I agree it is not a good first issue. Sorry for the confusion. Removing the label. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on issue #251: Implement Postgres compatible `now()` function

Posted by GitBox <gi...@apache.org>.
Dandandan commented on issue #251:
URL: https://github.com/apache/arrow-datafusion/issues/251#issuecomment-833793043


   Just some design input here, IMO it is best to not generate the timestamp at the moment of the first invocation when evaluating the query, but "just" always at the start of the query.
   
   An important optimization is to evaluate / replace the timestamp at planning time as a optimization rule, so it can be used to prune partitions and can be used for predicate pushdown.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] returnString commented on issue #251: Implement Postgres compatible `now()` function

Posted by GitBox <gi...@apache.org>.
returnString commented on issue #251:
URL: https://github.com/apache/arrow-datafusion/issues/251#issuecomment-831808148


   One interesting point - calling `now` multiple times in a single query should return the same timestamp (the txn start time, as noted in the Postgres docs) so I think we'll need some extra per-query state to store that, and then have this function grab the time from there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] msathis commented on issue #251: Implement Postgres compatible `now()` function

Posted by GitBox <gi...@apache.org>.
msathis commented on issue #251:
URL: https://github.com/apache/arrow-datafusion/issues/251#issuecomment-832601672


   @alamb I tried this today. We aren't allowing no arg functions right now as per https://github.com/apache/arrow-datafusion/blob/9e84f15191860befab07d4f257ccb0be75f1b206/datafusion/src/physical_plan/functions.rs#L298 . I might need more pointers to implement the no argument functions as my rust & datafusion knowledge is very limited. 😬 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org