Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/05/14 01:15:58 UTC

[GitHub] [spark] bersprockets commented on pull request #36546: [SPARK-37544][SQL] Correct date arithmetic in sequences

bersprockets commented on PR #36546:
URL: https://github.com/apache/spark/pull/36546#issuecomment-1126602056

   This PR brings `Date` in line with `Timestamp` (that is, time-zone aware).
   
   But even `Timestamp` sequences have some anomalies. For example (output from Spark without this change, running in the America/Los_Angeles time zone):
   ```
   spark-sql> select element_at(sequence(timestamp'2021-01-01', timestamp'2021-01-01' + interval 82 hours * 97, interval 82 hours), 97) as a;
   2021-11-24 23:00:00
   Time taken: 0.076 seconds, Fetched 1 row(s)
   spark-sql> select timestamp'2021-01-01' + interval 82 hours * 96 as x;
   2021-11-25 00:00:00
   Time taken: 0.053 seconds, Fetched 1 row(s)
   spark-sql> 
   ```
   The 96th (origin 0) element of the sequence from the first query is 1 hour earlier than the result of the second query. One would expect them to match, since both are nominally `'2021-01-01' + interval 82 hours * 96`, but the DST "fall back" is handled differently around element 92 (origin 0) of the sequence.
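   For context, the "fall back" here is the 2021-11-07 DST transition in America/Los_Angeles, where the 1:00 AM hour occurs twice. A quick standard-library Python sketch (not Spark code, just illustrating the transition itself):
   ```python
   from datetime import datetime
   from zoneinfo import ZoneInfo  # Python 3.9+

   tz = ZoneInfo("America/Los_Angeles")

   # 1:00 AM on 2021-11-07 occurs twice; `fold` selects which occurrence.
   first = datetime(2021, 11, 7, 1, 0, fold=0, tzinfo=tz)   # PDT, UTC-7
   second = datetime(2021, 11, 7, 1, 0, fold=1, tzinfo=tz)  # PST, UTC-8

   print(first.utcoffset())   # UTC-7
   print(second.utcoffset())  # UTC-8
   ```
   Any code that steps through this repeated hour has to pick one of the two offsets, which is where stepwise sequence generation and one-shot interval addition can diverge.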
   
   `Date` sequences also have (and will continue to have, after this PR) the same anomaly:
   ```
   spark-sql> select date'2021-01-01' + interval 82 hours * 96 as x;
   2021-11-25 00:00:00
   Time taken: 4.146 seconds, Fetched 1 row(s)
   spark-sql> select element_at(sequence(date'2021-01-01', date'2022-01-05', interval 82 hours), 97) as a;
   2021-11-24
   Time taken: 0.125 seconds, Fetched 1 row(s)
   spark-sql>  
   ```
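   As a sanity check on the expected value (plain Python, illustrating only the naive arithmetic): 82 hours * 96 is exactly 7872 hours = 328 days, so a time-zone-unaware addition lands squarely on a date boundary, matching the `2021-11-25 00:00:00` result of the direct addition above:
   ```python
   from datetime import date, timedelta

   # 82 hours * 96 = 7872 hours, which is exactly 328 days
   assert 82 * 96 == 7872 and 7872 % 24 == 0

   # Naive (time-zone-unaware) addition lands on a date boundary
   print(date(2021, 1, 1) + timedelta(hours=82 * 96))  # 2021-11-25
   ```
   The sequence's `2021-11-24` result is consistent with the one-hour shift at the fall-back transition landing element 96 just before that boundary (compare the `2021-11-24 23:00:00` timestamp result earlier).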
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

