Posted to issues@spark.apache.org by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2020/03/30 22:21:00 UTC

[jira] [Updated] (SPARK-31297) Speed-up date-time rebasing

     [ https://issues.apache.org/jira/browse/SPARK-31297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-31297:
----------------------------------
        Parent: SPARK-30951
    Issue Type: Sub-task  (was: Improvement)

> Speed-up date-time rebasing
> ---------------------------
>
>                 Key: SPARK-31297
>                 URL: https://issues.apache.org/jira/browse/SPARK-31297
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Maxim Gekk
>            Priority: Major
>
> I believe date-time rebasing can be sped up by precomputing a map from micros to the diff between the original and the rebased micros, and then looking up the diff in that map via binary search.
> For example, the *America/Los_Angeles* time zone has fewer than 100 points at which the diff changes:
> {code:scala}
>   test("optimize rebasing") {
>     val zoneId = getZoneId("America/Los_Angeles")
>     val start = instantToMicros(LocalDateTime.of(1, 1, 1, 0, 0, 0).atZone(zoneId).toInstant)
>     val end = instantToMicros(LocalDateTime.of(2030, 1, 1, 0, 0, 0).atZone(zoneId).toInstant)
>     var micros = start
>     // Sentinel value, so the first iteration always counts as a change.
>     var diff = Long.MaxValue
>     var counter = 0
>     // Scan the range in one-hour steps and report every point where the diff changes.
>     while (micros < end) {
>       val rebased = rebaseGregorianToJulianMicros(micros)
>       val curDiff = rebased - micros
>       if (curDiff != diff) {
>         counter += 1
>         diff = curDiff
>         val ldt = microsToInstant(micros).atZone(zoneId).toLocalDateTime
>         println(s"local date-time = $ldt diff = ${diff / MICROS_PER_MINUTE} minutes")
>       }
>       micros += MICROS_PER_HOUR
>     }
>     println(s"counter = $counter")
>   }
> {code}
> {code:java}
> local date-time = 0001-01-01T00:00 diff = -2909 minutes
> local date-time = 0100-02-28T14:00 diff = -1469 minutes
> local date-time = 0200-02-28T14:00 diff = -29 minutes
> local date-time = 0300-02-28T14:00 diff = 1410 minutes
> local date-time = 0500-02-28T14:00 diff = 2850 minutes
> local date-time = 0600-02-28T14:00 diff = 4290 minutes
> local date-time = 0700-02-28T14:00 diff = 5730 minutes
> local date-time = 0900-02-28T14:00 diff = 7170 minutes
> local date-time = 1000-02-28T14:00 diff = 8610 minutes
> local date-time = 1100-02-28T14:00 diff = 10050 minutes
> local date-time = 1300-02-28T14:00 diff = 11490 minutes
> local date-time = 1400-02-28T14:00 diff = 12930 minutes
> local date-time = 1500-02-28T14:00 diff = 14370 minutes
> local date-time = 1582-10-14T14:00 diff = -29 minutes
> local date-time = 1899-12-31T16:52:58 diff = 0 minutes
> local date-time = 1917-12-27T11:52:58 diff = 60 minutes
> local date-time = 1917-12-27T12:52:58 diff = 0 minutes
> local date-time = 1918-09-15T12:52:58 diff = 60 minutes
> local date-time = 1918-09-15T13:52:58 diff = 0 minutes
> local date-time = 1919-06-30T16:52:58 diff = 31 minutes
> local date-time = 1919-06-30T17:52:58 diff = 0 minutes
> local date-time = 1919-08-15T12:52:58 diff = 60 minutes
> local date-time = 1919-08-15T13:52:58 diff = 0 minutes
> local date-time = 1921-08-31T10:52:58 diff = 60 minutes
> local date-time = 1921-08-31T11:52:58 diff = 0 minutes
> local date-time = 1921-09-30T11:52:58 diff = 60 minutes
> local date-time = 1921-09-30T12:52:58 diff = 0 minutes
> local date-time = 1922-09-30T12:52:58 diff = 60 minutes
> local date-time = 1922-09-30T13:52:58 diff = 0 minutes
> local date-time = 1981-09-30T12:52:58 diff = 60 minutes
> local date-time = 1981-09-30T13:52:58 diff = 0 minutes
> local date-time = 1982-09-30T12:52:58 diff = 60 minutes
> local date-time = 1982-09-30T13:52:58 diff = 0 minutes
> local date-time = 1983-09-30T12:52:58 diff = 60 minutes
> local date-time = 1983-09-30T13:52:58 diff = 0 minutes
> local date-time = 1984-09-29T15:52:58 diff = 60 minutes
> local date-time = 1984-09-29T16:52:58 diff = 0 minutes
> local date-time = 1985-09-28T15:52:58 diff = 60 minutes
> local date-time = 1985-09-28T16:52:58 diff = 0 minutes
> local date-time = 1986-09-27T15:52:58 diff = 60 minutes
> local date-time = 1986-09-27T16:52:58 diff = 0 minutes
> local date-time = 1987-09-26T15:52:58 diff = 60 minutes
> local date-time = 1987-09-26T16:52:58 diff = 0 minutes
> local date-time = 1988-09-24T15:52:58 diff = 60 minutes
> local date-time = 1988-09-24T16:52:58 diff = 0 minutes
> local date-time = 1989-09-23T15:52:58 diff = 60 minutes
> local date-time = 1989-09-23T16:52:58 diff = 0 minutes
> local date-time = 1990-09-29T15:52:58 diff = 60 minutes
> local date-time = 1990-09-29T16:52:58 diff = 0 minutes
> local date-time = 1991-09-28T16:52:58 diff = 60 minutes
> local date-time = 1991-09-28T17:52:58 diff = 0 minutes
> local date-time = 1992-09-26T15:52:58 diff = 60 minutes
> local date-time = 1992-09-26T16:52:58 diff = 0 minutes
> local date-time = 1993-09-25T15:52:58 diff = 60 minutes
> local date-time = 1993-09-25T16:52:58 diff = 0 minutes
> local date-time = 1994-09-24T15:52:58 diff = 60 minutes
> local date-time = 1994-09-24T16:52:58 diff = 0 minutes
> local date-time = 1995-09-23T15:52:58 diff = 60 minutes
> local date-time = 1995-09-23T16:52:58 diff = 0 minutes
> local date-time = 1996-10-26T15:52:58 diff = 60 minutes
> local date-time = 1996-10-26T16:52:58 diff = 0 minutes
> local date-time = 1997-10-25T15:52:58 diff = 60 minutes
> local date-time = 1997-10-25T16:52:58 diff = 0 minutes
> local date-time = 1998-10-24T15:52:58 diff = 60 minutes
> local date-time = 1998-10-24T16:52:58 diff = 0 minutes
> local date-time = 1999-10-30T15:52:58 diff = 60 minutes
> local date-time = 1999-10-30T16:52:58 diff = 0 minutes
> local date-time = 2000-10-28T15:52:58 diff = 60 minutes
> local date-time = 2000-10-28T16:52:58 diff = 0 minutes
> local date-time = 2001-10-27T15:52:58 diff = 60 minutes
> local date-time = 2001-10-27T16:52:58 diff = 0 minutes
> local date-time = 2002-10-26T15:52:58 diff = 60 minutes
> local date-time = 2002-10-26T16:52:58 diff = 0 minutes
> local date-time = 2003-10-25T15:52:58 diff = 60 minutes
> local date-time = 2003-10-25T16:52:58 diff = 0 minutes
> local date-time = 2004-10-30T15:52:58 diff = 60 minutes
> local date-time = 2004-10-30T16:52:58 diff = 0 minutes
> local date-time = 2005-10-29T15:52:58 diff = 60 minutes
> local date-time = 2005-10-29T16:52:58 diff = 0 minutes
> local date-time = 2006-10-28T15:52:58 diff = 60 minutes
> local date-time = 2006-10-28T16:52:58 diff = 0 minutes
> local date-time = 2007-10-27T15:52:58 diff = 60 minutes
> local date-time = 2007-10-27T16:52:58 diff = 0 minutes
> local date-time = 2008-10-25T15:52:58 diff = 60 minutes
> local date-time = 2008-10-25T16:52:58 diff = 0 minutes
> local date-time = 2009-10-24T15:52:58 diff = 60 minutes
> local date-time = 2009-10-24T16:52:58 diff = 0 minutes
> local date-time = 2010-10-30T15:52:58 diff = 60 minutes
> local date-time = 2010-10-30T16:52:58 diff = 0 minutes
> local date-time = 2014-10-25T14:52:58 diff = 60 minutes
> local date-time = 2014-10-25T15:52:58 diff = 0 minutes
> counter = 91
> {code}
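
The lookup proposed in the description could look roughly like the sketch below. It is only an illustration under stated assumptions, not the implementation that went into Spark: the object `RebaseLookup`, its `rebase` method, and the two parallel arrays are hypothetical names, and the table values in the usage lines are toy numbers rather than real switch points. The sketch assumes the switch points are stored in ascending order and that each diff applies from its switch point up to the next one.

```scala
import java.util.Arrays

// Hypothetical helper (not part of Spark): rebases a value by adding a
// precomputed diff instead of converting through calendar objects.
object RebaseLookup {
  /**
   * @param switches ascending micros at which the diff changes
   * @param diffs    diffs(i) applies to micros in [switches(i), switches(i + 1))
   * @param micros   the value to rebase
   */
  def rebase(switches: Array[Long], diffs: Array[Long], micros: Long): Long = {
    require(switches.nonEmpty && switches.length == diffs.length)
    val pos = Arrays.binarySearch(switches, micros)
    // For an absent key, binarySearch returns -(insertionPoint) - 1;
    // the interval containing micros starts at insertionPoint - 1.
    val idx = if (pos >= 0) pos else -pos - 2
    // Assumption: values before the first switch point use the first diff.
    micros + diffs(math.max(idx, 0))
  }
}

// Toy table with three switch points (illustrative values only).
val switches = Array(0L, 100L, 200L)
val diffs = Array(-10L, 0L, 60L)
println(RebaseLookup.rebase(switches, diffs, 150L))
```

With fewer than 100 switch points per time zone, each lookup needs at most about seven comparisons. Note that the test above scans in one-hour steps, so a real table would need the exact switch points rather than the hour-rounded ones it prints.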



--
This message was sent by Atlassian Jira
(v8.3.4#803005)