You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by holdenk <gi...@git.apache.org> on 2016/07/29 06:10:40 UTC

[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

GitHub user holdenk opened a pull request:

    https://github.com/apache/spark/pull/14398

    [SPARK-16774][SQL] Fix use of deprecated timestamp constructor & improve timezone handling

    ## What changes were proposed in this pull request?
    
    Removes the deprecated timestamp constructor and incidentally fixes the use which was using system timezone rather than the one specified when working near DST.
    
    This change also causes the roundtrip tests to fail since it now actually uses all the timezones near DST boundaries where it didn't before.
    
    Note: this is only a partial the solution, longer term we should follow up with https://issues.apache.org/jira/browse/SPARK-16788 to avoid this problem & simplify our timezone handling code.
    
    ## How was this patch tested?
    
    New tests for two timezones added so even if user timezone happens to coincided with one, the other tests should still fail. Important note: this (temporarily) disables the round trip tests until we can fix the issue more thoroughly.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/holdenk/spark SPARK-16774-fix-use-of-deprecated-timestamp-constructor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14398.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14398
    
----
commit 7c0fd8d4f50378b419bfd0059fe31f6e95205fb3
Author: Holden Karau <ho...@us.ibm.com>
Date:   2016-07-28T04:23:00Z

    Stop using deprecated Timestamp constructor

commit 634a05e3c7afed2987efdcb0680b2e3547ae10a2
Author: Holden Karau <ho...@us.ibm.com>
Date:   2016-07-29T00:25:45Z

    Wait the old timezone code was badnews bears - did stuff based implicitly on local system timezone - lets avoid that

commit 6233b4cbdc55930334634b9acadea627117e6e23
Author: Holden Karau <ho...@us.ibm.com>
Date:   2016-07-29T05:55:13Z

    Simplify & disable the roundtrip test - it wasn't really running anyways since it was always falling back to system timezone in the fall through case

commit e4508e32662f9eba4621c7ffb54ec7661dbc54fd
Author: Holden Karau <ho...@us.ibm.com>
Date:   2016-07-29T06:01:21Z

    not used anymore

commit 0385d8bd3f7ea85abb14191a4f3eb7eb1dbd4582
Author: Holden Karau <ho...@us.ibm.com>
Date:   2016-07-29T06:07:37Z

    Just remove the roundtrip test for now - can be added back from git history later

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72845540
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -24,6 +24,8 @@ import javax.xml.bind.DatatypeConverter
     
     import scala.annotation.tailrec
     
    +import sun.util.calendar.CalendarSystem
    --- End diff --
    
    Seems reasonable - will also allow us to hide the julian calendar usage for old dates.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63028/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #62997 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62997/consoleFull)** for PR 14398 at commit [`0385d8b`](https://github.com/apache/spark/commit/0385d8bd3f7ea85abb14191a4f3eb7eb1dbd4582).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63078/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #63030 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63030/consoleFull)** for PR 14398 at commit [`1109275`](https://github.com/apache/spark/commit/110927562ac57e8c0490c373569a59eedb09c593).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72789569
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +878,20 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val millis = millisOfDay % 1000
    +        // Choose calendar based on java.util.Date getCalendarSystem
    +        val calendar = if (year >= 1582) {
    +          CalendarSystem.getGregorianCalendar()
    --- End diff --
    
    I think we can safely assume the Gregorian calendar? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72789531
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -24,6 +24,8 @@ import javax.xml.bind.DatatypeConverter
     
     import scala.annotation.tailrec
     
    +import sun.util.calendar.CalendarSystem
    --- End diff --
    
    We should be using `java.util.Calendar` right -- I presume any `sun.` classes are not for use by applications.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #63028 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63028/consoleFull)** for PR 14398 at commit [`8670bb6`](https://github.com/apache/spark/commit/8670bb62a57cabe8a1a6bbab32330f3dc71f9977).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Also http://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time - I really think we should stop handling time as much as possibly and leave it to people who spend enough time thinking about it to get it mostly right. :p 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72877149
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +877,11 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val calendar = Calendar.getInstance(tz)
     
    -        // create a Timestamp to get the unix timestamp (in UTC)
    -        val timestamp = new Timestamp(year - 1900, month - 1, day, hh, mm, ss, nano)
    -        guess = (millisLocal - timestamp.getTime).toInt
    +        calendar.set(year, month, day, hh, mm, ss)
    --- End diff --
    
    huh one would think that the tests with the timestamps close to DST looking up offsets would also need this if that was the intended constructor but ok let me give it a shot.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #62997 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62997/consoleFull)** for PR 14398 at commit [`0385d8b`](https://github.com/apache/spark/commit/0385d8bd3f7ea85abb14191a4f3eb7eb1dbd4582).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63030/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14398


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by ckadner <gi...@git.apache.org>.
Github user ckadner commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72877468
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +877,11 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val calendar = Calendar.getInstance(tz)
     
    -        // create a Timestamp to get the unix timestamp (in UTC)
    -        val timestamp = new Timestamp(year - 1900, month - 1, day, hh, mm, ss, nano)
    -        guess = (millisLocal - timestamp.getTime).toInt
    +        calendar.set(year, month, day, hh, mm, ss)
    --- End diff --
    
    I take that back -- I had my code tangled up -- sorry!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Merged to master/2.0


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #63028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63028/consoleFull)** for PR 14398 at commit [`8670bb6`](https://github.com/apache/spark/commit/8670bb62a57cabe8a1a6bbab32330f3dc71f9977).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    cc @ckadner & @davies - still think the longterm solution is a larger rewrite of this into a thin wrapper around something like JSR-310, but in the meantime this should be some progress.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72877870
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +877,11 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val calendar = Calendar.getInstance(tz)
     
    -        // create a Timestamp to get the unix timestamp (in UTC)
    -        val timestamp = new Timestamp(year - 1900, month - 1, day, hh, mm, ss, nano)
    -        guess = (millisLocal - timestamp.getTime).toInt
    +        calendar.set(year, month, day, hh, mm, ss)
    --- End diff --
    
    no worries :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #63030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63030/consoleFull)** for PR 14398 at commit [`1109275`](https://github.com/apache/spark/commit/110927562ac57e8c0490c373569a59eedb09c593).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72878822
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +877,11 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val calendar = Calendar.getInstance(tz)
     
    -        // create a Timestamp to get the unix timestamp (in UTC)
    -        val timestamp = new Timestamp(year - 1900, month - 1, day, hh, mm, ss, nano)
    -        guess = (millisLocal - timestamp.getTime).toInt
    +        calendar.set(year, month, day, hh, mm, ss)
    --- End diff --
    
    ok so my (previous) comments don't make sense in response to the comment edit but I'll leave as is.
    
    As for the newly updated comment, the approach does avoid the deprecation warning while keeping the existing functionality - but:
    1) Its going to throw away any subsecond information in the timestamp by including it in the difference from millis local to the time in miliseconds
    2) It doesn't seem to resolve the timezone offset correctly (it seems to give back non-DST times in DST transitions)*
    
    That being said - it is a strict improvement over the previous code so we could go for this approach instead since it is only a stop-gap solution and its really about choosing which way we want this code to be more broken in.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated tim...

Posted by ckadner <gi...@git.apache.org>.
Github user ckadner commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72880137
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +877,12 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    -
    -        // create a Timestamp to get the unix timestamp (in UTC)
    -        val timestamp = new Timestamp(year - 1900, month - 1, day, hh, mm, ss, nano)
    -        guess = (millisLocal - timestamp.getTime).toInt
    +        val ms = millisOfDay % 1000
    +        val calendar = Calendar.getInstance(tz)
    +        calendar.set(year, month - 1, day, hh, mm, ss)
    +        calendar.set(Calendar.MILLISECOND, ms)
    +        val date = calendar.getTime()
    --- End diff --
    
    this seems unnecessary ... if the intend is to make sure `Calendar.computeTime()` will be invoked then that will happen when calling `calendar.getTimeInMillis()` on the next line as well


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72845861
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +878,20 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val millis = millisOfDay % 1000
    +        // Choose calendar based on java.util.Date getCalendarSystem
    +        val calendar = if (year >= 1582) {
    +          CalendarSystem.getGregorianCalendar()
    --- End diff --
    
    For this to happen millisLocal would have to be negative... which is at least well defined, but, would certainly need to be defined relative to January 1 1970 in the Gregorian calendar?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated tim...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72879633
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +877,11 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val calendar = Calendar.getInstance(tz)
     
    -        // create a Timestamp to get the unix timestamp (in UTC)
    -        val timestamp = new Timestamp(year - 1900, month - 1, day, hh, mm, ss, nano)
    -        guess = (millisLocal - timestamp.getTime).toInt
    +        calendar.set(year, month, day, hh, mm, ss)
    --- End diff --
    
    So DST is probably ok and my tests are just borked - so lets go with the new calendar approach and I'll fix it to preserve ms information.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by ckadner <gi...@git.apache.org>.
Github user ckadner commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72876907
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +877,11 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val calendar = Calendar.getInstance(tz)
     
    -        // create a Timestamp to get the unix timestamp (in UTC)
    -        val timestamp = new Timestamp(year - 1900, month - 1, day, hh, mm, ss, nano)
    -        guess = (millisLocal - timestamp.getTime).toInt
    +        calendar.set(year, month, day, hh, mm, ss)
    --- End diff --
    
    `       calendar.set(year, month - 1, day, hh - 1, mm, ss)`  makes the existing test cases work


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14398: [SPARK-16774][SQL] Fix use of deprecated timestam...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14398#discussion_r72844129
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -875,11 +878,20 @@ object DateTimeUtils {
             val hh = seconds / 3600
             val mm = seconds / 60 % 60
             val ss = seconds % 60
    -        val nano = millisOfDay % 1000 * 1000000
    +        val millis = millisOfDay % 1000
    +        // Choose calendar based on java.util.Date getCalendarSystem
    +        val calendar = if (year >= 1582) {
    +          CalendarSystem.getGregorianCalendar()
    --- End diff --
    
    I hope so? But if someone has weird historic data I'd rather give it a shot.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Yay - I still think we want to replace this entire file with a thin wrapper around specialized library at some point soon but incremental progress (avoid warning - don't depend on system timezone randomly near DST boundaries during timestamp conversion - which I think might be related to the issue Reynold added some logging for recently).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63050/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by ckadner <gi...@git.apache.org>.
Github user ckadner commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by ckadner <gi...@git.apache.org>.
Github user ckadner commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    @holdenk @davies
    I agree that date, time, and timezone handling is best left to the experts and leveraging [JSR-310](http://www.threeten.org/threetenbp/) ([SPARK-16788](https://issues.apache.org/jira/browse/SPARK-16788)) in the long run would probably solve most of the problems that have been painstakingly fixed or attempted to be fixed in `DateTimeUtils.scala`.
    
    The change proposed in this PR makes for cleaner code but it also regresses the fix for [SPARK-15613](https://issues.apache.org/jira/browse/SPARK-15613). With this fix there are 4 incorrect `day -> millis -> day` conversions since 1970/01/01 just for `Europe/Moscow`: 
    
    ```scala
      test("15613 Incorrect days to millis conversion Europe/Moscow") {
        DateTimeTestUtils.withDefaultTimeZone(TimeZone.getTimeZone("Europe/Moscow")) {
          val badDays = (0 to 20000).filterNot{ d =>
            d === millisToDays(daysToMillis(d))
          }
          val df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss z")
          badDays.foreach { d =>
            println(s"day: ${d}, date: ${df.format(new Date(daysToMillis(d)))}")
          }
          assert(badDays.isEmpty)
        }
      }
    ```
    ```console
    TestFailedException: Vector(4108, 4473, 4838, 5204) was not empty
    day: 4108, date: 1981-03-31 23:00:00 MSK
    day: 4473, date: 1982-03-31 23:00:00 MSK
    day: 4838, date: 1983-03-31 23:00:00 MSK
    day: 5204, date: 1984-03-31 23:00:00 MSK
    ```
    
    ...and it increases the number of incorrectly converted days from 52 to 4036 when testing all time zones and dates between 1900 and 2020 ([see test code snippet here](https://github.com/apache/spark/pull/13652#issuecomment-226877084)).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL] Fix use of deprecated timestamp const...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62997/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    The approach LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Just for good measure lets re-run the jenkins tests.
    Jenkins retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    Jenkins retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #63078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63078/consoleFull)** for PR 14398 at commit [`671f7be`](https://github.com/apache/spark/commit/671f7bef7b953e686365a0f8e33d847b5b0d753f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #63050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63050/consoleFull)** for PR 14398 at commit [`671f7be`](https://github.com/apache/spark/commit/671f7bef7b953e686365a0f8e33d847b5b0d753f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14398: [SPARK-16774][SQL][WIP] Fix use of deprecated timestamp ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14398
  
    **[Test build #63078 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63078/consoleFull)** for PR 14398 at commit [`671f7be`](https://github.com/apache/spark/commit/671f7bef7b953e686365a0f8e33d847b5b0d753f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org