You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/06/18 10:57:31 UTC

[GitHub] [spark] TJX2014 opened a new pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

TJX2014 opened a new pull request #28856:
URL: https://github.com/apache/spark/pull/28856


   ### What changes were proposed in this pull request?
   Add a unit test.
   Logical bug fix in `org.apache.spark.sql.catalyst.expressions.Sequence.TemporalSequenceImpl`
   
   ### Why are the changes needed?
   Spark sequence doesn't handle date increments that cross DST
   
   ### Does this PR introduce _any_ user-facing change?
   Before the PR, people will not get a correct result:
   `set spark.sql.session.timeZone` to `Asia/Shanghai, America/Chicago, GMT`, 
   Before execute 
   `sql("select sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 month)").show(false)`, People will get `[2011-03-01, 2011-04-01, 2011-05-01]`, **`[2011-03-01, 2011-03-28, 2011-04-28]`**, `[2011-03-01, 2011-04-01, 2011-05-01]`.
   
   After the PR, sequence date conversion is corrected:
   People will get `[2011-03-01, 2011-04-01, 2011-05-01]` from the former three conditions.
   
   ### How was this patch tested?
   Unit test.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646396920






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443560745



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))
+        }
+        else {
+          (daysToMicros(num.toInt(start), zoneId),
+            daysToMicros(num.toInt(stop), zoneId))

Review comment:
       Because when `scale` != 1,it is converted to day count,so we may need to use zone info to translate into microseconds to get a correct result,rather than just multiply `MICROS_PER_DAY` which ignore timezone.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444671528



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       Actually this function is from presto: https://prestodb.io/docs/current/functions/array.html
   
   Can you check the behavior of presto? It looks confusing to use time fields as the step for date start/stop.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646415868






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646563213






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646547844


   > **[Test build #124281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124281/testReport)** for PR 28856 at commit [`7196f6e`](https://github.com/apache/spark/commit/7196f6e79effe5782efd734e2bf8d1c28b075209).
   > 
   > * This patch **fails to build**.
   > * This patch merges cleanly.
   > * This patch adds no public classes.
   
   It seems strange, Could you please help me reset.`/home/runner/work/spark/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:2632: value epochDaysToMicros is not a member of object org.apache.spark.sql.catalyst.util.DateTimeUtils` @maropu 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646491461






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646548938


   The function `epochDaysToMicros ` was removed by https://github.com/apache/spark/pull/27617. Use `daysToMicros` instead of it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646492985






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444859851



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       @cloud-fan Done, It seems can be sequence `day`,`month`,'year' when start and end are date type.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646516124






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646545519


   > **[Test build #124281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124281/testReport)** for PR 28856 at commit [`7196f6e`](https://github.com/apache/spark/commit/7196f6e79effe5782efd734e2bf8d1c28b075209).
   > 
   > * This patch **fails to build**.
   > * This patch merges cleanly.
   > * This patch adds no public classes.
   
   reset this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646012999


   > People will get ...
   
   How do you get the result? `df.collect` or `df.show`?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442154371



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))
+        }
+        else {
+          (epochDaysToMicros(num.toInt(start), zoneId),

Review comment:
       Use timezone to transfer days to micros.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442306713



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
##########
@@ -1836,4 +1836,16 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayIntersect(empty, oneNull), Seq.empty)
     checkEvaluation(ArrayIntersect(oneNull, empty), Seq.empty)
   }
+
+  test("SPARK-31982: sequence doesn't handle date increments that cross DST") {
+    Array("America/Chicago", "GMT", "Asia/Shanghai").foreach(tz => {
+      checkEvaluation(Sequence(
+        Cast(Literal("2011-03-01"), DateType),
+        Cast(Literal("2011-04-01"), DateType),
+        Option(Literal(stringToInterval("interval 1 month"))),
+        Option(tz)),
+        Seq(
+          Date.valueOf("2011-03-01"), Date.valueOf("2011-04-01")))

Review comment:
       Yes, `America/Los_Angeles` can pass the test.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444858262



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' hour);
   Query 20200624_122744_00002_pehix failed: sequence step must be a day interval if start and end values are dates
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' day);
             _col0           
    [2011-03-01, 2011-03-02] 
   (1 row)
   Query 20200624_122757_00003_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' month);
       _col0     
    [2011-03-01] 
   (1 row)
   
   Query 20200624_122806_00004_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' year);
       _col0     
    [2011-03-01] 
   (1 row)
   Query 20200624_122810_00005_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442188525



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
##########
@@ -1836,4 +1836,16 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayIntersect(empty, oneNull), Seq.empty)
     checkEvaluation(ArrayIntersect(oneNull, empty), Seq.empty)
   }
+
+  test("SPARK-31982: sequence doesn't handle date increments that cross DST") {
+    Array("America/Chicago", "GMT", "Asia/Shanghai").foreach(tz => {
+      checkEvaluation(Sequence(
+        Cast(Literal("2011-03-01"), DateType),
+        Cast(Literal("2011-04-01"), DateType),
+        Option(Literal(stringToInterval("interval 1 month"))),
+        Option(tz)),
+        Seq(
+          Date.valueOf("2011-03-01"), Date.valueOf("2011-04-01")))

Review comment:
       The guys are still in America/Los_Angeles, right?

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2698,7 +2717,7 @@ object Sequence {
          |  int $i = 0;
          |
          |  while ($t < $exclusiveItem ^ $stepSign < 0) {
-         |    $arr[$i] = ($elemType) ($t / ${scale}L);
+         |    $arr[$i] = ($elemType) (Math.round($t / (float)${scale}L));

Review comment:
       Floating point ops looks dangerous here. Can you avoid them?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443402205



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))

Review comment:
       Don't think this is correct, see my comment https://github.com/apache/spark/pull/28856/files#r442366706 but it is at least backward compatible?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646543318






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646994470






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444859851



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       @cloud-fan Done, It seems can be sequence `day`,`month`,`year` when start and end are `DateType` in presto.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646565549


   **[Test build #124286 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124286/testReport)** for PR 28856 at commit [`347fa9d`](https://github.com/apache/spark/commit/347fa9d371df22ed286be4c23658d6fd592ac6fc).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646670867


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124285/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444858262



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       Base `presto-server-0.236`:
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' hour);
   Query 20200624_122744_00002_pehix failed: sequence step must be a day interval if start and end values are dates
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' day);
             _col0           
    [2011-03-01, 2011-03-02] 
   (1 row)
   Query 20200624_122757_00003_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' month);
       _col0     
    [2011-03-01] 
   (1 row)
   
   Query 20200624_122806_00004_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' year);
       _col0     
    [2011-03-01] 
   (1 row)
   Query 20200624_122810_00005_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-645944138


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646396920






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-647731446


   **[Test build #124366 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124366/testReport)** for PR 28856 at commit [`6a341bf`](https://github.com/apache/spark/commit/6a341bff3cbd6196cd1d742d9b66818185b68e4e).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646670200


   **[Test build #124285 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124285/testReport)** for PR 28856 at commit [`0c5d8fb`](https://github.com/apache/spark/commit/0c5d8fb63ff53957672eac0190857c0f200dd713).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646997713






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646540244


   **[Test build #124281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124281/testReport)** for PR 28856 at commit [`7196f6e`](https://github.com/apache/spark/commit/7196f6e79effe5782efd734e2bf8d1c28b075209).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443556895



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))

Review comment:
       Maybe we could separate this into different methods ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444045788



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       Yes, seems we can, the result as follows:
   `scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 hour))").count
   res19: Long = 1465
   
   scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 minute))").count
   res20: Long = 87841
   
   scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 second))").count
   res21: Long = 5270401
   scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 minute))").head(3)
   res25: Array[org.apache.spark.sql.Row] = Array([2011-03-01], [2011-03-01], [2011-03-01])
   
   scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 second))").head(3)
   
   res26: Array[org.apache.spark.sql.Row] = Array([2011-03-01], [2011-03-01], [2011-03-01])
   
   scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 minute))").head(3)
   res27: Array[org.apache.spark.sql.Row] = Array([2011-03-01], [2011-03-01], [2011-03-01])
   
   scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 hour))").head(3)
   res28: Array[org.apache.spark.sql.Row] = Array([2011-03-01], [2011-03-01], [2011-03-01])
   `




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646563213






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646516133


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124275/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646566075






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646415612


   **[Test build #124254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124254/testReport)** for PR 28856 at commit [`b33514f`](https://github.com/apache/spark/commit/b33514fcb3ff536887ca1b8824de7481e875911d).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646490865


   **[Test build #124273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124273/testReport)** for PR 28856 at commit [`082f9c0`](https://github.com/apache/spark/commit/082f9c029b6fe9a35ffe80b5cfd00195ddcd48a6).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646395857


   ok to test


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442613764



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2635,7 +2643,11 @@ object Sequence {
         var i = 0
 
         while (t < exclusiveItem ^ stepSign < 0) {
-          arr(i) = fromLong(t / scale)
+          arr(i) = if (scale == 1) {
+            fromLong(t / scale)
+          } else {
+            fromLong(Math.round(t / scale.toFloat))

Review comment:
       We could get date from `Math.round(t / scale.toFloat)`, if the target is timestamp,  the former ` fromLong(t / scale)` could be used.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646396650


   **[Test build #124246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124246/testReport)** for PR 28856 at commit [`9650122`](https://github.com/apache/spark/commit/9650122e9e58aa1a0c37210d2d3aaa3be842378c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443394825



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))
+        }
+        else {
+          (daysToMicros(num.toInt(start), zoneId),
+            daysToMicros(num.toInt(stop), zoneId))

Review comment:
       can you explain a bit more? It's hard to understand this change without any comment.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646966671






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444859851



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       @cloud-fan Done, It seems can be sequence `day`,`month`,`year` when start and end are `DateType`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646491461






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443401351



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))
+        }
+        else {
+          (daysToMicros(num.toInt(start), zoneId),
+            daysToMicros(num.toInt(stop), zoneId))

Review comment:
       Yeh, please, explain why if `scale` != 1, `start` and `stop` contain days.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-708791002


   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442557807



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
##########
@@ -1836,4 +1836,16 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayIntersect(empty, oneNull), Seq.empty)
     checkEvaluation(ArrayIntersect(oneNull, empty), Seq.empty)
   }
+
+  test("SPARK-31982: sequence doesn't handle date increments that cross DST") {
+    Array("America/Chicago", "GMT", "Asia/Shanghai").foreach(tz => {
+      checkEvaluation(Sequence(
+        Cast(Literal("2011-03-01"), DateType),
+        Cast(Literal("2011-04-01"), DateType),
+        Option(Literal(stringToInterval("interval 1 month"))),
+        Option(tz)),
+        Seq(
+          Date.valueOf("2011-03-01"), Date.valueOf("2011-04-01")))

Review comment:
       Sure, the result could be `tz` independently.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646543294


   **[Test build #124281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124281/testReport)** for PR 28856 at commit [`7196f6e`](https://github.com/apache/spark/commit/7196f6e79effe5782efd734e2bf8d1c28b075209).
    * This patch **fails to build**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442305304



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2698,7 +2717,7 @@ object Sequence {
          |  int $i = 0;
          |
          |  while ($t < $exclusiveItem ^ $stepSign < 0) {
-         |    $arr[$i] = ($elemType) ($t / ${scale}L);
+         |    $arr[$i] = ($elemType) (Math.round($t / (float)${scale}L));

Review comment:
       @MaxGekk Because we may need the `Math.round`, if not use this float ops,it seems hard to avoid the gap about one day between the output and expected.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646547844


   > **[Test build #124281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124281/testReport)** for PR 28856 at commit [`7196f6e`](https://github.com/apache/spark/commit/7196f6e79effe5782efd734e2bf8d1c28b075209).
   > 
   > * This patch **fails to build**.
   > * This patch merges cleanly.
   > * This patch adds no public classes.
   
   It seems strange, Could you please help me reset.`/home/runner/work/spark/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:2632: value epochDaysToMicros is not a member of object org.apache.spark.sql.catalyst.util.DateTimeUtils` @cloud-fan  @maropu 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443566298



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))
+        }
+        else {
+          (daysToMicros(num.toInt(start), zoneId),
+            daysToMicros(num.toInt(stop), zoneId))

Review comment:
       > Because when scale != 1,it is converted to day count
   
   How can we tell it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646965243






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442560213



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =

Review comment:
       `start is DateType, just add number of days. The same as for timestamps, time zone is not involved here.` seems strange in different `tz`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444963219



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       I think the presto behavior makes sense. Can we send a PR to follow it first? e.g. throw an exception if the step is time fields while start/end is date. This can also simplify the implementation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646540244


   **[Test build #124281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124281/testReport)** for PR 28856 at commit [`7196f6e`](https://github.com/apache/spark/commit/7196f6e79effe5782efd734e2bf8d1c28b075209).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443610816



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =

Review comment:
       hi @cloud-fan,as  @MaxGekk  explain here, I am not sure if this patch look ok,I am willing to `add more documents to TemporalSequenceImpl`but I am not sure if  we can follow this way or refactor a little.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646965243






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646543330


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124281/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443981984



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       if start/end is date, can the step by seconds/minutes/hours?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646699308


   **[Test build #124286 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124286/testReport)** for PR 28856 at commit [`347fa9d`](https://github.com/apache/spark/commit/347fa9d371df22ed286be4c23658d6fd592ac6fc).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-647574987


   **[Test build #124366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124366/testReport)** for PR 28856 at commit [`6a341bf`](https://github.com/apache/spark/commit/6a341bff3cbd6196cd1d742d9b66818185b68e4e).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442621499



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2698,7 +2721,11 @@ object Sequence {
          |  int $i = 0;
          |
          |  while ($t < $exclusiveItem ^ $stepSign < 0) {
-         |    $arr[$i] = ($elemType) ($t / ${scale}L);
+         |    if (${scale}L == 1L) {
+         |      $arr[$i] = ($elemType) ($t / ${scale}L);

Review comment:
       May it is better change `$arr[$i] = ($elemType) ($t / ${scale}L)` to `$arr[$i] = ($elemType) $t`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-645958671


   @cloud-fan @maropu 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646565549


   **[Test build #124286 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124286/testReport)** for PR 28856 at commit [`347fa9d`](https://github.com/apache/spark/commit/347fa9d371df22ed286be4c23658d6fd592ac6fc).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646966671






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 edited a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 edited a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646547844


   > **[Test build #124281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124281/testReport)** for PR 28856 at commit [`7196f6e`](https://github.com/apache/spark/commit/7196f6e79effe5782efd734e2bf8d1c28b075209).
   > 
   > * This patch **fails to build**.
   > * This patch merges cleanly.
   > * This patch adds no public classes.
   
   It seems strange, Could you please help me reset.`/home/runner/work/spark/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:2632: value epochDaysToMicros is not a member of object org.apache.spark.sql.catalyst.util.DateTimeUtils` @cloud-fan  @maropu 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646402311


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124246/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646670861






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444969655



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       Ok, I will do it tomorrow.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444297027



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       Seems pgsql can only support int as follows:
   postgres= create sequence seq_test;
   CREATE SEQUENCE
   postgres= select nextval('seq_test');
          1
   (1 行记录)
   postgres= select nextval('seq_test');
          2
   (1 行记录)
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646492969


   **[Test build #124273 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124273/testReport)** for PR 28856 at commit [`082f9c0`](https://github.com/apache/spark/commit/082f9c029b6fe9a35ffe80b5cfd00195ddcd48a6).
    * This patch **fails to build**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646966550


   **[Test build #124322 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124322/testReport)** for PR 28856 at commit [`c88efec`](https://github.com/apache/spark/commit/c88efecec0faf9d2edfecf7296333f792685a360).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646474825


   **[Test build #124254 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124254/testReport)** for PR 28856 at commit [`b33514f`](https://github.com/apache/spark/commit/b33514fcb3ff536887ca1b8824de7481e875911d).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444045788



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       Yes, we can, but the result seems a little strange:
   `scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 hour))").count
   res19: Long = 1465
   
   scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 minute))").count
   res20: Long = 87841
   
   scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-05-01' as date), interval 1 second))").count
   res21: Long = 5270401`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443566831



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))
+        }
+        else {
+          (daysToMicros(num.toInt(start), zoneId),
+            daysToMicros(num.toInt(stop), zoneId))

Review comment:
       maybe we should add more documents to `TemporalSequenceImpl` first, to understand what it is doing




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646415868






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646415612


   **[Test build #124254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124254/testReport)** for PR 28856 at commit [`b33514f`](https://github.com/apache/spark/commit/b33514fcb3ff536887ca1b8824de7481e875911d).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-645944138


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646965123


   **[Test build #124321 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124321/testReport)** for PR 28856 at commit [`19d8c48`](https://github.com/apache/spark/commit/19d8c482b9452593e7b4ac6ba71b2d6bc922dcd3).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646514135


   **[Test build #124275 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124275/testReport)** for PR 28856 at commit [`871867b`](https://github.com/apache/spark/commit/871867bcce93f7828f5a14b2b11bb1c71d4376a7).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646396650


   **[Test build #124246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124246/testReport)** for PR 28856 at commit [`9650122`](https://github.com/apache/spark/commit/9650122e9e58aa1a0c37210d2d3aaa3be842378c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646566075






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442366706



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =

Review comment:
       The current implementation is strange mix, it seems. There are following options:
   1. Step is an interval of (months, days, micros):
       1. If start point is TimestampType, we should convert it to local date-time in the session time zone, and add the interval by time components. The intermediate local timestamps should be converted back to micros using the session time zone but we should keep adding the interval to local timestamp "accumulator".
       2. The same for dates - convert `start` to a local date. Time zone shouldn't involved here.
   
   2. If the step is a duration in micros or days (this is not our case)
       1. start is TimestampType, we shouldn't convert it to local timestamp, and just add micros to instants. So, time zone will be not involved here.
       2. start is DateType, just add number of days. The same as for timestamps, time zone is not involved here.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443610816



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =

Review comment:
       hi @cloud-fan,as  @MaxGekk  explain here, I am not sure if this patch looks ok,I am willing to `add more documents to TemporalSequenceImpl`but I am not sure if  we can follow this way or refactor a little.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444858262



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       `base presto-server-0.236`
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' hour);
   Query 20200624_122744_00002_pehix failed: sequence step must be a day interval if start and end values are dates
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' day);
             _col0           
    [2011-03-01, 2011-03-02] 
   (1 row)
   Query 20200624_122757_00003_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' month);
       _col0     
    [2011-03-01] 
   (1 row)
   
   Query 20200624_122806_00004_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]
   presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' year);
       _col0     
    [2011-03-01] 
   (1 row)
   Query 20200624_122810_00005_pehix, FINISHED, 1 node
   Splits: 17 total, 17 done (100.00%)
   0:00 [0 rows, 0B] [0 rows/s, 0B/s]




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646540880






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] closed pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #28856:
URL: https://github.com/apache/spark/pull/28856


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646402306


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442175114



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =

Review comment:
       it looks like the step should not be physical seconds, but logical interval. cc @MaxGekk 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442148485



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2635,7 +2643,7 @@ object Sequence {
         var i = 0
 
         while (t < exclusiveItem ^ stepSign < 0) {
-          arr(i) = fromLong(t / scale)

Review comment:
       Resolve Asia timezone not correct.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-647732281






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442357548



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
##########
@@ -1836,4 +1836,16 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayIntersect(empty, oneNull), Seq.empty)
     checkEvaluation(ArrayIntersect(oneNull, empty), Seq.empty)
   }
+
+  test("SPARK-31982: sequence doesn't handle date increments that cross DST") {
+    Array("America/Chicago", "GMT", "Asia/Shanghai").foreach(tz => {
+      checkEvaluation(Sequence(
+        Cast(Literal("2011-03-01"), DateType),
+        Cast(Literal("2011-04-01"), DateType),
+        Option(Literal(stringToInterval("interval 1 month"))),
+        Option(tz)),
+        Seq(
+          Date.valueOf("2011-03-01"), Date.valueOf("2011-04-01")))

Review comment:
       I mean `Date.valueOf` always uses `America/Los_Angeles` independently what you test.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646475208


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442621499



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2698,7 +2721,11 @@ object Sequence {
          |  int $i = 0;
          |
          |  while ($t < $exclusiveItem ^ $stepSign < 0) {
-         |    $arr[$i] = ($elemType) ($t / ${scale}L);
+         |    if (${scale}L == 1L) {
+         |      $arr[$i] = ($elemType) ($t / ${scale}L);

Review comment:
       `$arr[$i] = ($elemType) ($t / ${scale}L)` -> `$arr[$i] = ($elemType) $t`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646516105


   **[Test build #124275 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124275/testReport)** for PR 28856 at commit [`871867b`](https://github.com/apache/spark/commit/871867bcce93f7828f5a14b2b11bb1c71d4376a7).
    * This patch **fails to build**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646997713






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646994234


   **[Test build #124321 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124321/testReport)** for PR 28856 at commit [`19d8c48`](https://github.com/apache/spark/commit/19d8c482b9452593e7b4ac6ba71b2d6bc922dcd3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646514135


   **[Test build #124275 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124275/testReport)** for PR 28856 at commit [`871867b`](https://github.com/apache/spark/commit/871867bcce93f7828f5a14b2b11bb1c71d4376a7).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-647575577






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646492985


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646540880






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646545519


   > **[Test build #124281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124281/testReport)** for PR 28856 at commit [`7196f6e`](https://github.com/apache/spark/commit/7196f6e79effe5782efd734e2bf8d1c28b075209).
   > 
   > * This patch **fails to build**.
   > * This patch merges cleanly.
   > * This patch adds no public classes.
   
   reset this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646475208






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-645943652


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443643942



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones
+        val (startMicros, stopMicros) = if (scale == 1) {
+          (num.toLong(start), num.toLong(stop))
+        }
+        else {
+          (daysToMicros(num.toInt(start), zoneId),
+            daysToMicros(num.toInt(stop), zoneId))

Review comment:
       Done, maybe we can pass the scale through constructor:
   private class TemporalSequenceImpl[T: ClassTag]
         (dt: IntegralType, **scale: Long**, fromLong: Long => T, zoneId: ZoneId)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r444079131



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2589,6 +2589,8 @@ object Sequence {
     }
   }
 
+  // To generate time sequences, we use scale 1 in TemporalSequenceImpl
+  // for `TimestampType`, while MICROS_PER_DAY for `DateType`

Review comment:
       does pgsql support it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442305304



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2698,7 +2717,7 @@ object Sequence {
          |  int $i = 0;
          |
          |  while ($t < $exclusiveItem ^ $stepSign < 0) {
-         |    $arr[$i] = ($elemType) ($t / ${scale}L);
+         |    $arr[$i] = ($elemType) (Math.round($t / (float)${scale}L));

Review comment:
       @MaxGekk Because we may need this `Math.round`, if not use this flout ops,it seems hard to avoid the gap about one day between the output and expected.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442687850



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
##########
@@ -1836,4 +1836,16 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(ArrayIntersect(empty, oneNull), Seq.empty)
     checkEvaluation(ArrayIntersect(oneNull, empty), Seq.empty)
   }
+
+  test("SPARK-31982: sequence doesn't handle date increments that cross DST") {
+    Array("America/Chicago", "GMT", "Asia/Shanghai").foreach(tz => {
+      checkEvaluation(Sequence(
+        Cast(Literal("2011-03-01"), DateType),
+        Cast(Literal("2011-04-01"), DateType),
+        Option(Literal(stringToInterval("interval 1 month"))),
+        Option(tz)),
+        Seq(
+          Date.valueOf("2011-03-01"), Date.valueOf("2011-04-01")))

Review comment:
       Please, wrap the code by `withDefaultTimeZone` otherwise your expected dates are wrong.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646514646






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646562790


   **[Test build #124285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124285/testReport)** for PR 28856 at commit [`0c5d8fb`](https://github.com/apache/spark/commit/0c5d8fb63ff53957672eac0190857c0f200dd713).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646966550


   **[Test build #124322 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124322/testReport)** for PR 28856 at commit [`c88efec`](https://github.com/apache/spark/commit/c88efecec0faf9d2edfecf7296333f792685a360).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646516124


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646562790


   **[Test build #124285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124285/testReport)** for PR 28856 at commit [`0c5d8fb`](https://github.com/apache/spark/commit/0c5d8fb63ff53957672eac0190857c0f200dd713).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-647575577






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646700431






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646492996


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124273/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-645943652


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646514646






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r449991321



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2625,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones

Review comment:
       Why do these codes depend on few specific timezones? Also, other comments look valid here. We should take the timezone into account for timestamps too




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646475215


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124254/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646965123


   **[Test build #124321 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124321/testReport)** for PR 28856 at commit [`19d8c48`](https://github.com/apache/spark/commit/19d8c482b9452593e7b4ac6ba71b2d6bc922dcd3).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-647732281






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646402306






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r443610816



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2623,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =

Review comment:
       hi @cloud-fan,as  @MaxGekk  explain here, I am not sure if this patch look ok,I am willing to `add more documents to TemporalSequenceImpl`if we can follow this way rather than refactor a little.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646543318


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-647574987


   **[Test build #124366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124366/testReport)** for PR 28856 at commit [`6a341bf`](https://github.com/apache/spark/commit/6a341bff3cbd6196cd1d742d9b66818185b68e4e).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r442621330



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2635,7 +2643,11 @@ object Sequence {
         var i = 0
 
         while (t < exclusiveItem ^ stepSign < 0) {
-          arr(i) = fromLong(t / scale)
+          arr(i) = if (scale == 1) {
+            fromLong(t / scale)

Review comment:
        I will change to `fromLong(t)` because it is same as ` fromLong(t / scale)` in `if` condition




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646997334


   **[Test build #124322 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124322/testReport)** for PR 28856 at commit [`c88efec`](https://github.com/apache/spark/commit/c88efecec0faf9d2edfecf7296333f792685a360).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646700431






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646402278


   **[Test build #124246 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124246/testReport)** for PR 28856 at commit [`9650122`](https://github.com/apache/spark/commit/9650122e9e58aa1a0c37210d2d3aaa3be842378c).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on a change in pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on a change in pull request #28856:
URL: https://github.com/apache/spark/pull/28856#discussion_r450020829



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -2623,8 +2625,16 @@ object Sequence {
         // about a month length in days and a day length in microseconds
         val intervalStepInMicros =
           stepMicros + stepMonths * microsPerMonth + stepDays * microsPerDay
-        val startMicros: Long = num.toLong(start) * scale
-        val stopMicros: Long = num.toLong(stop) * scale
+
+        // Date to timestamp is not equal from GMT and Chicago timezones

Review comment:
       Seems the date if different from west to east, when it is date, we might need to consider to zone info to convert to time stamp, if it is already a time stamp, not a date here, we may ignore the zone because the time stamp is already consider it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] TJX2014 commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
TJX2014 commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646084377


   > > People will get ...
   > 
   > How do you get the result? `df.collect` or `df.show`?
   
   It seems to me that both `df.collect` and `df.show` can show this.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646670861


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646994470






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #28856: [SPARK-31982][SQL]Function sequence doesn't handle date increments that cross DST

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #28856:
URL: https://github.com/apache/spark/pull/28856#issuecomment-646490865


   **[Test build #124273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124273/testReport)** for PR 28856 at commit [`082f9c0`](https://github.com/apache/spark/commit/082f9c029b6fe9a35ffe80b5cfd00195ddcd48a6).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org