Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2018/05/02 20:48:28 UTC

[GitHub] spark pull request #21196: [SPARK-24123][SQL] Fix precision issues in months...

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21196#discussion_r185634058
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
    @@ -888,14 +888,19 @@ object DateTimeUtils {
         val months1 = year1 * 12 + monthInYear1
         val months2 = year2 * 12 + monthInYear2
     
    +    val monthDiff = (months1 - months2).toDouble
    +
         if (dayInMonth1 == dayInMonth2 || ((daysToMonthEnd1 == 0) && (daysToMonthEnd2 == 0))) {
    -      return (months1 - months2).toDouble
    +      return monthDiff
         }
    -    // milliseconds is enough for 8 digits precision on the right side
    -    val timeInDay1 = millis1 - daysToMillis(date1, timeZone)
    -    val timeInDay2 = millis2 - daysToMillis(date2, timeZone)
    -    val timesBetween = (timeInDay1 - timeInDay2).toDouble / MILLIS_PER_DAY
    -    val diff = (months1 - months2).toDouble + (dayInMonth1 - dayInMonth2 + timesBetween) / 31.0
    +    // using milliseconds can cause precision loss with more than 8 digits
    +    // we follow Hive's implementation which uses seconds
    --- End diff --
    
    I checked Hive's implementation; it behaves as this comment describes.
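
For reference, the seconds-based calculation that the new code comment describes can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this PR or from Hive: the object `MonthsBetweenSketch`, its constants, and the use of `java.time.LocalDateTime` are hypothetical, and time zone handling and any rounding are left out. It keeps the two rules visible in the diff: return the whole-month difference when the days of month match or both dates are month ends, and otherwise treat every month as 31 days long.

```scala
import java.time.LocalDateTime

// Hypothetical sketch; names and types are not taken from Spark or Hive.
object MonthsBetweenSketch {
  val SecondsPerDay: Long = 24L * 60 * 60
  // Every month is treated as 31 days long, mirroring the `/ 31.0` in the diff.
  val SecondsPer31Days: Double = 31.0 * SecondsPerDay

  def monthsBetween(t1: LocalDateTime, t2: LocalDateTime): Double = {
    val months1 = t1.getYear * 12 + t1.getMonthValue
    val months2 = t2.getYear * 12 + t2.getMonthValue
    val monthDiff = (months1 - months2).toDouble

    val day1 = t1.getDayOfMonth
    val day2 = t2.getDayOfMonth
    val isMonthEnd1 = day1 == t1.toLocalDate.lengthOfMonth
    val isMonthEnd2 = day2 == t2.toLocalDate.lengthOfMonth

    if (day1 == day2 || (isMonthEnd1 && isMonthEnd2)) {
      // Same day of month, or both dates are month ends: whole months only.
      monthDiff
    } else {
      // Work in whole seconds (integer arithmetic) and divide only once.
      val secondsInDay1 = t1.toLocalTime.toSecondOfDay.toLong
      val secondsInDay2 = t2.toLocalTime.toSecondOfDay.toLong
      val secondsDiff = (day1 - day2) * SecondsPerDay + secondsInDay1 - secondsInDay2
      monthDiff + secondsDiff / SecondsPer31Days
    }
  }
}
```

With these assumptions, `monthsBetween(LocalDateTime.of(1997, 2, 28, 10, 30, 0), LocalDateTime.of(1996, 10, 30, 0, 0, 0))` evaluates to `4 - 135000 / 2678400`, which is about 3.94959677.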

