Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2020/06/01 15:15:00 UTC

[jira] [Resolved] (SPARK-31885) Incorrect filtering of old millis timestamp in parquet

     [ https://issues.apache.org/jira/browse/SPARK-31885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-31885.
---------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 28693
[https://github.com/apache/spark/pull/28693]

> Incorrect filtering of old millis timestamp in parquet
> ------------------------------------------------------
>
>                 Key: SPARK-31885
>                 URL: https://issues.apache.org/jira/browse/SPARK-31885
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Maxim Gekk
>            Assignee: Apache Spark
>            Priority: Major
>             Fix For: 3.0.0
>
>
> {code:scala}
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 3.1.0-SNAPSHOT
>       /_/
> Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_242)
> Type in expressions to have them evaluated.
> Type :help for more information.
> scala> spark.conf.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MILLIS")
> scala> spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "CORRECTED")
> scala> Seq(java.sql.Timestamp.valueOf("1000-06-14 08:28:53.123")).toDF("ts").write.mode("overwrite").parquet("/Users/maximgekk/tmp/ts_millis_old_filter")
> scala> spark.read.parquet("/Users/maximgekk/tmp/ts_millis_old_filter").show(false)
> +-----------------------+
> |ts                     |
> +-----------------------+
> |1000-06-14 08:28:53.123|
> +-----------------------+
> scala> spark.read.parquet("/Users/maximgekk/tmp/ts_millis_old_filter").filter($"ts" === "1000-06-14 08:28:53.123")
> res6: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [ts: timestamp]
> scala> spark.read.parquet("/Users/maximgekk/tmp/ts_millis_old_filter").filter($"ts" === "1000-06-14 08:28:53.123").show(false)
> +---+
> |ts |
> +---+
> +---+
> {code}
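
The report does not state a root cause. One factor that may be worth checking for a timestamp this old is the calendar used to turn the literal into epoch milliseconds: legacy classes such as java.sql.Timestamp follow the hybrid Julian/Gregorian calendar, while java.time uses the proleptic Gregorian calendar. The sketch below is plain JVM code, not Spark's implementation; the object and method names are hypothetical, and both conversions are pinned to UTC so that time zone rules do not affect the result.

```scala
import java.time.{LocalDateTime, ZoneOffset}
import java.util.{Calendar, GregorianCalendar, TimeZone}

// Hypothetical helper, for illustration only.
object CalendarGap {
  private val Utc = TimeZone.getTimeZone("UTC")

  // Epoch millis of a wall-clock label read in the hybrid calendar
  // (Julian before the October 1582 cutover, Gregorian after it).
  def hybridMillis(ldt: LocalDateTime): Long = {
    val cal = new GregorianCalendar(Utc)
    cal.clear()
    cal.set(ldt.getYear, ldt.getMonthValue - 1, ldt.getDayOfMonth,
      ldt.getHour, ldt.getMinute, ldt.getSecond)
    cal.set(Calendar.MILLISECOND, ldt.getNano / 1000000)
    cal.getTimeInMillis
  }

  // Epoch millis of the same label read in the proleptic Gregorian calendar.
  def prolepticMillis(ldt: LocalDateTime): Long =
    ldt.toInstant(ZoneOffset.UTC).toEpochMilli

  // Difference between the two readings of the same label.
  def gapMillis(ldt: LocalDateTime): Long =
    hybridMillis(ldt) - prolepticMillis(ldt)

  def main(args: Array[String]): Unit = {
    val ts = LocalDateTime.parse("1000-06-14T08:28:53.123")
    val gap = gapMillis(ts)
    println(s"hybrid - proleptic = $gap ms (${gap / 86400000L} days)")
  }
}
```

For the timestamp in the report, the two readings differ by six days, while for dates after the 1582 cutover they agree. If a value were written under one calendar and the filter literal converted under the other, an equality comparison on the stored milliseconds could miss the row.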



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
