Posted to issues@spark.apache.org by "Anatoliy Plastinin (JIRA)" <ji...@apache.org> on 2016/01/10 23:06:39 UTC

[jira] [Created] (SPARK-12744) Inconsistent behavior parsing JSON with unix timestamp values

Anatoliy Plastinin created SPARK-12744:
------------------------------------------

             Summary: Inconsistent behavior parsing JSON with unix timestamp values
                 Key: SPARK-12744
                 URL: https://issues.apache.org/jira/browse/SPARK-12744
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.6.0
            Reporter: Anatoliy Plastinin
            Priority: Minor


Let's take the following JSON:

{code}
val rdd = sc.parallelize("""{"ts":1452386229}""" :: Nil)
{code}

Spark SQL casts an integer to a timestamp by treating the value as a number of seconds since the epoch (see https://issues.apache.org/jira/browse/SPARK-11724):

{code}
scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._

scala> sqlContext.read.json(rdd).select($"ts".cast(TimestampType)).show
+--------------------+
|                  ts|
+--------------------+
|2016-01-10 01:37:...|
+--------------------+
{code}
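
For reference, a quick check of the seconds interpretation (java.sql.Timestamp takes milliseconds since the epoch, so multiplying by 1000 gives the same instant):

{code}
// 1452386229 treated as seconds since the Unix epoch is 2016-01-10 UTC;
// java.sql.Timestamp expects milliseconds, hence the * 1000L.
new java.sql.Timestamp(1452386229L * 1000L)
{code}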

However, parsing the same JSON with an explicit schema gives a different result; the value appears to be interpreted as milliseconds rather than seconds:

{code}
scala> val schema = (new StructType).add("ts", TimestampType)
schema: org.apache.spark.sql.types.StructType = StructType(StructField(ts,TimestampType,true))

scala> sqlContext.read.schema(schema).json(rdd).show
+--------------------+
|                  ts|
+--------------------+
|1970-01-17 20:26:...|
+--------------------+
{code}
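
For comparison, a minimal workaround sketch (not part of the original report; it assumes the field can be declared as LongType, and "longSchema" is just an illustrative name): reading the value as a long and casting explicitly applies the seconds-based cast from SPARK-11724, matching the schemaless read above.

{code}
import org.apache.spark.sql.types._

// Sketch of a workaround: declare "ts" as LongType, then cast to TimestampType,
// which interprets the value as seconds (per SPARK-11724).
val longSchema = (new StructType).add("ts", LongType)

sqlContext.read.schema(longSchema).json(rdd)
  .select($"ts".cast(TimestampType))
  .show()
// Expected: 2016-01-10 ..., consistent with the first example.
{code}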



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
