Posted to issues@spark.apache.org by "Anatoliy Plastinin (JIRA)" <ji...@apache.org> on 2016/01/10 23:06:39 UTC
[jira] [Created] (SPARK-12744) Inconsistent behavior parsing JSON with unix timestamp values
Anatoliy Plastinin created SPARK-12744:
------------------------------------------
Summary: Inconsistent behavior parsing JSON with unix timestamp values
Key: SPARK-12744
URL: https://issues.apache.org/jira/browse/SPARK-12744
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.6.0
Reporter: Anatoliy Plastinin
Priority: Minor
Consider the following JSON:
{code}
val rdd = sc.parallelize("""{"ts":1452386229}""" :: Nil)
{code}
Spark SQL casts an int to a timestamp by treating the value as a number of seconds since the Unix epoch (see https://issues.apache.org/jira/browse/SPARK-11724):
{code}
scala> sqlContext.read.json(rdd).select($"ts".cast(TimestampType)).show
+--------------------+
| ts|
+--------------------+
|2016-01-10 01:37:...|
+--------------------+
{code}
However, parsing the same JSON with an explicit schema gives a different result:
{code}
scala> val schema = (new StructType).add("ts", TimestampType)
schema: org.apache.spark.sql.types.StructType = StructType(StructField(ts,TimestampType,true))
scala> sqlContext.read.schema(schema).json(rdd).show
+--------------------+
| ts|
+--------------------+
|1970-01-17 20:26:...|
+--------------------+
{code}
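The two outputs are consistent with the cast path reading the value as seconds since the epoch and the schema path reading it as milliseconds (this interpretation is inferred from the dates shown, not confirmed in Spark's source here). A small sketch using plain java.time, outside Spark, reproduces both dates:

```scala
import java.time.Instant

object EpochDemo extends App {
  val ts = 1452386229L

  // Cast path: value treated as seconds since the Unix epoch.
  val asSeconds = Instant.ofEpochSecond(ts)
  println(asSeconds) // 2016-01-10T00:37:09Z (matches the first output, modulo time zone)

  // Schema path: value apparently treated as milliseconds since the epoch.
  val asMillis = Instant.ofEpochMilli(ts)
  println(asMillis) // 1970-01-17T19:26:26.229Z (matches the second output, modulo time zone)
}
```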
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)