Posted to issues@spark.apache.org by "Federico Ponzi (JIRA)" <ji...@apache.org> on 2016/06/23 12:30:16 UTC
[jira] [Created] (SPARK-16170) Throw error when row is not schema-compatible
Federico Ponzi created SPARK-16170:
--------------------------------------
Summary: Throw error when row is not schema-compatible
Key: SPARK-16170
URL: https://issues.apache.org/jira/browse/SPARK-16170
Project: Spark
Issue Type: Bug
Reporter: Federico Ponzi
We are using Spark to import some data from MySQL.
We just found that many of our imports are useless because our import function wrongly declared LongType for a column that actually contains floats.
Consider this example:
{code}
from pyspark.sql import SQLContext
from pyspark.sql.types import *

sqlContext = SQLContext(sc)  # sc is the SparkContext provided by the pyspark shell
sch = StructType([StructField("id", LongType(), True), StructField("rol", StringType(), True)])
i = ['{"id": 1, "rol": "str"}', '{"id": 2.4, "rol": "str"}']
rdd = sc.parallelize(i)
df = sqlContext.read.json(rdd, schema=sch)
print df.collect()
{code}
The output is:
{code}
[Row(id=1, rol=u'str'), Row(id=None, rol=None)]
{code}
Every column in the second row is null, not only {{id}} (the field with the wrong datatype), and no error is raised.
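For illustration, here is a minimal pure-Python sketch (not Spark's actual implementation) of the strict behavior this issue asks for: validate each JSON record against the declared schema and raise on a type mismatch instead of silently nulling out the whole row. The schema dict and {{parse_strict}} helper are hypothetical names invented for this sketch.

{code}
import json

# Hypothetical schema mirroring the StructType above: id -> long, rol -> string.
SCHEMA = {"id": int, "rol": str}

def parse_strict(line, schema=SCHEMA):
    """Parse one JSON line; raise TypeError if a field violates the schema."""
    record = json.loads(line)
    for field, expected in schema.items():
        value = record.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if value is not None and (not isinstance(value, expected)
                                  or isinstance(value, bool)):
            raise TypeError("field %r: expected %s, got %r"
                            % (field, expected.__name__, value))
    return record

rows = ['{"id": 1, "rol": "str"}', '{"id": 2.4, "rol": "str"}']
parse_strict(rows[0])   # ok
parse_strict(rows[1])   # raises TypeError for id = 2.4
{code}

As a workaround on the Spark side, the JSON data source accepts a parse mode option (e.g. {{.option("mode", "FAILFAST")}}), which, if I recall correctly, makes the reader throw on malformed records instead of producing null rows; the default is PERMISSIVE.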
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org