Posted to issues@spark.apache.org by "Federico Ponzi (JIRA)" <ji...@apache.org> on 2016/06/23 12:30:16 UTC

[jira] [Created] (SPARK-16170) Throw error when row is not schema-compatible

Federico Ponzi created SPARK-16170:
--------------------------------------

             Summary: Throw error when row is not schema-compatible
                 Key: SPARK-16170
                 URL: https://issues.apache.org/jira/browse/SPARK-16170
             Project: Spark
          Issue Type: Bug
            Reporter: Federico Ponzi


We are using Spark to import some data from MySQL.
We just found that many of our imports are unusable because our import function was mistakenly declaring a LongType schema for a column that actually contains floats.
Consider this example:
{code}
from pyspark.sql import SQLContext
from pyspark.sql.types import StructType, StructField, LongType, StringType

sqlContext = SQLContext(sc)
sch = StructType([StructField("id", LongType(), True),
                  StructField("rol", StringType(), True)])
# The second record has a float "id", which does not fit the LongType column
i = ['{"id": 1, "rol": "str"}', '{"id": 2.4, "rol": "str"}']
rdd = sc.parallelize(i)
df = sqlContext.read.json(rdd, schema=sch)
print df.collect()
{code}
The output is:
{code}
[Row(id=1, rol=u'str'), Row(id=None, rol=None)]
{code}
Every column in the second row is null, not only {{id}} (the column with the mismatched datatype), and no error is raised.
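
Until the reader raises an error itself, a workaround sketch we are considering (note: I have not verified whether the reader's FAILFAST mode treats this particular type mismatch as a corrupt record, so the null-count check is the conservative option):
{code}
from pyspark.sql import SQLContext
from pyspark.sql.types import StructType, StructField, LongType, StringType

sqlContext = SQLContext(sc)
sch = StructType([StructField("id", LongType(), True),
                  StructField("rol", StringType(), True)])
i = ['{"id": 1, "rol": "str"}', '{"id": 2.4, "rol": "str"}']
rdd = sc.parallelize(i)

# Option 1: ask the JSON data source to fail instead of silently
# nulling records it cannot parse against the schema.
df = sqlContext.read.option("mode", "FAILFAST").json(rdd, schema=sch)

# Option 2: stay with the default PERMISSIVE mode, but detect the
# silently-nulled rows afterwards by checking a column that should
# never be null in the source data.
df = sqlContext.read.json(rdd, schema=sch)
bad = df.where(df["id"].isNull()).count()
if bad > 0:
    raise ValueError("%d rows did not match the schema" % bad)
{code}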




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
