Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/10/08 13:42:26 UTC

[jira] [Reopened] (SPARK-9040) StructField datatype Conversion Error

     [ https://issues.apache.org/jira/browse/SPARK-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen reopened SPARK-9040:
------------------------------

> StructField datatype Conversion Error
> -------------------------------------
>
>                 Key: SPARK-9040
>                 URL: https://issues.apache.org/jira/browse/SPARK-9040
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, Spark Core, SQL
>    Affects Versions: 1.3.0
>         Environment: Cloudera 5.3 on CDH 6
>            Reporter: Sandeep Pal
>
> The following issue occurs if I specify the StructFields in a specific order in the StructType, as follows:
> fields = [StructField("d", IntegerType(), True),
>           StructField("b", IntegerType(), True),
>           StructField("a", StringType(), True),
>           StructField("c", IntegerType(), True)]
> But the same code works fine with the fields in the following order:
> fields = [StructField("d", IntegerType(), True),
>           StructField("b", IntegerType(), True),
>           StructField("c", IntegerType(), True),
>           StructField("a", StringType(), True)]
> <ipython-input-27-9d675dd6a2c9> in <module>()
>      18 
>      19 schema = StructType(fields)
> ---> 20 schemasimid_simple = sqlContext.createDataFrame(simid_simplereqfields, schema)
>      21 schemasimid_simple.registerTempTable("simid_simple")
>
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/context.py in createDataFrame(self, data, schema, samplingRatio)
>     302 
>     303         for row in rows:
> --> 304             _verify_type(row, schema)
>     305 
>     306         # convert python objects to sql data
>
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in _verify_type(obj, dataType)
>     986                              "length of fields (%d)" % (len(obj), len(dataType.fields)))
>     987         for v, f in zip(obj, dataType.fields):
> --> 988             _verify_type(v, f.dataType)
>     989 
>     990 _cached_cls = weakref.WeakValueDictionary()
>
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in _verify_type(obj, dataType)
>     970     if type(obj) not in _acceptable_types[_type]:
>     971         raise TypeError("%s can not accept object in type %s"
> --> 972                         % (dataType, type(obj)))
>     973 
>     974     if isinstance(dataType, ArrayType):
>
> TypeError: StringType can not accept object in type <type 'int'>
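
For context, the error above can be reproduced with a small, self-contained sketch along the lines below. This is a hypothetical repro against the 1.3-era PySpark API; the sample rows and variable names are invented, since the reporter's simid_simplereqfields RDD is not shown. As the quoted _verify_type frame shows (for v, f in zip(obj, dataType.fields)), each row is checked against the schema purely by position, so the StructField order must match the order of the values inside each row tuple:

# Hypothetical repro sketch for SPARK-9040 (PySpark 1.3-era API).
# The sample rows are invented; the reporter's actual RDD is not shown.
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

sc = SparkContext("local", "spark-9040-repro")
sqlContext = SQLContext(sc)

# Each row is (d, b, c, a): three ints followed by one string.
rows = sc.parallelize([(4, 2, 3, "x"), (8, 5, 6, "y")])

# Field order matches the value order inside each tuple -> verification passes.
ok_fields = [StructField("d", IntegerType(), True),
             StructField("b", IntegerType(), True),
             StructField("c", IntegerType(), True),
             StructField("a", StringType(), True)]
ok_df = sqlContext.createDataFrame(rows, StructType(ok_fields))
ok_df.registerTempTable("ok_table")

# Moving the string field "a" ahead of "c" makes _verify_type compare the
# integer in the third position against StringType and raise:
#   TypeError: StringType can not accept object in type <type 'int'>
bad_fields = [StructField("d", IntegerType(), True),
              StructField("b", IntegerType(), True),
              StructField("a", StringType(), True),
              StructField("c", IntegerType(), True)]
sqlContext.createDataFrame(rows, StructType(bad_fields))  # raises TypeError

In other words, the schema is matched to each row by position, not by field name, which is why swapping "a" and "c" in the field list makes the same data pass or fail.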



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org