Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/10/08 13:42:26 UTC
[jira] [Reopened] (SPARK-9040) StructField datatype Conversion Error
[ https://issues.apache.org/jira/browse/SPARK-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen reopened SPARK-9040:
------------------------------
> StructField datatype Conversion Error
> -------------------------------------
>
> Key: SPARK-9040
> URL: https://issues.apache.org/jira/browse/SPARK-9040
> Project: Spark
> Issue Type: Bug
> Components: PySpark, Spark Core, SQL
> Affects Versions: 1.3.0
> Environment: Cloudera 5.3 on CDH 6
> Reporter: Sandeep Pal
>
> The following error occurs if I specify the StructFields in this particular order in the StructType:
> fields = [StructField("d", IntegerType(), True), StructField("b", IntegerType(), True), StructField("a", StringType(), True), StructField("c", IntegerType(), True)]
> But the following code works fine:
> fields = [StructField("d", IntegerType(), True), StructField("b", IntegerType(), True), StructField("c", IntegerType(), True), StructField("a", StringType(), True)]
> <ipython-input-27-9d675dd6a2c9> in <module>()
> 18
> 19 schema = StructType(fields)
> ---> 20 schemasimid_simple = sqlContext.createDataFrame(simid_simplereqfields, schema)
> 21 schemasimid_simple.registerTempTable("simid_simple")
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/context.py in createDataFrame(self, data, schema, samplingRatio)
> 302
> 303 for row in rows:
> --> 304 _verify_type(row, schema)
> 305
> 306 # convert python objects to sql data
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in _verify_type(obj, dataType)
> 986 "length of fields (%d)" % (len(obj), len(dataType.fields)))
> 987 for v, f in zip(obj, dataType.fields):
> --> 988 _verify_type(v, f.dataType)
> 989
> 990 _cached_cls = weakref.WeakValueDictionary()
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in _verify_type(obj, dataType)
> 970 if type(obj) not in _acceptable_types[_type]:
> 971 raise TypeError("%s can not accept object in type %s"
> --> 972 % (dataType, type(obj)))
> 973
> 974 if isinstance(dataType, ArrayType):
> TypeError: StringType can not accept object in type <type 'int'>
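A likely explanation, consistent with the traceback above, is that `createDataFrame` pairs each row's values with the schema's StructFields positionally, so the field order in the StructType must match the value order in the data. The sketch below mimics that zip-based check without requiring Spark; the helper name `check_row` and the `(name, type)` pairs are illustrative, not PySpark's actual API.

```python
def check_row(row, fields):
    """Positionally verify a row against a schema, the way PySpark's
    _verify_type zips row values with StructFields.

    fields: list of (name, expected_python_type) pairs, in schema order.
    """
    for value, (name, expected) in zip(row, fields):
        if not isinstance(value, expected):
            # Mirrors the "can not accept object in type" error in the traceback.
            raise TypeError("%s can not accept object in type %s"
                            % (name, type(value)))

# A row whose values are ordered d, b, c, a (e.g. 4, 2, 1, "x").
row = (4, 2, 1, "x")

# Schema in the matching order d, b, c, a: passes silently.
check_row(row, [("d", int), ("b", int), ("c", int), ("a", str)])

# Schema ordered d, b, a, c: the int value 1 lines up with field "a"
# (string expected), raising the same kind of TypeError as above.
try:
    check_row(row, [("d", int), ("b", int), ("a", str), ("c", int)])
except TypeError as e:
    print(e)
```

If the row order in the reporter's data was in fact d, b, c, a, this would make the behavior expected rather than a bug: the second fields list matches the data, the first does not.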
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org