Posted to issues@spark.apache.org by "Sandeep Pal (JIRA)" <ji...@apache.org> on 2015/07/14 19:38:05 UTC

[jira] [Created] (SPARK-9040) StructField datatype Conversion Error

Sandeep Pal created SPARK-9040:
----------------------------------

             Summary: StructField datatype Conversion Error
                 Key: SPARK-9040
                 URL: https://issues.apache.org/jira/browse/SPARK-9040
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.3.0
         Environment: Cloudera 5.3 on CDH 6
            Reporter: Sandeep Pal


The following issue occurs if I specify the StructFields in a specific order in the StructType, as follows:
fields = [StructField("d", IntegerType(), True), StructField("b", IntegerType(), True), StructField("a", StringType(), True), StructField("c", IntegerType(), True)]

But the following field order works fine (the traceback below is from the first, failing order):
fields = [StructField("d", IntegerType(), True), StructField("b", IntegerType(), True), StructField("c", IntegerType(), True), StructField("a", StringType(), True)]
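For reference, a minimal sketch that appears to reproduce this (the data values are made up; only the column order matters). The rows are plain tuples in (d, b, c, a) order, so the schema that lists "a" third fails while the schema that lists "a" last succeeds:

from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

sc = SparkContext()
sqlContext = SQLContext(sc)

# Rows are plain tuples in (d, b, c, a) order; "a" is the only string column.
rows = sc.parallelize([(1, 2, 3, "x"), (4, 5, 6, "y")])

# Fails: the schema puts the string field "a" in position 3, but position 3
# of every tuple holds an int, so _verify_type raises the TypeError below.
bad_schema = StructType([StructField("d", IntegerType(), True),
                         StructField("b", IntegerType(), True),
                         StructField("a", StringType(), True),
                         StructField("c", IntegerType(), True)])

# Works: the field order matches the tuple order (d, b, c, a).
good_schema = StructType([StructField("d", IntegerType(), True),
                          StructField("b", IntegerType(), True),
                          StructField("c", IntegerType(), True),
                          StructField("a", StringType(), True)])

df = sqlContext.createDataFrame(rows, good_schema)
df.registerTempTable("simid_simple")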

<ipython-input-27-9d675dd6a2c9> in <module>()
     18 
     19 schema = StructType(fields)
---> 20 schemasimid_simple = sqlContext.createDataFrame(simid_simplereqfields, schema)
     21 schemasimid_simple.registerTempTable("simid_simple")

/usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/context.py in createDataFrame(self, data, schema, samplingRatio)
    302 
    303         for row in rows:
--> 304             _verify_type(row, schema)
    305 
    306         # convert python objects to sql data

/usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in _verify_type(obj, dataType)
    986                              "length of fields (%d)" % (len(obj), len(dataType.fields)))
    987         for v, f in zip(obj, dataType.fields):
--> 988             _verify_type(v, f.dataType)
    989 
    990 _cached_cls = weakref.WeakValueDictionary()

/usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in _verify_type(obj, dataType)
    970     if type(obj) not in _acceptable_types[_type]:
    971         raise TypeError("%s can not accept object in type %s"
--> 972                         % (dataType, type(obj)))
    973 
    974     if isinstance(dataType, ArrayType):

TypeError: StringType can not accept object in type <type 'int'>
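The traceback suggests the fields are matched by position rather than by name: _verify_type zips each row's values with dataType.fields in order, so the type declared at position 3 of the schema is checked against whatever sits at position 3 of the data. Under that assumption, a sketch of a workaround (using the variable names from this report; the original tuple order (d, b, c, a) is an assumption) is to reorder the data instead of the schema:

# Assumed: each record of simid_simplereqfields is a tuple in (d, b, c, a) order.
# Rearrange it to (d, b, a, c) so it lines up with the schema that failed above.
reordered = simid_simplereqfields.map(lambda r: (r[0], r[1], r[3], r[2]))

schema = StructType([StructField("d", IntegerType(), True),
                     StructField("b", IntegerType(), True),
                     StructField("a", StringType(), True),
                     StructField("c", IntegerType(), True)])

schemasimid_simple = sqlContext.createDataFrame(reordered, schema)
schemasimid_simple.registerTempTable("simid_simple")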

