Posted to issues@spark.apache.org by "Szymon Matejczyk (JIRA)" <ji...@apache.org> on 2016/03/10 14:55:40 UTC

[jira] [Created] (SPARK-13802) Fields order in Row is not consistent with Schema.toInternal method

Szymon Matejczyk created SPARK-13802:
----------------------------------------

             Summary: Fields order in Row is not consistent with Schema.toInternal method
                 Key: SPARK-13802
                 URL: https://issues.apache.org/jira/browse/SPARK-13802
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.6.0
            Reporter: Szymon Matejczyk


When a Row is constructed from kwargs, the fields in the underlying tuple are sorted by name. When the schema reads the row (StructType.toInternal), it does not use that order, so values end up assigned to the wrong fields.

{code:python}
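from pyspark.sql import Row
from pyspark.sql.types import StructType, StructField, StringType

# Schema order: id, first_name
schema = StructType([
    StructField("id", StringType()),
    StructField("first_name", StringType())])
# Row built from kwargs stores its fields sorted by name: first_name, id
row = Row(id="39", first_name="Szymon")
schema.toInternal(row)
# Out[5]: ('Szymon', '39')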
{code}

{code:python}
df = sqlContext.createDataFrame([row], schema)
df.show(1)

+------+----------+
|    id|first_name|
+------+----------+
|Szymon|        39|
+------+----------+
{code}
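
The mismatch can be illustrated without Spark. The following is only a minimal pure-Python sketch of the behavior described above, not PySpark's actual implementation: kwargs_row, read_positionally and reorder_to_schema are hypothetical helper names, and positional reading by the schema is an assumption based on the output shown in this report. The last helper sketches one possible way to match values to schema fields by name instead of by position.

```python
# Pure-Python sketch; no Spark required. All helper names are hypothetical.

def kwargs_row(**kwargs):
    """Mimic the reported Row(**kwargs) behavior: values stored sorted by field name."""
    names = sorted(kwargs)
    return names, tuple(kwargs[n] for n in names)


def read_positionally(schema_names, values):
    """Pair schema fields with values by position (assumed to be what goes wrong)."""
    return dict(zip(schema_names, values))


def reorder_to_schema(schema_names, row_names, values):
    """Look each schema field up by name, so the row's own field order no longer matters."""
    by_name = dict(zip(row_names, values))
    return tuple(by_name[n] for n in schema_names)


schema_names = ["id", "first_name"]
row_names, values = kwargs_row(id="39", first_name="Szymon")

print(row_names)  # ['first_name', 'id']
print(values)     # ('Szymon', '39')
print(read_positionally(schema_names, values))             # {'id': 'Szymon', 'first_name': '39'}
print(reorder_to_schema(schema_names, row_names, values))  # ('39', 'Szymon')
```

Reading by position reproduces the swapped columns shown in the DataFrame output above, while the name-based lookup returns the values in schema order.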


