
Posted to issues@spark.apache.org by "David Fagnan (JIRA)" <ji...@apache.org> on 2016/02/04 01:22:39 UTC

[jira] [Created] (SPARK-13179) pyspark row name collision 'count'

David Fagnan created SPARK-13179:
------------------------------------

             Summary: pyspark row name collision 'count'
                 Key: SPARK-13179
                 URL: https://issues.apache.org/jira/browse/SPARK-13179
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.6.0
            Reporter: David Fagnan


The following example from the documentation results in a name collision:
{code:none}
>>> from pyspark.sql import Row
>>> df = sc.parallelize([Row(name='Alice', age=5, height=80),
...                      Row(name='Alice', age=10, height=140)]).toDF()
>>> alice_counts = df.groupby(df.name).count().collect()
>>> print(alice_counts[0])
Row(name=u'Alice', count=2)
>>> print(alice_counts[0].name)
Alice
{code}
This output is correct, but reading the column named count as an attribute returns a method instead of the value:
{code:none}
>>> print(alice_counts[0].count)
<built-in method count of Row object at 0x...>
{code}

The collision occurs because Row subclasses Python's tuple and inherits its count method, so attribute lookup finds tuple.count before the column named count.
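
The mechanism can be reproduced without Spark. The sketch below uses MiniRow, a hypothetical, simplified stand-in for pyspark.sql.Row (it is not PySpark's actual implementation), to show why attribute access returns the tuple method and how key-based access avoids the collision:

```python
class MiniRow(tuple):
    """Hypothetical stand-in for pyspark.sql.Row, for illustration only."""

    def __new__(cls, **fields):
        names = sorted(fields)
        row = tuple.__new__(cls, [fields[n] for n in names])
        row._names = names  # field names, in the same order as the values
        return row

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal lookup fails, so any
        # name that tuple already defines (count, index) never reaches here.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[self._names.index(name)]
        except ValueError:
            raise AttributeError(name)

    def __getitem__(self, key):
        # String keys look up a field by name; anything else is a tuple index.
        if isinstance(key, str):
            return tuple.__getitem__(self, self._names.index(key))
        return tuple.__getitem__(self, key)


row = MiniRow(name="Alice", count=2)

print(row.name)             # field value, resolved through __getattr__
print(callable(row.count))  # True: this is tuple.count, not the field
print(row["count"])         # key access reaches the field value
```

With a real Row, the same key-based access (alice_counts[0]['count']) or alice_counts[0].asDict()['count'] should return the column value; renaming the aggregated column with withColumnRenamed before collecting is another way to sidestep the collision.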


