Posted to issues@spark.apache.org by "David Fagnan (JIRA)" <ji...@apache.org> on 2016/02/04 01:22:39 UTC
[jira] [Created] (SPARK-13179) pyspark row name collision 'count'
David Fagnan created SPARK-13179:
------------------------------------
Summary: pyspark row name collision 'count'
Key: SPARK-13179
URL: https://issues.apache.org/jira/browse/SPARK-13179
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.6.0
Reporter: David Fagnan
The following example from the documentation results in a name collision:
{code:none}
>>> from pyspark.sql import Row
>>> df = sc.parallelize([Row(name='Alice', age=5, height=80), Row(name='Alice', age=10, height=140)]).toDF()
>>> alice_counts = df.groupby(df.name).count().collect()
>>> print(alice_counts[0])
Row(name=u'Alice', count=2)
>>> print(alice_counts[0].name)
Alice
{code}
That output is correct. However, because the aggregated column is named count, accessing it as an attribute collides with an existing method:
{code:none}
>>> print(alice_counts[0].count)
<built-in method count of Row object at 0x...>
{code}
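The collision occurs because Row inherits the count method from Python's tuple, and the inherited method takes precedence over the column of the same name during attribute lookup.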
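
The lookup order behind this can be reproduced without Spark. The following is a minimal sketch, not PySpark's actual implementation: FieldTuple is a hypothetical stand-in for a Row-like tuple subclass that resolves field names through __getattr__, which Python only calls after normal attribute lookup has failed. Name-based item access is included as one possible way around the collision; Row is believed to offer similar access (for example alice_counts[0]['count'] or asDict()), but that should be checked against the Spark version in use.

```python
class FieldTuple(tuple):
    """Hypothetical stand-in for a Row-like type: a tuple with named fields."""

    def __new__(cls, **kwargs):
        names = sorted(kwargs)  # fixed field order, chosen only for this sketch
        obj = super().__new__(cls, [kwargs[n] for n in names])
        obj._fields = names
        return obj

    def __getitem__(self, key):
        # Allow lookup by field name as well as by position.
        if isinstance(key, str):
            return super().__getitem__(self._fields.index(key))
        return super().__getitem__(key)

    def __getattr__(self, item):
        # Only reached when normal attribute lookup fails, so names that
        # tuple already defines (count, index) never get here.
        if item.startswith("_"):
            raise AttributeError(item)
        try:
            return self[item]
        except ValueError:
            raise AttributeError(item)


row = FieldTuple(name="Alice", count=2)
print(row.name)             # resolved through __getattr__
print(callable(row.count))  # tuple.count shadows the field
print(row["count"])         # name-based item access reaches the field
```

In this sketch, row.count is the bound tuple method rather than the stored value, which mirrors the behaviour reported above, while row["count"] returns the field.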