Posted to issues@spark.apache.org by "Felix Cheung (JIRA)" <ji...@apache.org> on 2015/08/29 03:00:48 UTC
[jira] [Created] (SPARK-10346) SparkR mutate and transform should
replace column with same name to match R data.frame behavior
Felix Cheung created SPARK-10346:
------------------------------------
Summary: SparkR mutate and transform should replace column with same name to match R data.frame behavior
Key: SPARK-10346
URL: https://issues.apache.org/jira/browse/SPARK-10346
Project: Spark
Issue Type: Bug
Components: R
Affects Versions: 1.5.0
Reporter: Felix Cheung
SparkR's mutate does not seem to replace an existing column when the new column has the same name. For example, mutate(df, age = df$age + 2) returns a DataFrame with two columns both named 'age'. Because of that, transform does not do the replacement for now either.
However, the R documentation for transform clearly states that a column with a matching name should be replaced:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/transform.html
"The tags are matched against names(_data), and for those that match, the value replace the corresponding variable in _data, and the others are appended to _data."
The resulting DataFrame, with two columns sharing one name, might also be hard to work with, for example when using select with column names.
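For reference, the base R behavior that the quoted documentation describes can be checked with a small local data.frame. This is only a sketch of the expected semantics in plain R, not SparkR code; the data and the names people, replaced, appended and age_next are made up for illustration.

```r
# Base R only (no Spark needed); data and names are hypothetical.
people <- data.frame(name = c("Alice", "Bob"), age = c(30, 25),
                     stringsAsFactors = FALSE)

# The tag 'age' matches an existing column, so transform() replaces it
# instead of adding a second 'age' column.
replaced <- transform(people, age = age + 2)
print(names(replaced))

# The tag 'age_next' matches no existing column, so it is appended.
appended <- transform(people, age_next = age + 1)
print(names(appended))
```

With matching semantics in SparkR, mutate(df, age = df$age + 2) would presumably return a DataFrame with a single 'age' column, as replaced does here.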