Posted to issues@spark.apache.org by "Xiang Gao (JIRA)" <ji...@apache.org> on 2016/08/22 13:21:20 UTC
[jira] [Created] (SPARK-17185) Unify naming of API for RDD and Dataset
Xiang Gao created SPARK-17185:
---------------------------------
Summary: Unify naming of API for RDD and Dataset
Key: SPARK-17185
URL: https://issues.apache.org/jira/browse/SPARK-17185
Project: Spark
Issue Type: Improvement
Components: Spark Core, SQL
Reporter: Xiang Gao
In RDD, groupByKey is used to generate key-list pairs and aggregateByKey is used to do aggregation.
In Dataset, aggregation is done by groupBy and groupByKey, and no API for generating key-list pairs is provided.
The same name "groupBy" is used for different things in the two APIs, which might be confusing. Besides, it would be more convenient if Dataset provided an API to generate key-list pairs.
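
To make the distinction between a "key-list pair" and an aggregation concrete, here is a minimal sketch in plain Python. It does not call Spark: group_by_key and aggregate_by_key are hypothetical local helpers that only model, on a single in-memory list, what the RDD methods groupByKey and aggregateByKey return. The real aggregateByKey also takes a second function for merging results across partitions, which this single-partition model leaves out.

```python
from collections import defaultdict

def group_by_key(pairs):
    """Model of RDD groupByKey: keep every value, collected under its key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

def aggregate_by_key(pairs, zero, seq_op):
    """Model of RDD aggregateByKey on one partition: fold the values per key."""
    result = {}
    for key, value in pairs:
        # Start from the zero value the first time a key is seen.
        result[key] = seq_op(result.get(key, zero), value)
    return result

pairs = [("a", 1), ("b", 2), ("a", 3)]

# Key-list pairs: each key maps to the list of all its values.
key_lists = group_by_key(pairs)

# Aggregation: each key maps to a single folded value (here, a sum).
key_sums = aggregate_by_key(pairs, 0, lambda acc, v: acc + v)

print(key_lists)
print(key_sums)
```

The first result keeps every value per key, while the second collapses them to one value per key. This is the difference the issue asks the two APIs to name consistently.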
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)