Posted to commits@spark.apache.org by sr...@apache.org on 2015/09/29 23:45:20 UTC

spark git commit: [SPARK-10782] [PYTHON] Update dropDuplicates documentation

Repository: spark
Updated Branches:
  refs/heads/master 7d399c9da -> c1ad373f2


[SPARK-10782] [PYTHON] Update dropDuplicates documentation

The documentation for dropDuplicates() and drop_duplicates() is one and the same. Resolved the error in the doctest example for drop_duplicates by taking the same approach used for groupby and groupBy: noting that drop_duplicates is an alias for dropDuplicates.
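The alias pattern the description refers to can be sketched in plain Python (a hypothetical class for illustration, not the actual PySpark source): the snake_case name is simply bound to the same function object at class-definition time, so both names share one implementation and one docstring.

```python
class DataFrame:
    def dropDuplicates(self, subset=None):
        """Return a new DataFrame with duplicate rows removed,
        optionally only considering certain columns.

        :func:`drop_duplicates` is an alias for :func:`dropDuplicates`.
        """
        # implementation elided; a real version would deduplicate rows here
        return self

    # The alias: just another class attribute bound to the same function,
    # so help(DataFrame.drop_duplicates) shows the same docstring.
    drop_duplicates = dropDuplicates
```

Because the two names point at the same function object, documentation tools render identical text for both, which is why the patch only needs to add the alias note once.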

Author: asokadiggs <as...@intel.com>

Closes #8930 from asokadiggs/jira-10782.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c1ad373f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c1ad373f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c1ad373f

Branch: refs/heads/master
Commit: c1ad373f26053e1906fce7681c03d130a642bf33
Parents: 7d399c9
Author: asokadiggs <as...@intel.com>
Authored: Tue Sep 29 17:45:18 2015 -0400
Committer: Sean Owen <so...@cloudera.com>
Committed: Tue Sep 29 17:45:18 2015 -0400

----------------------------------------------------------------------
 python/pyspark/sql/dataframe.py | 2 ++
 1 file changed, 2 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/c1ad373f/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index b09422a..033b319 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -931,6 +931,8 @@ class DataFrame(object):
         """Return a new :class:`DataFrame` with duplicate rows removed,
         optionally only considering certain columns.
 
+        :func:`drop_duplicates` is an alias for :func:`dropDuplicates`.
+
         >>> from pyspark.sql import Row
         >>> df = sc.parallelize([ \
             Row(name='Alice', age=5, height=80), \


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org