Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/29 08:01:27 UTC
[GitHub] [spark] dcoliversun commented on a diff in pull request #37702: [SPARK-40012][PYTHON][DOCS] Make pyspark.sql.dataframe examples self-contained
dcoliversun commented on code in PR #37702:
URL: https://github.com/apache/spark/pull/37702#discussion_r956980017
##########
python/pyspark/sql/dataframe.py:
##########
@@ -277,10 +297,16 @@ def registerTempTable(self, name: str) -> None:
.. deprecated:: 2.0.0
Use :meth:`DataFrame.createOrReplaceTempView` instead.
+ Parameters
+ ----------
+ name : str
+ Name of the table to register.
Review Comment:
```suggestion
Name of the temporary table to register.
```
##########
python/pyspark/sql/dataframe.py:
##########
@@ -4100,7 +4532,8 @@ def toDF(self, *cols: "ColumnOrName") -> "DataFrame":
Parameters
----------
cols : str
Review Comment:
```suggestion
cols : str, :class:`Column`, or list
```
##########
python/pyspark/sql/dataframe.py:
##########
@@ -4100,7 +4532,8 @@ def toDF(self, *cols: "ColumnOrName") -> "DataFrame":
Parameters
----------
cols : str
- new column names
+ new column names. The length of the list needs to be the same as the number
+ of columns in the initial :class:`DataFrame`
Review Comment:
```suggestion
new column names (string) or expressions (:class:`Column`).
The length of the list needs to be the same as the number
of columns in the initial :class:`DataFrame`
```
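For illustration only, the length constraint described above can be modeled in plain Python. This is a minimal sketch, not Spark code: `rename_all` is a hypothetical helper, and `ValueError` stands in for whatever error Spark actually raises on a mismatch.

```python
from typing import List, Sequence


def rename_all(current: Sequence[str], *new_names: str) -> List[str]:
    """Toy model of positional renaming (hypothetical helper, not Spark code)."""
    # One new name is required per existing column, in the same order.
    if len(new_names) != len(current):
        raise ValueError(
            f"expected {len(current)} names, got {len(new_names)}"
        )
    return list(new_names)


# Two existing columns, so exactly two new names are needed.
print(rename_all(["age", "name"], "years", "first_name"))
```

Passing one or three names to the two-column example would raise in this model, which is the case the docstring sentence is warning about.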
##########
python/pyspark/sql/dataframe.py:
##########
@@ -3067,6 +3436,10 @@ def unionByName(self, other: "DataFrame", allowMissingColumns: bool = False) ->
----------
other : :class:`DataFrame`
Another :class:`DataFrame` that needs to be combined.
+ allowMissingColumns : bool, optional, default False
+ Specify whether to allow missing columns.
+
+ .. versionadded:: 3.1.0
Review Comment:
```suggestion
.. versionadded:: 3.1.0
```
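Since "allow missing columns" is fairly terse, a short illustration of the intended semantics might help readers. The following is a plain-Python model over lists of dicts, not Spark code: `union_by_name` and the dict-based rows are hypothetical stand-ins, and `ValueError` stands in for the analysis error Spark reports. It assumes the documented behavior that missing columns are filled with null and that columns missing from the left side are appended at the end of the result schema.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def union_by_name(
    left: List[Row], right: List[Row], allow_missing_columns: bool = False
) -> List[Row]:
    """Toy model of a by-name union (hypothetical helper, not Spark code)."""
    left_cols = list(left[0]) if left else []
    right_cols = list(right[0]) if right else []
    # Without the flag, both sides must have the same set of column names.
    if not allow_missing_columns and set(left_cols) != set(right_cols):
        raise ValueError("column names differ between the two inputs")
    # Keep the left side's column order, then append columns only on the right.
    all_cols = left_cols + [c for c in right_cols if c not in left_cols]
    # Columns absent from a row are filled with None (null).
    return [{c: row.get(c) for c in all_cols} for row in left + right]


left = [{"a": 1, "b": 2}]
right = [{"b": 3, "c": 4}]
for row in union_by_name(left, right, allow_missing_columns=True):
    print(row)
```

With the flag left at its default of False, the same two inputs raise in this model because their column sets differ.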
##########
python/pyspark/sql/dataframe.py:
##########
@@ -4153,6 +4594,7 @@ def transform(self, func: Callable[..., "DataFrame"], *args: Any, **kwargs: Any)
| 1| 1|
| 2| 2|
+-----+---+
+
Review Comment:
```suggestion
```
##########
python/pyspark/sql/dataframe.py:
##########
@@ -344,10 +372,16 @@ def createOrReplaceTempView(self, name: str) -> None:
Examples
--------
+ Create a local temporary view named 'people'
Review Comment:
```suggestion
Create a local temporary view named 'people'.
```
##########
python/pyspark/sql/dataframe.py:
##########
@@ -2412,14 +2657,40 @@ def __getitem__(self, item: Union[int, str, Column, List, Tuple]) -> Union[Colum
Examples
--------
- >>> df.select(df['age']).collect()
- [Row(age=2), Row(age=5)]
- >>> df[ ["name", "age"]].collect()
- [Row(name='Alice', age=2), Row(name='Bob', age=5)]
- >>> df[ df.age > 3 ].collect()
- [Row(age=5, name='Bob')]
- >>> df[df[0] > 3].collect()
- [Row(age=5, name='Bob')]
+ >>> df = spark.createDataFrame([
+ ... (2, "Alice"), (5, "Bob")], schema=["age", "name"])
+
+ Retrieve a column instance.
+
+ >>> df.select(df['age']).show()
+ +---+
+ |age|
+ +---+
+ | 2|
+ | 5|
+ +---+
+
+ Selecting multiple string columns as index.
Review Comment:
Would it be better to use the verb without `-ing`? This is the only place where I see the `-ing` form used.
```suggestion
Select multiple string columns as index.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org