Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/29 08:01:27 UTC
[GitHub] [spark] dcoliversun commented on a diff in pull request #37702: [SPARK-40012][PYTHON][DOCS] Make pyspark.sql.dataframe examples self-contained

dcoliversun commented on code in PR #37702:
URL: https://github.com/apache/spark/pull/37702#discussion_r956980017


##########
python/pyspark/sql/dataframe.py:
##########
@@ -277,10 +297,16 @@ def registerTempTable(self, name: str) -> None:
         .. deprecated:: 2.0.0
             Use :meth:`DataFrame.createOrReplaceTempView` instead.
 
+        Parameters
+        ----------
+        name : str
+            Name of the table to register.

Review Comment:
   ```suggestion
               Name of the temporary table to register.
   ```



##########
python/pyspark/sql/dataframe.py:
##########
@@ -4100,7 +4532,8 @@ def toDF(self, *cols: "ColumnOrName") -> "DataFrame":
         Parameters
         ----------
         cols : str

Review Comment:
   ```suggestion
           cols : str, :class:`Column`, or list
   ```



##########
python/pyspark/sql/dataframe.py:
##########
@@ -4100,7 +4532,8 @@ def toDF(self, *cols: "ColumnOrName") -> "DataFrame":
         Parameters
         ----------
         cols : str
-            new column names
+            new column names. The length of the list needs to be the same as the number
+            of columns in the initial :class:`DataFrame`

Review Comment:
   ```suggestion
               new column names (string) or expressions (:class:`Column`).
               The length of the list needs to be the same as the number
               of columns in the initial :class:`DataFrame`
   ```
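A minimal sketch of the length rule described above, assuming string column names only. `rename_all_columns` and `demo` are hypothetical names used for illustration; the only PySpark calls relied on are `DataFrame.columns`, `DataFrame.toDF`, `SparkSession.builder.getOrCreate` and `createDataFrame`.

```python
from typing import Any, Sequence


def rename_all_columns(df: Any, new_names: Sequence[str]) -> Any:
    """Rename every column of `df` positionally via `df.toDF(*new_names)`.

    Hypothetical helper, not part of PySpark. It only uses `df.columns`
    and `df.toDF`, so any DataFrame-like object works.
    """
    expected = len(df.columns)
    # toDF renames by position, so the counts have to match exactly.
    if len(new_names) != expected:
        raise ValueError(
            f"toDF needs {expected} column names, got {len(new_names)}"
        )
    return df.toDF(*new_names)


def demo() -> None:
    """Run the helper against a real SparkSession (needs pyspark and a JVM)."""
    from pyspark.sql import SparkSession  # imported lazily on purpose

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame(
        [(2, "Alice"), (5, "Bob")], schema=["age", "name"]
    )
    rename_all_columns(df, ["years", "first_name"]).show()
```

Checking the count up front is a design choice for the sketch: it reports the mismatch as a plain `ValueError` before the call reaches Spark.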



##########
python/pyspark/sql/dataframe.py:
##########
@@ -3067,6 +3436,10 @@ def unionByName(self, other: "DataFrame", allowMissingColumns: bool = False) ->
         ----------
         other : :class:`DataFrame`
             Another :class:`DataFrame` that needs to be combined.
+        allowMissingColumns : bool, optional, default False
+           Specify whether to allow missing columns.
+
+           .. versionadded:: 3.1.0

Review Comment:
   ```suggestion
           .. versionadded:: 3.1.0
   ```
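To illustrate what "allow missing columns" means, here is a plain-Python model of the documented behaviour, not Spark's implementation. It assumes rows are dicts keyed by column name; `union_by_name_model` and its parameters are hypothetical names. With the flag set, columns missing on one side are filled with `None` and columns found only in `other` are appended at the end.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def union_by_name_model(
    left_cols: Sequence[str],
    left_rows: Sequence[Row],
    right_cols: Sequence[str],
    right_rows: Sequence[Row],
    allow_missing_columns: bool = False,
) -> Tuple[List[str], List[Tuple[Any, ...]]]:
    """Model of a union that matches columns by name instead of position."""
    if not allow_missing_columns and set(left_cols) != set(right_cols):
        raise ValueError(
            "column sets differ; pass allow_missing_columns=True to fill gaps"
        )
    # Left columns keep their order; right-only columns go at the end.
    out_cols = list(left_cols) + [c for c in right_cols if c not in left_cols]
    # dict.get returns None for a column the row does not have.
    out_rows = [
        tuple(row.get(c) for c in out_cols)
        for row in list(left_rows) + list(right_rows)
    ]
    return out_cols, out_rows
```

In PySpark itself the equivalent call would be `df1.unionByName(df2, allowMissingColumns=True)`.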



##########
python/pyspark/sql/dataframe.py:
##########
@@ -4153,6 +4594,7 @@ def transform(self, func: Callable[..., "DataFrame"], *args: Any, **kwargs: Any)
         |    1|  1|
         |    2|  2|
         +-----+---+
+

Review Comment:
   ```suggestion
   ```



##########
python/pyspark/sql/dataframe.py:
##########
@@ -344,10 +372,16 @@ def createOrReplaceTempView(self, name: str) -> None:
 
         Examples
         --------
+        Create a local temporary view named 'people'

Review Comment:
   ```suggestion
           Create a local temporary view named 'people'.
   ```



##########
python/pyspark/sql/dataframe.py:
##########
@@ -2412,14 +2657,40 @@ def __getitem__(self, item: Union[int, str, Column, List, Tuple]) -> Union[Colum
 
         Examples
         --------
-        >>> df.select(df['age']).collect()
-        [Row(age=2), Row(age=5)]
-        >>> df[ ["name", "age"]].collect()
-        [Row(name='Alice', age=2), Row(name='Bob', age=5)]
-        >>> df[ df.age > 3 ].collect()
-        [Row(age=5, name='Bob')]
-        >>> df[df[0] > 3].collect()
-        [Row(age=5, name='Bob')]
+        >>> df = spark.createDataFrame([
+        ...     (2, "Alice"), (5, "Bob")], schema=["age", "name"])
+
+        Retrieve a column instance.
+
+        >>> df.select(df['age']).show()
+        +---+
+        |age|
+        +---+
+        |  2|
+        |  5|
+        +---+
+
+        Selecting multiple string columns as index.

Review Comment:
   Would it be better to use the imperative form rather than an `-ing` verb? This is the only place I see `-ing` used.
   ```suggestion
           Select multiple string columns as index.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

