Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/09 09:44:06 UTC

[GitHub] [spark] zhengruifeng opened a new pull request, #37449: [SPARK-39734][SPARK-36259][SPARK-39733][SPARK-37348][PYTHON] Functions Parity between PySpark and SQL

zhengruifeng opened a new pull request, #37449:
URL: https://github.com/apache/spark/pull/37449

   ### What changes were proposed in this pull request?
   Implement the missing functions in PySpark:
   
   -  call_udf
   -  localtimestamp
   -  map_contains_key
   -  pmod
   
   After this PR, all functions in `org.apache.spark.sql.functions` can be found in `pyspark.sql.functions` or have equivalents (e.g. `not` -> `~`).
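   Of the four, `pmod` has the least obvious semantics. A minimal pure-Python sketch of the positive-modulus behavior for a positive divisor (an illustration only, not the PySpark API, which operates on `Column`s and also handles NaN inputs):

   ```python
   def pmod_py(dividend: float, divisor: float) -> float:
       """Positive modulus: the remainder shifted into [0, divisor) for divisor > 0.

       Plain-Python illustration; Spark's pmod additionally handles NaN inputs
       and negative divisors at the SQL level.
       """
       return ((dividend % divisor) + divisor) % divisor
   ```

   For example, `pmod_py(-3.0, 4.0)` returns `1.0`, matching `SELECT pmod(-3, 4)` in Spark SQL, whereas Java-style remainder would give `-3`.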
   
   ### Why are the changes needed?
   For function parity between `pyspark.sql.functions` and `org.apache.spark.sql.functions`.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, four new APIs are added.
   
   
   ### How was this patch tested?
   Added doctests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37449: [SPARK-39734][SPARK-36259][SPARK-39733][SPARK-37348][PYTHON] Functions Parity between PySpark and SQL

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37449:
URL: https://github.com/apache/spark/pull/37449#discussion_r941163013


##########
python/pyspark/sql/functions.py:
##########
@@ -1077,6 +1077,35 @@ def pow(col1: Union["ColumnOrName", float], col2: Union["ColumnOrName", float])
     return _invoke_binary_math_function("pow", col1, col2)
 
 
+def pmod(dividend: Union["ColumnOrName", float], divisor: Union["ColumnOrName", float]) -> Column:

Review Comment:
   Can we also have a `Parameters` section?
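   A numpydoc-style `Parameters` section for `pmod` might look like the following (the wording is illustrative, not the text that was merged):

   ```python
   def pmod(dividend, divisor):
       """
       Returns the positive value of dividend mod divisor.

       Parameters
       ----------
       dividend : :class:`~pyspark.sql.Column` or str or float
           the column that contains the dividend, or the specified dividend value
       divisor : :class:`~pyspark.sql.Column` or str or float
           the column that contains the divisor, or the specified divisor value
       """
   ```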




[GitHub] [spark] zhengruifeng commented on a diff in pull request #37449: [SPARK-39734][SPARK-36259][SPARK-39733][SPARK-37348][PYTHON] Functions Parity between PySpark and SQL

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #37449:
URL: https://github.com/apache/spark/pull/37449#discussion_r941394288


##########
python/pyspark/sql/functions.py:
##########
@@ -1077,6 +1077,35 @@ def pow(col1: Union["ColumnOrName", float], col2: Union["ColumnOrName", float])
     return _invoke_binary_math_function("pow", col1, col2)
 
 
+def pmod(dividend: Union["ColumnOrName", float], divisor: Union["ColumnOrName", float]) -> Column:
+    """
+    Returns the positive value of dividend mod divisor.
+
+    .. versionadded:: 3.4.0
+
+    Examples
+    --------
+    >>> df = spark.createDataFrame([
+    ...     (1.0, float('nan')), (float('nan'), 2.0),
+    ...     (float('nan'), float('nan')), (-3.0, 4.0),
+    ...      (-5.0, -6.0), (7.0, -8.0), (1.0, 2.0)],

Review Comment:
   done. thanks!



##########
python/pyspark/sql/functions.py:
##########
@@ -1077,6 +1077,35 @@ def pow(col1: Union["ColumnOrName", float], col2: Union["ColumnOrName", float])
     return _invoke_binary_math_function("pow", col1, col2)
 
 
+def pmod(dividend: Union["ColumnOrName", float], divisor: Union["ColumnOrName", float]) -> Column:

Review Comment:
   updated!




[GitHub] [spark] zhengruifeng commented on pull request #37449: [SPARK-39734][SPARK-36259][SPARK-39733][SPARK-37348][PYTHON] Functions Parity between PySpark and SQL

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #37449:
URL: https://github.com/apache/spark/pull/37449#issuecomment-1210551253

   Thank you all! Merged into master.



[GitHub] [spark] aray commented on a diff in pull request #37449: [SPARK-39734][SPARK-36259][SPARK-39733][SPARK-37348][PYTHON] Functions Parity between PySpark and SQL

Posted by GitBox <gi...@apache.org>.
aray commented on code in PR #37449:
URL: https://github.com/apache/spark/pull/37449#discussion_r941273835


##########
python/pyspark/sql/functions.py:
##########
@@ -1077,6 +1077,35 @@ def pow(col1: Union["ColumnOrName", float], col2: Union["ColumnOrName", float])
     return _invoke_binary_math_function("pow", col1, col2)
 
 
+def pmod(dividend: Union["ColumnOrName", float], divisor: Union["ColumnOrName", float]) -> Column:
+    """
+    Returns the positive value of dividend mod divisor.
+
+    .. versionadded:: 3.4.0
+
+    Examples
+    --------
+    >>> df = spark.createDataFrame([
+    ...     (1.0, float('nan')), (float('nan'), 2.0),
+    ...     (float('nan'), float('nan')), (-3.0, 4.0),
+    ...      (-5.0, -6.0), (7.0, -8.0), (1.0, 2.0)],

Review Comment:
   These are all good examples, but it would be good to also include examples where |dividend| > divisor > 0, as that is the typical usage of this function.
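   For the typical case |dividend| > divisor > 0, the expected results can be illustrated in plain Python (Python's `%` already yields a result with the sign of the divisor, so it matches pmod when the divisor is positive):

   ```python
   # Positive-modulus results for ordinary inputs where |dividend| > divisor > 0.
   print(13 % 5)    # 3
   print(-13 % 5)   # 2: shifted into [0, 5), unlike Java-style remainder (-3)
   ```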




[GitHub] [spark] zhengruifeng closed pull request #37449: [SPARK-39734][SPARK-36259][SPARK-39733][SPARK-37348][PYTHON] Functions Parity between PySpark and SQL

Posted by GitBox <gi...@apache.org>.
zhengruifeng closed pull request #37449: [SPARK-39734][SPARK-36259][SPARK-39733][SPARK-37348][PYTHON] Functions Parity between PySpark and SQL
URL: https://github.com/apache/spark/pull/37449

