Posted to commits@spark.apache.org by ze...@apache.org on 2021/07/27 07:10:37 UTC

[spark] branch branch-3.1 updated: [SPARK-36211][PYTHON] Correct typing of `udf` return value

This is an automated email from the ASF dual-hosted git repository.

zero323 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 682b306  [SPARK-36211][PYTHON] Correct typing of `udf` return value
682b306 is described below

commit 682b306f8bbbbe189ec7f2b8179b16741adad396
Author: Luran He <lu...@gmail.com>
AuthorDate: Tue Jul 27 09:07:22 2021 +0200

    [SPARK-36211][PYTHON] Correct typing of `udf` return value
    
    The following code should type-check:
    
    ```python3
    import uuid
    
    import pyspark.sql.functions as F
    
    my_udf = F.udf(lambda: str(uuid.uuid4())).asNondeterministic()
    ```
    
    ### What changes were proposed in this pull request?
    
    The `udf` function should return a more specific type.
    
    ### Why are the changes needed?
    
    Right now, `mypy` throws spurious errors on code such as the example given above.
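
    For illustration (this snippet is not part of the patch), the spurious error looks roughly like the comment below; the exact wording depends on the `mypy` version:

    ```python3
    import uuid

    import pyspark.sql.functions as F

    # Under the old stub, F.udf(...) was typed as a bare Callable[..., Column],
    # which declares no asNondeterministic member, so mypy reported something like
    #   "Callable[..., Column]" has no attribute "asNondeterministic"
    # even though the chained call is perfectly valid at runtime.
    my_udf = F.udf(lambda: str(uuid.uuid4())).asNondeterministic()
    ```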
    
    ### Does this PR introduce _any_ user-facing change?
    
    No
    
    ### How was this patch tested?
    
    This was not tested. Sorry, I am not very familiar with this repo -- are there any typing tests?
    
    Closes #33399 from luranhe/patch-1.
    
    Lead-authored-by: Luran He <lu...@gmail.com>
    Co-authored-by: Luran He <lu...@compass.com>
    Signed-off-by: zero323 <ms...@gmail.com>
    (cherry picked from commit ede1bc6b51c23b2d857b497d335b8e7fe3a5e0cc)
    Signed-off-by: zero323 <ms...@gmail.com>
---
 python/pyspark/sql/_typing.pyi   | 12 +++++++++---
 python/pyspark/sql/functions.pyi |  7 ++++---
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/python/pyspark/sql/_typing.pyi b/python/pyspark/sql/_typing.pyi
index 799a732..1a3bd8f 100644
--- a/python/pyspark/sql/_typing.pyi
+++ b/python/pyspark/sql/_typing.pyi
@@ -18,6 +18,7 @@
 
 from typing import (
     Any,
+    Callable,
     List,
     Optional,
     Tuple,
@@ -30,11 +31,10 @@ import datetime
 import decimal
 
 from pyspark._typing import PrimitiveType
-import pyspark.sql.column
 import pyspark.sql.types
 from pyspark.sql.column import Column
 
-ColumnOrName = Union[pyspark.sql.column.Column, str]
+ColumnOrName = Union[Column, str]
 DecimalLiteral = decimal.Decimal
 DateTimeLiteral = Union[datetime.datetime, datetime.date]
 LiteralType = PrimitiveType
@@ -54,4 +54,10 @@ class SupportsClose(Protocol):
     def close(self, error: Exception) -> None: ...
 
 class UserDefinedFunctionLike(Protocol):
-    def __call__(self, *_: ColumnOrName) -> Column: ...
+    func: Callable[..., Any]
+    evalType: int
+    deterministic: bool
+    @property
+    def returnType(self) -> pyspark.sql.types.DataType: ...
+    def __call__(self, *args: ColumnOrName) -> Column: ...
+    def asNondeterministic(self) -> UserDefinedFunctionLike: ...
diff --git a/python/pyspark/sql/functions.pyi b/python/pyspark/sql/functions.pyi
index 5fec6fd..749bcce 100644
--- a/python/pyspark/sql/functions.pyi
+++ b/python/pyspark/sql/functions.pyi
@@ -22,6 +22,7 @@ from typing import Any, Callable, Dict, List, Optional, Union
 from pyspark.sql._typing import (
     ColumnOrName,
     DataTypeOrString,
+    UserDefinedFunctionLike,
 )
 from pyspark.sql.pandas.functions import (  # noqa: F401
     pandas_udf as pandas_udf,
@@ -346,13 +347,13 @@ def variance(col: ColumnOrName) -> Column: ...
 @overload
 def udf(
     f: Callable[..., Any], returnType: DataTypeOrString = ...
-) -> Callable[..., Column]: ...
+) -> UserDefinedFunctionLike: ...
 @overload
 def udf(
     f: DataTypeOrString = ...,
-) -> Callable[[Callable[..., Any]], Callable[..., Column]]: ...
+) -> Callable[[Callable[..., Any]], UserDefinedFunctionLike]: ...
 @overload
 def udf(
     *,
     returnType: DataTypeOrString = ...,
-) -> Callable[[Callable[..., Any]], Callable[..., Column]]: ...
+) -> Callable[[Callable[..., Any]], UserDefinedFunctionLike]: ...
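
For context, here is a minimal sketch (not part of the commit) of usage that now passes `mypy` with the updated stubs; the names `random_id` and `upper_case`, and the choice of `StringType`, are illustrative only:

```python3
import uuid

import pyspark.sql.functions as F
from pyspark.sql.types import StringType

# Direct form: udf(...) is now typed as UserDefinedFunctionLike, so the
# chained asNondeterministic() call type-checks instead of being flagged.
random_id = F.udf(lambda: str(uuid.uuid4()), StringType()).asNondeterministic()

# Decorator form: udf(returnType=...) now returns
# Callable[[Callable[..., Any]], UserDefinedFunctionLike], so the decorated
# function is recognized as a UDF rather than as a plain callable.
@F.udf(returnType=StringType())
def upper_case(s: str) -> str:
    return s.upper()
```

No active SparkSession is needed just to construct these UDFs; one is only required once they are applied to DataFrame columns.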
