Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/23 23:38:39 UTC

[GitHub] [spark] zero323 opened a new pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

zero323 opened a new pull request #30143:
URL: https://github.com/apache/spark/pull/30143


   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   - [x] Expand dictionary definitions into standalone functions.
   - [ ] Add proper NumPy-style docstrings to expanded functions.
   - [ ] Adjust docstring of pre-existing standalone functions.
   - [ ] Group and sort definitions within groups.
   - [x] Fix annotations for ordering functions.
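
   The first checklist item, expanding dictionary definitions into standalone functions, can be sketched in plain Python. The names below (`_FUNCTIONS`, `upper_standalone`) are illustrative only, not the actual PySpark identifiers: the old pattern generates wrappers from a name-to-docstring dictionary, which hides signatures and docstrings from static analyzers, while the new pattern writes each function out explicitly.

   ```python
   # Before: functions generated dynamically from a dict of name -> docstring.
   _FUNCTIONS = {
       "upper": "Converts a string to upper case.",
       "lower": "Converts a string to lower case.",
   }

   def _create_function(name, doc=""):
       """Generate a wrapper function dynamically (the old pattern)."""
       def _(s):
           # Dispatch to the str method of the same name.
           return getattr(str, name)(s)
       _.__name__ = name
       _.__doc__ = doc
       return _

   for _name, _doc in _FUNCTIONS.items():
       globals()[_name] = _create_function(_name, _doc)

   # After: a standalone definition with its own docstring, visible to
   # IDEs, type checkers, and documentation tools without executing code.
   def upper_standalone(s):
       """Converts a string to upper case."""
       return s.upper()
   ```

   Both styles behave identically at runtime; the standalone form trades a few more lines for tooling-friendly, individually documented definitions.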
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   
   Existing tests.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716383286








[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716417391








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716360611








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715859220








[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716395161


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34863/
   




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715833374


   **[Test build #130230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130230/testReport)** for PR 30143 at commit [`738fdde`](https://github.com/apache/spark/commit/738fdde521a7bcd101a76f6880ec94e94b0e6a91).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716350898


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716417391








[GitHub] [spark] HyukjinKwon commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715647106


   Nice, @zero323. Thanks for working on this.




[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715875784








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716395189








[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715879506








[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715879219


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34831/
   




[GitHub] [spark] zero323 commented on a change in pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #30143:
URL: https://github.com/apache/spark/pull/30143#discussion_r512488156



##########
File path: python/pyspark/sql/functions.py
##########
@@ -42,154 +41,457 @@
 # since it requires to make every single overridden definition.
 
 
-def _create_function(name, doc=""):
-    """Create a PySpark function by its name"""
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+def _get_get_jvm_function(name, sc):
+    """
+    Retrieves JVM function identified by name from
+    Java gateway associated with sc.
+    """
+    return getattr(sc._jvm.functions, name)
 
 
-def _create_function_over_column(name, doc=""):
-    """Similar with `_create_function` but creates a PySpark function that takes a column
-    (as string as well). This is mainly for PySpark functions to take strings as
-    column names.
+def _invoke_function(name, *args):
+    """
+    Invokes JVM function identified by name with args
+    and wraps the result with :class:`Column`.
     """
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(_to_java_column(col))
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+    jf = _get_get_jvm_function(name, SparkContext._active_spark_context)
+    return Column(jf(*args))
 
 
-def _wrap_deprecated_function(func, message):
-    """ Wrap the deprecated function to print out deprecation warnings"""
-    def _(col):
-        warnings.warn(message, DeprecationWarning)
-        return func(col)
-    return functools.wraps(func)(_)
+def _invoke_function_over_column(name, col):
+    """
+    Invokes unary JVM function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(name, _to_java_column(col))
 
 
-def _create_binary_mathfunction(name, doc=""):
-    """ Create a binary mathfunction by name"""
-    def _(col1, col2):
-        sc = SparkContext._active_spark_context
+def _invoke_binary_math_function(name, col1, col2):
+    """
+    Invokes binary JVM math function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(
+        name,
         # For legacy reasons, the arguments here can be implicitly converted into floats,
         # if they are not columns or strings.
-        if isinstance(col1, Column):
-            arg1 = col1._jc
-        elif isinstance(col1, str):
-            arg1 = _create_column_from_name(col1)
-        else:
-            arg1 = float(col1)
-
-        if isinstance(col2, Column):
-            arg2 = col2._jc
-        elif isinstance(col2, str):
-            arg2 = _create_column_from_name(col2)
-        else:
-            arg2 = float(col2)
-
-        jc = getattr(sc._jvm.functions, name)(arg1, arg2)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_window_function(name, doc=''):
-    """ Create a window function by name """
-    def _():
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)()
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = 'Window function: ' + doc
-    return _
+        _to_java_column(col1) if isinstance(col1, (str, Column)) else float(col1),
+        _to_java_column(col2) if isinstance(col2, (str, Column)) else float(col2)
+    )
 
 
 def _options_to_str(options):
     return {key: to_str(value) for (key, value) in options.items()}
 
-_lit_doc = """
+
+@since(1.3)
+def lit(col):
+    """
     Creates a :class:`Column` of literal value.
 
     >>> df.select(lit(5).alias('height')).withColumn('spark_user', lit(True)).take(1)
     [Row(height=5, spark_user=True)]
     """
-_functions = {
-    'lit': _lit_doc,
-    'col': 'Returns a :class:`Column` based on the given column name.',
-    'column': 'Returns a :class:`Column` based on the given column name.',
-    'asc': 'Returns a sort expression based on the ascending order of the given column name.',
-    'desc': 'Returns a sort expression based on the descending order of the given column name.',
-}
-
-_functions_over_column = {
-    'sqrt': 'Computes the square root of the specified float value.',
-    'abs': 'Computes the absolute value.',
-
-    'max': 'Aggregate function: returns the maximum value of the expression in a group.',
-    'min': 'Aggregate function: returns the minimum value of the expression in a group.',
-    'count': 'Aggregate function: returns the number of items in a group.',
-    'sum': 'Aggregate function: returns the sum of all values in the expression.',
-    'avg': 'Aggregate function: returns the average of the values in a group.',
-    'mean': 'Aggregate function: returns the average of the values in a group.',
-    'sumDistinct': 'Aggregate function: returns the sum of distinct values in the expression.',
-}
-
-_functions_1_4_over_column = {
-    # unary math functions
-    'acos': ':return: inverse cosine of `col`, as if computed by `java.lang.Math.acos()`',
-    'asin': ':return: inverse sine of `col`, as if computed by `java.lang.Math.asin()`',
-    'atan': ':return: inverse tangent of `col`, as if computed by `java.lang.Math.atan()`',
-    'cbrt': 'Computes the cube-root of the given value.',
-    'ceil': 'Computes the ceiling of the given value.',
-    'cos': """:param col: angle in radians
-           :return: cosine of the angle, as if computed by `java.lang.Math.cos()`.""",
-    'cosh': """:param col: hyperbolic angle
-           :return: hyperbolic cosine of the angle, as if computed by `java.lang.Math.cosh()`""",
-    'exp': 'Computes the exponential of the given value.',
-    'expm1': 'Computes the exponential of the given value minus one.',
-    'floor': 'Computes the floor of the given value.',
-    'log': 'Computes the natural logarithm of the given value.',
-    'log10': 'Computes the logarithm of the given value in Base 10.',
-    'log1p': 'Computes the natural logarithm of the given value plus one.',
-    'rint': 'Returns the double value that is closest in value to the argument and' +
-            ' is equal to a mathematical integer.',
-    'signum': 'Computes the signum of the given value.',
-    'sin': """:param col: angle in radians
-           :return: sine of the angle, as if computed by `java.lang.Math.sin()`""",
-    'sinh': """:param col: hyperbolic angle
-           :return: hyperbolic sine of the given value,
-                    as if computed by `java.lang.Math.sinh()`""",
-    'tan': """:param col: angle in radians
-           :return: tangent of the given value, as if computed by `java.lang.Math.tan()`""",
-    'tanh': """:param col: hyperbolic angle
-            :return: hyperbolic tangent of the given value,
-                     as if computed by `java.lang.Math.tanh()`""",
-    'toDegrees': '.. note:: Deprecated in 2.1, use :func:`degrees` instead.',
-    'toRadians': '.. note:: Deprecated in 2.1, use :func:`radians` instead.',
-    'bitwiseNOT': 'Computes bitwise not.',
-}
-
-_functions_2_4 = {
-    'asc_nulls_first': 'Returns a sort expression based on the ascending order of the given' +
-                       ' column name, and null values return before non-null values.',
-    'asc_nulls_last': 'Returns a sort expression based on the ascending order of the given' +
-                      ' column name, and null values appear after non-null values.',
-    'desc_nulls_first': 'Returns a sort expression based on the descending order of the given' +
-                        ' column name, and null values appear before non-null values.',
-    'desc_nulls_last': 'Returns a sort expression based on the descending order of the given' +
-                       ' column name, and null values appear after non-null values',
-}
-
-_collect_list_doc = """
+    return col if isinstance(col, Column) else _invoke_function("lit", col)
+
+
+@since(1.3)
+def col(col):
+    """
+    Returns a :class:`Column` based on the given column name.'
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def column(col):
+    """
+    Returns a :class:`Column` based on the given column name.'
+    """
+    return col(col)
+
+
+@since(1.3)
+def asc(col):
+    """
+    Returns a sort expression based on the ascending order of the given column name.
+    """
+    return _invoke_function("asc", col)

Review comment:
       I wonder if it makes more sense, for the sake of consistency, to add `Column` signatures on the Scala side.
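
A minimal, self-contained illustration of the str-vs-`Column` dispatch this comment is discussing. The classes and helpers below are hypothetical stand-ins, not PySpark code: the Python-side functions accept either a column name or a column object and normalize the argument before invoking the backend, which is the asymmetry relative to the Scala signatures.

```python
class Column:
    """Toy stand-in for a column expression (hypothetical, not PySpark's)."""
    def __init__(self, name):
        self.name = name

def _to_column(col):
    """Accept a Column or a plain column name and normalize to Column."""
    return col if isinstance(col, Column) else Column(col)

def asc(col):
    """Sort-expression helper accepting either form, like the Python API."""
    c = _to_column(col)
    # Return a simple tag instead of a real JVM expression.
    return ("asc", c.name)
```

With this normalization, `asc("age")` and `asc(Column("age"))` produce the same expression, whereas a Scala API without the extra overload would accept only one of the two forms.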






[GitHub] [spark] HyukjinKwon commented on a change in pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #30143:
URL: https://github.com/apache/spark/pull/30143#discussion_r512367831



##########
File path: python/pyspark/sql/functions.py
##########
@@ -42,154 +41,457 @@
 # since it requires to make every single overridden definition.
 
 
-def _create_function(name, doc=""):
-    """Create a PySpark function by its name"""
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+def _get_get_jvm_function(name, sc):
+    """
+    Retrieves JVM function identified by name from
+    Java gateway associated with sc.
+    """
+    return getattr(sc._jvm.functions, name)
 
 
-def _create_function_over_column(name, doc=""):
-    """Similar with `_create_function` but creates a PySpark function that takes a column
-    (as string as well). This is mainly for PySpark functions to take strings as
-    column names.
+def _invoke_function(name, *args):
+    """
+    Invokes JVM function identified by name with args
+    and wraps the result with :class:`Column`.
     """
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(_to_java_column(col))
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+    jf = _get_get_jvm_function(name, SparkContext._active_spark_context)
+    return Column(jf(*args))
 
 
-def _wrap_deprecated_function(func, message):
-    """ Wrap the deprecated function to print out deprecation warnings"""
-    def _(col):
-        warnings.warn(message, DeprecationWarning)
-        return func(col)
-    return functools.wraps(func)(_)
+def _invoke_function_over_column(name, col):
+    """
+    Invokes unary JVM function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(name, _to_java_column(col))
 
 
-def _create_binary_mathfunction(name, doc=""):
-    """ Create a binary mathfunction by name"""
-    def _(col1, col2):
-        sc = SparkContext._active_spark_context
+def _invoke_binary_math_function(name, col1, col2):
+    """
+    Invokes binary JVM math function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(
+        name,
         # For legacy reasons, the arguments here can be implicitly converted into floats,
         # if they are not columns or strings.
-        if isinstance(col1, Column):
-            arg1 = col1._jc
-        elif isinstance(col1, str):
-            arg1 = _create_column_from_name(col1)
-        else:
-            arg1 = float(col1)
-
-        if isinstance(col2, Column):
-            arg2 = col2._jc
-        elif isinstance(col2, str):
-            arg2 = _create_column_from_name(col2)
-        else:
-            arg2 = float(col2)
-
-        jc = getattr(sc._jvm.functions, name)(arg1, arg2)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_window_function(name, doc=''):
-    """ Create a window function by name """
-    def _():
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)()
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = 'Window function: ' + doc
-    return _
+        _to_java_column(col1) if isinstance(col1, (str, Column)) else float(col1),
+        _to_java_column(col2) if isinstance(col2, (str, Column)) else float(col2)
+    )
 
 
 def _options_to_str(options):
     return {key: to_str(value) for (key, value) in options.items()}
 
-_lit_doc = """
+
+@since(1.3)
+def lit(col):
+    """
     Creates a :class:`Column` of literal value.
 
     >>> df.select(lit(5).alias('height')).withColumn('spark_user', lit(True)).take(1)
     [Row(height=5, spark_user=True)]
     """
-_functions = {
-    'lit': _lit_doc,
-    'col': 'Returns a :class:`Column` based on the given column name.',
-    'column': 'Returns a :class:`Column` based on the given column name.',
-    'asc': 'Returns a sort expression based on the ascending order of the given column name.',
-    'desc': 'Returns a sort expression based on the descending order of the given column name.',
-}
-
-_functions_over_column = {
-    'sqrt': 'Computes the square root of the specified float value.',
-    'abs': 'Computes the absolute value.',
-
-    'max': 'Aggregate function: returns the maximum value of the expression in a group.',
-    'min': 'Aggregate function: returns the minimum value of the expression in a group.',
-    'count': 'Aggregate function: returns the number of items in a group.',
-    'sum': 'Aggregate function: returns the sum of all values in the expression.',
-    'avg': 'Aggregate function: returns the average of the values in a group.',
-    'mean': 'Aggregate function: returns the average of the values in a group.',
-    'sumDistinct': 'Aggregate function: returns the sum of distinct values in the expression.',
-}
-
-_functions_1_4_over_column = {
-    # unary math functions
-    'acos': ':return: inverse cosine of `col`, as if computed by `java.lang.Math.acos()`',
-    'asin': ':return: inverse sine of `col`, as if computed by `java.lang.Math.asin()`',
-    'atan': ':return: inverse tangent of `col`, as if computed by `java.lang.Math.atan()`',
-    'cbrt': 'Computes the cube-root of the given value.',
-    'ceil': 'Computes the ceiling of the given value.',
-    'cos': """:param col: angle in radians
-           :return: cosine of the angle, as if computed by `java.lang.Math.cos()`.""",
-    'cosh': """:param col: hyperbolic angle
-           :return: hyperbolic cosine of the angle, as if computed by `java.lang.Math.cosh()`""",
-    'exp': 'Computes the exponential of the given value.',
-    'expm1': 'Computes the exponential of the given value minus one.',
-    'floor': 'Computes the floor of the given value.',
-    'log': 'Computes the natural logarithm of the given value.',
-    'log10': 'Computes the logarithm of the given value in Base 10.',
-    'log1p': 'Computes the natural logarithm of the given value plus one.',
-    'rint': 'Returns the double value that is closest in value to the argument and' +
-            ' is equal to a mathematical integer.',
-    'signum': 'Computes the signum of the given value.',
-    'sin': """:param col: angle in radians
-           :return: sine of the angle, as if computed by `java.lang.Math.sin()`""",
-    'sinh': """:param col: hyperbolic angle
-           :return: hyperbolic sine of the given value,
-                    as if computed by `java.lang.Math.sinh()`""",
-    'tan': """:param col: angle in radians
-           :return: tangent of the given value, as if computed by `java.lang.Math.tan()`""",
-    'tanh': """:param col: hyperbolic angle
-            :return: hyperbolic tangent of the given value,
-                     as if computed by `java.lang.Math.tanh()`""",
-    'toDegrees': '.. note:: Deprecated in 2.1, use :func:`degrees` instead.',
-    'toRadians': '.. note:: Deprecated in 2.1, use :func:`radians` instead.',
-    'bitwiseNOT': 'Computes bitwise not.',
-}
-
-_functions_2_4 = {
-    'asc_nulls_first': 'Returns a sort expression based on the ascending order of the given' +
-                       ' column name, and null values return before non-null values.',
-    'asc_nulls_last': 'Returns a sort expression based on the ascending order of the given' +
-                      ' column name, and null values appear after non-null values.',
-    'desc_nulls_first': 'Returns a sort expression based on the descending order of the given' +
-                        ' column name, and null values appear before non-null values.',
-    'desc_nulls_last': 'Returns a sort expression based on the descending order of the given' +
-                       ' column name, and null values appear after non-null values',
-}
-
-_collect_list_doc = """
+    return col if isinstance(col, Column) else _invoke_function("lit", col)
+
+
+@since(1.3)
+def col(col):
+    """
+    Returns a :class:`Column` based on the given column name.
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def column(col):
+    """
+    Returns a :class:`Column` based on the given column name.
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def asc(col):
+    """
+    Returns a sort expression based on the ascending order of the given column name.
+    """
+    return _invoke_function("asc", col)
+
+
+@since(1.3)
+def desc(col):
+    """
+    Returns a sort expression based on the descending order of the given column name.
+    """
+    return _invoke_function("desc", col)

Review comment:
       This one would have to be the same as `asc` above.

##########
File path: python/pyspark/sql/functions.py
##########
@@ -42,154 +41,457 @@
 # since it requires to make every single overridden definition.
 
 
-def _create_function(name, doc=""):
-    """Create a PySpark function by its name"""
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+def _get_get_jvm_function(name, sc):
+    """
+    Retrieves JVM function identified by name from
+    Java gateway associated with sc.
+    """
+    return getattr(sc._jvm.functions, name)
 
 
-def _create_function_over_column(name, doc=""):
-    """Similar with `_create_function` but creates a PySpark function that takes a column
-    (as string as well). This is mainly for PySpark functions to take strings as
-    column names.
+def _invoke_function(name, *args):
+    """
+    Invokes JVM function identified by name with args
+    and wraps the result with :class:`Column`.
     """
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(_to_java_column(col))
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+    jf = _get_get_jvm_function(name, SparkContext._active_spark_context)
+    return Column(jf(*args))
 
 
-def _wrap_deprecated_function(func, message):
-    """ Wrap the deprecated function to print out deprecation warnings"""
-    def _(col):
-        warnings.warn(message, DeprecationWarning)
-        return func(col)
-    return functools.wraps(func)(_)
+def _invoke_function_over_column(name, col):
+    """
+    Invokes unary JVM function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(name, _to_java_column(col))
 
 
-def _create_binary_mathfunction(name, doc=""):
-    """ Create a binary mathfunction by name"""
-    def _(col1, col2):
-        sc = SparkContext._active_spark_context
+def _invoke_binary_math_function(name, col1, col2):
+    """
+    Invokes binary JVM math function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(
+        name,
         # For legacy reasons, the arguments here can be implicitly converted into floats,
         # if they are not columns or strings.
-        if isinstance(col1, Column):
-            arg1 = col1._jc
-        elif isinstance(col1, str):
-            arg1 = _create_column_from_name(col1)
-        else:
-            arg1 = float(col1)
-
-        if isinstance(col2, Column):
-            arg2 = col2._jc
-        elif isinstance(col2, str):
-            arg2 = _create_column_from_name(col2)
-        else:
-            arg2 = float(col2)
-
-        jc = getattr(sc._jvm.functions, name)(arg1, arg2)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_window_function(name, doc=''):
-    """ Create a window function by name """
-    def _():
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)()
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = 'Window function: ' + doc
-    return _
+        _to_java_column(col1) if isinstance(col1, (str, Column)) else float(col1),
+        _to_java_column(col2) if isinstance(col2, (str, Column)) else float(col2)
+    )
 
 
 def _options_to_str(options):
     return {key: to_str(value) for (key, value) in options.items()}
 
-_lit_doc = """
+
+@since(1.3)
+def lit(col):
+    """
     Creates a :class:`Column` of literal value.
 
     >>> df.select(lit(5).alias('height')).withColumn('spark_user', lit(True)).take(1)
     [Row(height=5, spark_user=True)]
     """
-_functions = {
-    'lit': _lit_doc,
-    'col': 'Returns a :class:`Column` based on the given column name.',
-    'column': 'Returns a :class:`Column` based on the given column name.',
-    'asc': 'Returns a sort expression based on the ascending order of the given column name.',
-    'desc': 'Returns a sort expression based on the descending order of the given column name.',
-}
-
-_functions_over_column = {
-    'sqrt': 'Computes the square root of the specified float value.',
-    'abs': 'Computes the absolute value.',
-
-    'max': 'Aggregate function: returns the maximum value of the expression in a group.',
-    'min': 'Aggregate function: returns the minimum value of the expression in a group.',
-    'count': 'Aggregate function: returns the number of items in a group.',
-    'sum': 'Aggregate function: returns the sum of all values in the expression.',
-    'avg': 'Aggregate function: returns the average of the values in a group.',
-    'mean': 'Aggregate function: returns the average of the values in a group.',
-    'sumDistinct': 'Aggregate function: returns the sum of distinct values in the expression.',
-}
-
-_functions_1_4_over_column = {
-    # unary math functions
-    'acos': ':return: inverse cosine of `col`, as if computed by `java.lang.Math.acos()`',
-    'asin': ':return: inverse sine of `col`, as if computed by `java.lang.Math.asin()`',
-    'atan': ':return: inverse tangent of `col`, as if computed by `java.lang.Math.atan()`',
-    'cbrt': 'Computes the cube-root of the given value.',
-    'ceil': 'Computes the ceiling of the given value.',
-    'cos': """:param col: angle in radians
-           :return: cosine of the angle, as if computed by `java.lang.Math.cos()`.""",
-    'cosh': """:param col: hyperbolic angle
-           :return: hyperbolic cosine of the angle, as if computed by `java.lang.Math.cosh()`""",
-    'exp': 'Computes the exponential of the given value.',
-    'expm1': 'Computes the exponential of the given value minus one.',
-    'floor': 'Computes the floor of the given value.',
-    'log': 'Computes the natural logarithm of the given value.',
-    'log10': 'Computes the logarithm of the given value in Base 10.',
-    'log1p': 'Computes the natural logarithm of the given value plus one.',
-    'rint': 'Returns the double value that is closest in value to the argument and' +
-            ' is equal to a mathematical integer.',
-    'signum': 'Computes the signum of the given value.',
-    'sin': """:param col: angle in radians
-           :return: sine of the angle, as if computed by `java.lang.Math.sin()`""",
-    'sinh': """:param col: hyperbolic angle
-           :return: hyperbolic sine of the given value,
-                    as if computed by `java.lang.Math.sinh()`""",
-    'tan': """:param col: angle in radians
-           :return: tangent of the given value, as if computed by `java.lang.Math.tan()`""",
-    'tanh': """:param col: hyperbolic angle
-            :return: hyperbolic tangent of the given value,
-                     as if computed by `java.lang.Math.tanh()`""",
-    'toDegrees': '.. note:: Deprecated in 2.1, use :func:`degrees` instead.',
-    'toRadians': '.. note:: Deprecated in 2.1, use :func:`radians` instead.',
-    'bitwiseNOT': 'Computes bitwise not.',
-}
-
-_functions_2_4 = {
-    'asc_nulls_first': 'Returns a sort expression based on the ascending order of the given' +
-                       ' column name, and null values return before non-null values.',
-    'asc_nulls_last': 'Returns a sort expression based on the ascending order of the given' +
-                      ' column name, and null values appear after non-null values.',
-    'desc_nulls_first': 'Returns a sort expression based on the descending order of the given' +
-                        ' column name, and null values appear before non-null values.',
-    'desc_nulls_last': 'Returns a sort expression based on the descending order of the given' +
-                       ' column name, and null values appear after non-null values',
-}
-
-_collect_list_doc = """
+    return col if isinstance(col, Column) else _invoke_function("lit", col)
+
+
+@since(1.3)
+def col(col):
+    """
+    Returns a :class:`Column` based on the given column name.
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def column(col):
+    """
+    Returns a :class:`Column` based on the given column name.
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def asc(col):
+    """
+    Returns a sort expression based on the ascending order of the given column name.
+    """
+    return _invoke_function("asc", col)

Review comment:
       @zero323, looks like we should call `col.asc() if isinstance(col, Column) else _invoke_function("asc", col)` instead to make it support both strings and columns on the PySpark side (the Scala side does not have the Column signature).
   
   It was decided to support both string/column instances where we can in PySpark at SPARK-26979.
   
   Ideally we should do this in a separate JIRA and PR but I don't mind fixing it here while we're here. I will leave it to you.
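
   The dispatch proposed in the comment can be sketched as follows. This is a minimal, self-contained illustration with stand-in `Column` and `_invoke_function` stubs (not PySpark's real implementations), and it assumes `col.asc()` is called to obtain the sort expression:

   ```python
   # Minimal sketch of the string-or-Column dispatch suggested in the comment.
   # The Column class and _invoke_function below are illustrative stubs only.

   class Column:
       """Stand-in for pyspark.sql.Column."""
       def __init__(self, name):
           self.name = name

       def asc(self):
           # Real PySpark delegates to the JVM column; here we just tag it.
           return "%s ASC" % self.name


   def _invoke_function(name, col):
       # Stand-in for invoking the JVM function by name.
       return "%s(%s)" % (name, col)


   def asc(col):
       # Accept either a column name (str) or a Column instance.
       return col.asc() if isinstance(col, Column) else _invoke_function("asc", col)
   ```

   With these stubs, `asc("age")` routes through `_invoke_function`, while `asc(Column("age"))` uses the instance method, mirroring the behavior proposed for the PySpark side.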

##########
File path: python/pyspark/sql/functions.py
##########
@@ -42,154 +41,457 @@
 # since it requires to make every single overridden definition.
 
 
-def _create_function(name, doc=""):
-    """Create a PySpark function by its name"""
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+def _get_get_jvm_function(name, sc):
+    """
+    Retrieves JVM function identified by name from
+    Java gateway associated with sc.
+    """
+    return getattr(sc._jvm.functions, name)
 
 
-def _create_function_over_column(name, doc=""):
-    """Similar with `_create_function` but creates a PySpark function that takes a column
-    (as string as well). This is mainly for PySpark functions to take strings as
-    column names.
+def _invoke_function(name, *args):
+    """
+    Invokes JVM function identified by name with args
+    and wraps the result with :class:`Column`.
     """
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(_to_java_column(col))
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+    jf = _get_get_jvm_function(name, SparkContext._active_spark_context)
+    return Column(jf(*args))
 
 
-def _wrap_deprecated_function(func, message):
-    """ Wrap the deprecated function to print out deprecation warnings"""
-    def _(col):
-        warnings.warn(message, DeprecationWarning)
-        return func(col)
-    return functools.wraps(func)(_)
+def _invoke_function_over_column(name, col):
+    """
+    Invokes unary JVM function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(name, _to_java_column(col))
 
 
-def _create_binary_mathfunction(name, doc=""):
-    """ Create a binary mathfunction by name"""
-    def _(col1, col2):
-        sc = SparkContext._active_spark_context
+def _invoke_binary_math_function(name, col1, col2):
+    """
+    Invokes binary JVM math function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(
+        name,
         # For legacy reasons, the arguments here can be implicitly converted into floats,
         # if they are not columns or strings.
-        if isinstance(col1, Column):
-            arg1 = col1._jc
-        elif isinstance(col1, str):
-            arg1 = _create_column_from_name(col1)
-        else:
-            arg1 = float(col1)
-
-        if isinstance(col2, Column):
-            arg2 = col2._jc
-        elif isinstance(col2, str):
-            arg2 = _create_column_from_name(col2)
-        else:
-            arg2 = float(col2)
-
-        jc = getattr(sc._jvm.functions, name)(arg1, arg2)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_window_function(name, doc=''):
-    """ Create a window function by name """
-    def _():
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)()
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = 'Window function: ' + doc
-    return _
+        _to_java_column(col1) if isinstance(col1, (str, Column)) else float(col1),
+        _to_java_column(col2) if isinstance(col2, (str, Column)) else float(col2)

Review comment:
       Much more neat!
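
   The coercion being praised here can be shown in isolation. The sketch below reproduces the str/Column-versus-float branching with a stub `_to_java_column`; it illustrates the pattern only and is not PySpark's actual helper:

   ```python
   # Isolated sketch of the legacy argument coercion used by
   # _invoke_binary_math_function: strings and Columns are routed to
   # _to_java_column, everything else is coerced to float.
   # Column and _to_java_column are illustrative stubs.

   class Column:
       pass


   def _to_java_column(col):
       # Stand-in for converting a name/Column to a JVM column reference.
       return "jcol(%r)" % col


   def _coerce_math_arg(arg):
       # Mirrors the conditional expression in the diff above.
       return _to_java_column(arg) if isinstance(arg, (str, Column)) else float(arg)
   ```

   This keeps the legacy behavior where `atan2(2, "y")`-style calls implicitly promote the numeric argument to a float literal.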

##########
File path: python/pyspark/sql/functions.py
##########
@@ -42,154 +41,457 @@
 # since it requires to make every single overridden definition.
 
 
-def _create_function(name, doc=""):
-    """Create a PySpark function by its name"""
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+def _get_get_jvm_function(name, sc):
+    """
+    Retrieves JVM function identified by name from
+    Java gateway associated with sc.
+    """
+    return getattr(sc._jvm.functions, name)
 
 
-def _create_function_over_column(name, doc=""):
-    """Similar with `_create_function` but creates a PySpark function that takes a column
-    (as string as well). This is mainly for PySpark functions to take strings as
-    column names.
+def _invoke_function(name, *args):
+    """
+    Invokes JVM function identified by name with args
+    and wraps the result with :class:`Column`.
     """
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(_to_java_column(col))
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+    jf = _get_get_jvm_function(name, SparkContext._active_spark_context)
+    return Column(jf(*args))
 
 
-def _wrap_deprecated_function(func, message):
-    """ Wrap the deprecated function to print out deprecation warnings"""
-    def _(col):
-        warnings.warn(message, DeprecationWarning)
-        return func(col)
-    return functools.wraps(func)(_)
+def _invoke_function_over_column(name, col):
+    """
+    Invokes unary JVM function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(name, _to_java_column(col))
 
 
-def _create_binary_mathfunction(name, doc=""):
-    """ Create a binary mathfunction by name"""
-    def _(col1, col2):
-        sc = SparkContext._active_spark_context
+def _invoke_binary_math_function(name, col1, col2):
+    """
+    Invokes binary JVM math function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(
+        name,
         # For legacy reasons, the arguments here can be implicitly converted into floats,
         # if they are not columns or strings.
-        if isinstance(col1, Column):
-            arg1 = col1._jc
-        elif isinstance(col1, str):
-            arg1 = _create_column_from_name(col1)
-        else:
-            arg1 = float(col1)
-
-        if isinstance(col2, Column):
-            arg2 = col2._jc
-        elif isinstance(col2, str):
-            arg2 = _create_column_from_name(col2)
-        else:
-            arg2 = float(col2)
-
-        jc = getattr(sc._jvm.functions, name)(arg1, arg2)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_window_function(name, doc=''):
-    """ Create a window function by name """
-    def _():
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)()
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = 'Window function: ' + doc
-    return _
+        _to_java_column(col1) if isinstance(col1, (str, Column)) else float(col1),
+        _to_java_column(col2) if isinstance(col2, (str, Column)) else float(col2)
+    )
 
 
 def _options_to_str(options):
     return {key: to_str(value) for (key, value) in options.items()}
 
-_lit_doc = """
+
+@since(1.3)
+def lit(col):
+    """
     Creates a :class:`Column` of literal value.
 
     >>> df.select(lit(5).alias('height')).withColumn('spark_user', lit(True)).take(1)
     [Row(height=5, spark_user=True)]
     """
-_functions = {
-    'lit': _lit_doc,
-    'col': 'Returns a :class:`Column` based on the given column name.',
-    'column': 'Returns a :class:`Column` based on the given column name.',
-    'asc': 'Returns a sort expression based on the ascending order of the given column name.',
-    'desc': 'Returns a sort expression based on the descending order of the given column name.',
-}
-
-_functions_over_column = {
-    'sqrt': 'Computes the square root of the specified float value.',
-    'abs': 'Computes the absolute value.',
-
-    'max': 'Aggregate function: returns the maximum value of the expression in a group.',
-    'min': 'Aggregate function: returns the minimum value of the expression in a group.',
-    'count': 'Aggregate function: returns the number of items in a group.',
-    'sum': 'Aggregate function: returns the sum of all values in the expression.',
-    'avg': 'Aggregate function: returns the average of the values in a group.',
-    'mean': 'Aggregate function: returns the average of the values in a group.',
-    'sumDistinct': 'Aggregate function: returns the sum of distinct values in the expression.',
-}
-
-_functions_1_4_over_column = {
-    # unary math functions
-    'acos': ':return: inverse cosine of `col`, as if computed by `java.lang.Math.acos()`',
-    'asin': ':return: inverse sine of `col`, as if computed by `java.lang.Math.asin()`',
-    'atan': ':return: inverse tangent of `col`, as if computed by `java.lang.Math.atan()`',
-    'cbrt': 'Computes the cube-root of the given value.',
-    'ceil': 'Computes the ceiling of the given value.',
-    'cos': """:param col: angle in radians
-           :return: cosine of the angle, as if computed by `java.lang.Math.cos()`.""",
-    'cosh': """:param col: hyperbolic angle
-           :return: hyperbolic cosine of the angle, as if computed by `java.lang.Math.cosh()`""",
-    'exp': 'Computes the exponential of the given value.',
-    'expm1': 'Computes the exponential of the given value minus one.',
-    'floor': 'Computes the floor of the given value.',
-    'log': 'Computes the natural logarithm of the given value.',
-    'log10': 'Computes the logarithm of the given value in Base 10.',
-    'log1p': 'Computes the natural logarithm of the given value plus one.',
-    'rint': 'Returns the double value that is closest in value to the argument and' +
-            ' is equal to a mathematical integer.',
-    'signum': 'Computes the signum of the given value.',
-    'sin': """:param col: angle in radians
-           :return: sine of the angle, as if computed by `java.lang.Math.sin()`""",
-    'sinh': """:param col: hyperbolic angle
-           :return: hyperbolic sine of the given value,
-                    as if computed by `java.lang.Math.sinh()`""",
-    'tan': """:param col: angle in radians
-           :return: tangent of the given value, as if computed by `java.lang.Math.tan()`""",
-    'tanh': """:param col: hyperbolic angle
-            :return: hyperbolic tangent of the given value,
-                     as if computed by `java.lang.Math.tanh()`""",
-    'toDegrees': '.. note:: Deprecated in 2.1, use :func:`degrees` instead.',
-    'toRadians': '.. note:: Deprecated in 2.1, use :func:`radians` instead.',
-    'bitwiseNOT': 'Computes bitwise not.',
-}
-
-_functions_2_4 = {
-    'asc_nulls_first': 'Returns a sort expression based on the ascending order of the given' +
-                       ' column name, and null values return before non-null values.',
-    'asc_nulls_last': 'Returns a sort expression based on the ascending order of the given' +
-                      ' column name, and null values appear after non-null values.',
-    'desc_nulls_first': 'Returns a sort expression based on the descending order of the given' +
-                        ' column name, and null values appear before non-null values.',
-    'desc_nulls_last': 'Returns a sort expression based on the descending order of the given' +
-                       ' column name, and null values appear after non-null values',
-}
-
-_collect_list_doc = """
+    return col if isinstance(col, Column) else _invoke_function("lit", col)
+
+
+@since(1.3)
+def col(col):
+    """
+    Returns a :class:`Column` based on the given column name.
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def column(col):
+    """
+    Returns a :class:`Column` based on the given column name.
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def asc(col):
+    """
+    Returns a sort expression based on the ascending order of the given column name.
+    """
+    return _invoke_function("asc", col)
+
+
+@since(1.3)
+def desc(col):
+    """
+    Returns a sort expression based on the descending order of the given column name.
+    """
+    return _invoke_function("desc", col)
+
+
+@since(1.3)
+def sqrt(col):
+    """
+    Computes the square root of the specified float value.
+    """
+    return _invoke_function_over_column("sqrt", col)
+
+
+@since(1.3)
+def abs(col):
+    """
+    Computes the absolute value.
+    """
+    return _invoke_function_over_column("abs", col)
+
+
+@since(1.3)
+def max(col):
+    """
+    Aggregate function: returns the maximum value of the expression in a group.
+    """
+    return _invoke_function_over_column("max", col)
+
+
+@since(1.3)
+def min(col):
+    """
+    Aggregate function: returns the minimum value of the expression in a group.
+    """
+    return _invoke_function_over_column("min", col)
+
+
+@since(1.3)
+def count(col):
+    """
+    Aggregate function: returns the number of items in a group.
+    """
+    return _invoke_function_over_column("count", col)
+
+
+@since(1.3)
+def sum(col):
+    """
+    Aggregate function: returns the sum of all values in the expression.
+    """
+    return _invoke_function_over_column("sum", col)
+
+
+@since(1.3)
+def avg(col):
+    """
+    Aggregate function: returns the average of the values in a group.
+    """
+    return _invoke_function_over_column("avg", col)
+
+
+@since(1.3)
+def mean(col):
+    """
+    Aggregate function: returns the average of the values in a group.
+    """
+    return _invoke_function_over_column("mean", col)
+
+
+@since(1.3)
+def sumDistinct(col):
+    """
+    Aggregate function: returns the sum of distinct values in the expression.
+    """
+    return _invoke_function_over_column("sumDistinct", col)
+
+
+@since(1.4)
+def acos(col):
+    """
+    :return: inverse cosine of `col`, as if computed by `java.lang.Math.acos()`
+    """
+    return _invoke_function_over_column("acos", col)
+
+
+@since(1.4)
+def asin(col):
+    """
+    :return: inverse sine of `col`, as if computed by `java.lang.Math.asin()`
+    """
+    return _invoke_function_over_column("asin", col)
+
+
+@since(1.4)
+def atan(col):
+    """
+    :return: inverse tangent of `col`, as if computed by `java.lang.Math.atan()`
+    """
+    return _invoke_function_over_column("atan", col)
+
+
+@since(1.4)
+def cbrt(col):
+    """
+    Computes the cube-root of the given value.
+    """
+    return _invoke_function_over_column("cbrt", col)
+
+
+@since(1.4)
+def ceil(col):
+    """
+    Computes the ceiling of the given value.
+    """
+    return _invoke_function_over_column("ceil", col)
+
+
+@since(1.4)
+def cos(col):
+    """
+    :param col: angle in radians
+    :return: cosine of the angle, as if computed by `java.lang.Math.cos()`.
+    """
+    return _invoke_function_over_column("cos", col)
+
+
+@since(1.4)
+def cosh(col):
+    """
+    :param col: hyperbolic angle
+    :return: hyperbolic cosine of the angle, as if computed by `java.lang.Math.cosh()`
+    """
+    return _invoke_function_over_column("cosh", col)
+
+
+@since(1.4)
+def exp(col):
+    """
+    Computes the exponential of the given value.
+    """
+    return _invoke_function_over_column("exp", col)
+
+
+@since(1.4)
+def expm1(col):
+    """
+    Computes the exponential of the given value minus one.
+    """
+    return _invoke_function_over_column("expm1", col)
+
+
+@since(1.4)
+def floor(col):
+    """
+    Computes the floor of the given value.
+    """
+    return _invoke_function_over_column("floor", col)
+
+
+@since(1.4)
+def log(col):
+    """
+    Computes the natural logarithm of the given value.
+    """
+    return _invoke_function_over_column("log", col)
+
+
+@since(1.4)
+def log10(col):
+    """
+    Computes the logarithm of the given value in Base 10.
+    """
+    return _invoke_function_over_column("log10", col)
+
+
+@since(1.4)
+def log1p(col):
+    """
+    Computes the natural logarithm of the given value plus one.
+    """
+    return _invoke_function_over_column("log1p", col)
+
+
+@since(1.4)
+def rint(col):
+    """
+    Returns the double value that is closest in value to the argument and
+    is equal to a mathematical integer.
+    """
+    return _invoke_function_over_column("rint", col)
+
+
+@since(1.4)
+def signum(col):
+    """
+    Computes the signum of the given value.
+    """
+    return _invoke_function_over_column("signum", col)
+
+
+@since(1.4)
+def sin(col):
+    """
+    :param col: angle in radians
+    :return: sine of the angle, as if computed by `java.lang.Math.sin()`
+    """
+    return _invoke_function_over_column("sin", col)
+
+
+@since(1.4)
+def sinh(col):
+    """
+    :param col: hyperbolic angle
+    :return: hyperbolic sine of the given value,
+             as if computed by `java.lang.Math.sinh()`
+    """
+    return _invoke_function_over_column("sinh", col)
+
+
+@since(1.4)
+def tan(col):
+    """
+    :param col: angle in radians
+    :return: tangent of the given value, as if computed by `java.lang.Math.tan()`
+    """
+    return _invoke_function_over_column("tan", col)
+
+
+@since(1.4)
+def tanh(col):
+    """
+    :param col: hyperbolic angle
+    :return: hyperbolic tangent of the given value,
+             as if computed by `java.lang.Math.tanh()`
+    """
+    return _invoke_function_over_column("tanh", col)
+
+
+@since(1.4)
+def toDegrees(col):
+    """
+    .. note:: Deprecated in 2.1, use :func:`degrees` instead.
+    """
+    warnings.warn("Deprecated in 2.1, use degrees instead.", DeprecationWarning)
+    return degrees(col)
+
+
+@since(1.4)
+def toRadians(col):
+    """
+    .. note:: Deprecated in 2.1, use :func:`radians` instead.
+    """
+    warnings.warn("Deprecated in 2.1, use radians instead.", DeprecationWarning)
+    return radians(col)
+
+
+@since(1.4)
+def bitwiseNOT(col):
+    """
+    Computes bitwise not.
+    """
+    return _invoke_function_over_column("bitwiseNOT", col)
+
+
+@since(2.4)
+def asc_nulls_first(col):
+    """
+    Returns a sort expression based on the ascending order of the given
+    column name, and null values appear before non-null values.
+    """
+    return _invoke_function("asc_nulls_first", col)

Review comment:
       `asc_nulls_first`, `asc_nulls_last`, `desc_nulls_first` and `desc_nulls_last` look like the same case as above.
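
   For one of the four helpers named above, the same string-or-Column dispatch would look roughly like this (stubs only, assuming a `Column.asc_nulls_first()` method as in the real API):

   ```python
   # Sketch of the string-or-Column dispatch applied to one of the
   # null-ordering helpers. Column and _invoke_function are stubs.

   class Column:
       def __init__(self, name):
           self.name = name

       def asc_nulls_first(self):
           # Stand-in for the JVM-backed sort expression.
           return "%s ASC NULLS FIRST" % self.name


   def _invoke_function(name, col):
       # Stand-in for invoking the JVM function by name.
       return "%s(%s)" % (name, col)


   def asc_nulls_first(col):
       if isinstance(col, Column):
           return col.asc_nulls_first()
       return _invoke_function("asc_nulls_first", col)
   ```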




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716930118


   Merged to master.




[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716383286








[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715646170








[GitHub] [spark] HyukjinKwon commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715647432


   > Add proper NumPy-style docstrings to expanded functions.
   
   Oh, let's not do this in this PR. It would require adding some dependencies to GitHub Actions, fixing the script for `numpydoc`, etc. We could add a dedicated PR for numpydoc later. Actually I am working (and thinking) on making an initial PR to start this numpydoc style.
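
   As a rough illustration of the NumPy-style docstring format being discussed, a converted function might look like the following. This is a hypothetical sketch of the format only, not the docstring that was eventually merged, and the body is a placeholder:

   ```python
   # Hypothetical numpydoc-style docstring, shown for the format only.

   def cos(col):
       """
       Computes the cosine of the given value.

       Parameters
       ----------
       col : Column or str
           Angle in radians.

       Returns
       -------
       Column
           Cosine of the angle, as if computed by ``java.lang.Math.cos()``.
       """
       return col  # placeholder body; the real function invokes the JVM
   ```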




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715876868


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34830/
   




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716402283


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34866/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715879506








[GitHub] [spark] zero323 commented on a change in pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #30143:
URL: https://github.com/apache/spark/pull/30143#discussion_r511749264



##########
File path: python/pyspark/sql/functions.py
##########
@@ -42,154 +42,455 @@
 # since it requires to make every single overridden definition.
 
 
-def _create_function(name, doc=""):
-    """Create a PySpark function by its name"""
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_function_over_column(name, doc=""):
-    """Similar with `_create_function` but creates a PySpark function that takes a column
-    (as string as well). This is mainly for PySpark functions to take strings as
-    column names.
-    """
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(_to_java_column(col))
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _wrap_deprecated_function(func, message):
-    """ Wrap the deprecated function to print out deprecation warnings"""
-    def _(col):
-        warnings.warn(message, DeprecationWarning)
-        return func(col)
-    return functools.wraps(func)(_)
-
-
-def _create_binary_mathfunction(name, doc=""):
-    """ Create a binary mathfunction by name"""
-    def _(col1, col2):
-        sc = SparkContext._active_spark_context
-        # For legacy reasons, the arguments here can be implicitly converted into floats,
-        # if they are not columns or strings.
-        if isinstance(col1, Column):
-            arg1 = col1._jc
-        elif isinstance(col1, str):
-            arg1 = _create_column_from_name(col1)
-        else:
-            arg1 = float(col1)
-
-        if isinstance(col2, Column):
-            arg2 = col2._jc
-        elif isinstance(col2, str):
-            arg2 = _create_column_from_name(col2)
-        else:
-            arg2 = float(col2)
-
-        jc = getattr(sc._jvm.functions, name)(arg1, arg2)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_window_function(name, doc=''):
-    """ Create a window function by name """
-    def _():
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)()
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = 'Window function: ' + doc
-    return _
+def _get_get_jvm_function(name, sc):
+    """
+    Retrieves JVM function identified by name from
+    Java gateway associated with sc.
+    """
+    return getattr(sc._jvm.functions, name)
+
+
+def _invoke_function(name, *args):
+    """
+    Invokes JVM function identified by name with args
+    and wraps the result with :class:`Column`.
+    """
+    jf = _get_get_jvm_function(name, SparkContext._active_spark_context)
+    return Column(jf(*args))
+
+
+def _invoke_function_over_column(name, col):
+    """
+    Invokes unary JVM function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(name, _to_java_column(col))
+
+
+def _invoke_binary_math_function(name, col1, col2):
+    """
+    Invokes binary JVM math function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(
+        name,
+        float(col1) if isinstance(col1, numbers.Real) else _to_java_column(col1),

Review comment:
       Good catch, thank you!
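
   For context, the helpers in the diff above are used by the expanded standalone functions roughly like this (a runnable sketch; the stand-in dispatch below is hypothetical, replacing the real JVM call so the example works without Spark):

   ```python
   # Local stand-in for the JVM dispatch so this sketch runs without Spark;
   # in PySpark, _invoke_function_over_column calls the JVM function of the
   # same name and wraps the result in a Column.
   def _invoke_function_over_column(name, col):
       return f"{name}({col})"

   # An expanded, standalone definition replacing a dictionary entry:
   def sqrt(col):
       """Computes the square root of the specified value."""
       return _invoke_function_over_column("sqrt", col)

   print(sqrt("age"))  # sqrt(age)
   ```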






[GitHub] [spark] SparkQA removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716348937








[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715641175


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34818/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716350906


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130261/
   Test FAILed.




[GitHub] [spark] HyukjinKwon commented on a change in pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #30143:
URL: https://github.com/apache/spark/pull/30143#discussion_r511675622



##########
File path: python/pyspark/sql/functions.py
##########
@@ -42,154 +42,455 @@
 # since it requires to make every single overridden definition.
 
 
-def _create_function(name, doc=""):
-    """Create a PySpark function by its name"""
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_function_over_column(name, doc=""):
-    """Similar with `_create_function` but creates a PySpark function that takes a column
-    (as string as well). This is mainly for PySpark functions to take strings as
-    column names.
-    """
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(_to_java_column(col))
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _wrap_deprecated_function(func, message):
-    """ Wrap the deprecated function to print out deprecation warnings"""
-    def _(col):
-        warnings.warn(message, DeprecationWarning)
-        return func(col)
-    return functools.wraps(func)(_)
-
-
-def _create_binary_mathfunction(name, doc=""):
-    """ Create a binary mathfunction by name"""
-    def _(col1, col2):
-        sc = SparkContext._active_spark_context
-        # For legacy reasons, the arguments here can be implicitly converted into floats,
-        # if they are not columns or strings.
-        if isinstance(col1, Column):
-            arg1 = col1._jc
-        elif isinstance(col1, str):
-            arg1 = _create_column_from_name(col1)
-        else:
-            arg1 = float(col1)
-
-        if isinstance(col2, Column):
-            arg2 = col2._jc
-        elif isinstance(col2, str):
-            arg2 = _create_column_from_name(col2)
-        else:
-            arg2 = float(col2)
-
-        jc = getattr(sc._jvm.functions, name)(arg1, arg2)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_window_function(name, doc=''):
-    """ Create a window function by name """
-    def _():
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)()
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = 'Window function: ' + doc
-    return _
+def _get_get_jvm_function(name, sc):
+    """
+    Retrieves JVM function identified by name from
+    Java gateway associated with sc.
+    """
+    return getattr(sc._jvm.functions, name)
+
+
+def _invoke_function(name, *args):
+    """
+    Invokes JVM function identified by name with args
+    and wraps the result with :class:`Column`.
+    """
+    jf = _get_get_jvm_function(name, SparkContext._active_spark_context)
+    return Column(jf(*args))
+
+
+def _invoke_function_over_column(name, col):
+    """
+    Invokes unary JVM function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(name, _to_java_column(col))
+
+
+def _invoke_binary_math_function(name, col1, col2):
+    """
+    Invokes binary JVM math function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(
+        name,
+        float(col1) if isinstance(col1, numbers.Real) else _to_java_column(col1),

Review comment:
    I think you can keep the comment about this - that the arguments are converted into floats for legacy behavior reasons.
   Also, I think we should keep the previous checks as-is. For example, the new code won't work with decimals:
   
   ```python
   >>> isinstance(decimal.Decimal(1), numbers.Real)
   False
   >>> float(decimal.Decimal(1))
   1.0
   ```
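
   A sketch of a check that keeps the legacy behavior (anything that is not a Column or a string is coerced with `float()`), so `decimal.Decimal` still works. The helper name is hypothetical and the Column branch is omitted so the sketch runs without Spark:

   ```python
   import decimal

   def _coerce_math_arg(value):
       # Mirrors the legacy behavior: strings are treated as column names
       # downstream, and anything else is coerced with float(), which also
       # accepts decimal.Decimal (unlike an isinstance(value, numbers.Real)
       # check, which rejects it).
       if isinstance(value, str):
           return value
       return float(value)

   print(_coerce_math_arg(decimal.Decimal(1)))  # 1.0
   ```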






[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715639068








[GitHub] [spark] SparkQA removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716345197


   **[Test build #130261 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130261/testReport)** for PR 30143 at commit [`9d4fdc4`](https://github.com/apache/spark/commit/9d4fdc4fdf4225d5102f5ab6628128c9544a0c6c).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715875784








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715646170


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715876900


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34831/
   




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716384398


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34863/
   




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716417357


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34866/
   




[GitHub] [spark] zero323 closed pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
zero323 closed pull request #30143:
URL: https://github.com/apache/spark/pull/30143


   




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716368205


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34861/
   




[GitHub] [spark] zero323 commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
zero323 commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716352745


   > Yeah, I think we can just keep docstrings as-is. For grouping stuff, I think it's okay not to change it since it's mainly for code readers.
   
   That's true, though I'd argue that it does matter, given the size of the module (it is large enough that `log`, defined twice, once through a dict and once through a `def`, went unnoticed for years). Not to mention it was clearly the original intention; things just diverged a bit over time.
   
   > There were similar changes proposed (and abandoned) before, e.g.) #27976.
   
   Fair enough. Let's not complicate things here, but I'll allow myself another PR later to see if I can make the case :)
   
   




[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715879227








[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715858168


   **[Test build #130230 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130230/testReport)** for PR 30143 at commit [`738fdde`](https://github.com/apache/spark/commit/738fdde521a7bcd101a76f6880ec94e94b0e6a91).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715879227








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715639068








[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716360173


   **[Test build #130263 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130263/testReport)** for PR 30143 at commit [`2c99821`](https://github.com/apache/spark/commit/2c99821f3c8630ece8c2cc149ca43667a4be8b52).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716350829


   **[Test build #130261 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130261/testReport)** for PR 30143 at commit [`9d4fdc4`](https://github.com/apache/spark/commit/9d4fdc4fdf4225d5102f5ab6628128c9544a0c6c).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716348937


   **[Test build #130263 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130263/testReport)** for PR 30143 at commit [`2c99821`](https://github.com/apache/spark/commit/2c99821f3c8630ece8c2cc149ca43667a4be8b52).




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715646164


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34818/
   




[GitHub] [spark] SparkQA removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715833374


   **[Test build #130230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130230/testReport)** for PR 30143 at commit [`738fdde`](https://github.com/apache/spark/commit/738fdde521a7bcd101a76f6880ec94e94b0e6a91).




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716345197


   **[Test build #130261 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130261/testReport)** for PR 30143 at commit [`9d4fdc4`](https://github.com/apache/spark/commit/9d4fdc4fdf4225d5102f5ab6628128c9544a0c6c).




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715634800


   **[Test build #130218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130218/testReport)** for PR 30143 at commit [`ae98400`](https://github.com/apache/spark/commit/ae98400f660700b13d03b992de2f3bf50fc82502).




[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716395189








[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715638863


   **[Test build #130218 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130218/testReport)** for PR 30143 at commit [`ae98400`](https://github.com/apache/spark/commit/ae98400f660700b13d03b992de2f3bf50fc82502).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716360611








[GitHub] [spark] SparkQA removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715851068


   **[Test build #130231 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130231/testReport)** for PR 30143 at commit [`79fd14e`](https://github.com/apache/spark/commit/79fd14e5b029f1526be8ba32040bfc0a1e676fbc).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715646173


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34818/
   Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715859220








[GitHub] [spark] zero323 commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
zero323 commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716352978


   Jenkins, retest this please.




[GitHub] [spark] SparkQA removed a comment on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715634800


   **[Test build #130218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130218/testReport)** for PR 30143 at commit [`ae98400`](https://github.com/apache/spark/commit/ae98400f660700b13d03b992de2f3bf50fc82502).




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716370033


   **[Test build #130267 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130267/testReport)** for PR 30143 at commit [`2c99821`](https://github.com/apache/spark/commit/2c99821f3c8630ece8c2cc149ca43667a4be8b52).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716383275


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34861/
   




[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716370510








[GitHub] [spark] HyukjinKwon closed pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #30143:
URL: https://github.com/apache/spark/pull/30143


   




[GitHub] [spark] HyukjinKwon commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716245920


   Yeah, I think we can just keep the docstrings as-is. As for the grouping, I think it's okay not to change it, since it's mainly for code readers. Similar changes were proposed (and abandoned) before, e.g. https://github.com/apache/spark/pull/27976.




[GitHub] [spark] zero323 commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
zero323 commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716107740


   > > Add proper NumPy-style docstrings to expanded functions.
   > 
   > Oh, let's don't do this in this PR.
   
   OK. Shall we extend the existing docstrings here at all?
   
   And what about the remaining points? Right now some functions are grouped and some are not, and some are ordered by name while others are not. Since we introduce quite a few `defs` here, I thought it might be a good idea to sort this out.




[GitHub] [spark] zero323 commented on a change in pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #30143:
URL: https://github.com/apache/spark/pull/30143#discussion_r512545029



##########
File path: python/pyspark/sql/functions.py
##########
@@ -42,154 +41,457 @@
 # since it requires to make every single overridden definition.
 
 
-def _create_function(name, doc=""):
-    """Create a PySpark function by its name"""
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+def _get_get_jvm_function(name, sc):
+    """
+    Retrieves JVM function identified by name from
+    Java gateway associated with sc.
+    """
+    return getattr(sc._jvm.functions, name)
 
 
-def _create_function_over_column(name, doc=""):
-    """Similar with `_create_function` but creates a PySpark function that takes a column
-    (as string as well). This is mainly for PySpark functions to take strings as
-    column names.
+def _invoke_function(name, *args):
+    """
+    Invokes JVM function identified by name with args
+    and wraps the result with :class:`Column`.
     """
-    def _(col):
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)(_to_java_column(col))
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
+    jf = _get_get_jvm_function(name, SparkContext._active_spark_context)
+    return Column(jf(*args))
 
 
-def _wrap_deprecated_function(func, message):
-    """ Wrap the deprecated function to print out deprecation warnings"""
-    def _(col):
-        warnings.warn(message, DeprecationWarning)
-        return func(col)
-    return functools.wraps(func)(_)
+def _invoke_function_over_column(name, col):
+    """
+    Invokes unary JVM function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(name, _to_java_column(col))
 
 
-def _create_binary_mathfunction(name, doc=""):
-    """ Create a binary mathfunction by name"""
-    def _(col1, col2):
-        sc = SparkContext._active_spark_context
+def _invoke_binary_math_function(name, col1, col2):
+    """
+    Invokes binary JVM math function identified by name
+    and wraps the result with :class:`Column`.
+    """
+    return _invoke_function(
+        name,
         # For legacy reasons, the arguments here can be implicitly converted into floats,
         # if they are not columns or strings.
-        if isinstance(col1, Column):
-            arg1 = col1._jc
-        elif isinstance(col1, str):
-            arg1 = _create_column_from_name(col1)
-        else:
-            arg1 = float(col1)
-
-        if isinstance(col2, Column):
-            arg2 = col2._jc
-        elif isinstance(col2, str):
-            arg2 = _create_column_from_name(col2)
-        else:
-            arg2 = float(col2)
-
-        jc = getattr(sc._jvm.functions, name)(arg1, arg2)
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = doc
-    return _
-
-
-def _create_window_function(name, doc=''):
-    """ Create a window function by name """
-    def _():
-        sc = SparkContext._active_spark_context
-        jc = getattr(sc._jvm.functions, name)()
-        return Column(jc)
-    _.__name__ = name
-    _.__doc__ = 'Window function: ' + doc
-    return _
+        _to_java_column(col1) if isinstance(col1, (str, Column)) else float(col1),
+        _to_java_column(col2) if isinstance(col2, (str, Column)) else float(col2)
+    )
 
 
 def _options_to_str(options):
     return {key: to_str(value) for (key, value) in options.items()}
 
-_lit_doc = """
+
+@since(1.3)
+def lit(col):
+    """
     Creates a :class:`Column` of literal value.
 
     >>> df.select(lit(5).alias('height')).withColumn('spark_user', lit(True)).take(1)
     [Row(height=5, spark_user=True)]
     """
-_functions = {
-    'lit': _lit_doc,
-    'col': 'Returns a :class:`Column` based on the given column name.',
-    'column': 'Returns a :class:`Column` based on the given column name.',
-    'asc': 'Returns a sort expression based on the ascending order of the given column name.',
-    'desc': 'Returns a sort expression based on the descending order of the given column name.',
-}
-
-_functions_over_column = {
-    'sqrt': 'Computes the square root of the specified float value.',
-    'abs': 'Computes the absolute value.',
-
-    'max': 'Aggregate function: returns the maximum value of the expression in a group.',
-    'min': 'Aggregate function: returns the minimum value of the expression in a group.',
-    'count': 'Aggregate function: returns the number of items in a group.',
-    'sum': 'Aggregate function: returns the sum of all values in the expression.',
-    'avg': 'Aggregate function: returns the average of the values in a group.',
-    'mean': 'Aggregate function: returns the average of the values in a group.',
-    'sumDistinct': 'Aggregate function: returns the sum of distinct values in the expression.',
-}
-
-_functions_1_4_over_column = {
-    # unary math functions
-    'acos': ':return: inverse cosine of `col`, as if computed by `java.lang.Math.acos()`',
-    'asin': ':return: inverse sine of `col`, as if computed by `java.lang.Math.asin()`',
-    'atan': ':return: inverse tangent of `col`, as if computed by `java.lang.Math.atan()`',
-    'cbrt': 'Computes the cube-root of the given value.',
-    'ceil': 'Computes the ceiling of the given value.',
-    'cos': """:param col: angle in radians
-           :return: cosine of the angle, as if computed by `java.lang.Math.cos()`.""",
-    'cosh': """:param col: hyperbolic angle
-           :return: hyperbolic cosine of the angle, as if computed by `java.lang.Math.cosh()`""",
-    'exp': 'Computes the exponential of the given value.',
-    'expm1': 'Computes the exponential of the given value minus one.',
-    'floor': 'Computes the floor of the given value.',
-    'log': 'Computes the natural logarithm of the given value.',
-    'log10': 'Computes the logarithm of the given value in Base 10.',
-    'log1p': 'Computes the natural logarithm of the given value plus one.',
-    'rint': 'Returns the double value that is closest in value to the argument and' +
-            ' is equal to a mathematical integer.',
-    'signum': 'Computes the signum of the given value.',
-    'sin': """:param col: angle in radians
-           :return: sine of the angle, as if computed by `java.lang.Math.sin()`""",
-    'sinh': """:param col: hyperbolic angle
-           :return: hyperbolic sine of the given value,
-                    as if computed by `java.lang.Math.sinh()`""",
-    'tan': """:param col: angle in radians
-           :return: tangent of the given value, as if computed by `java.lang.Math.tan()`""",
-    'tanh': """:param col: hyperbolic angle
-            :return: hyperbolic tangent of the given value,
-                     as if computed by `java.lang.Math.tanh()`""",
-    'toDegrees': '.. note:: Deprecated in 2.1, use :func:`degrees` instead.',
-    'toRadians': '.. note:: Deprecated in 2.1, use :func:`radians` instead.',
-    'bitwiseNOT': 'Computes bitwise not.',
-}
-
-_functions_2_4 = {
-    'asc_nulls_first': 'Returns a sort expression based on the ascending order of the given' +
-                       ' column name, and null values return before non-null values.',
-    'asc_nulls_last': 'Returns a sort expression based on the ascending order of the given' +
-                      ' column name, and null values appear after non-null values.',
-    'desc_nulls_first': 'Returns a sort expression based on the descending order of the given' +
-                        ' column name, and null values appear before non-null values.',
-    'desc_nulls_last': 'Returns a sort expression based on the descending order of the given' +
-                       ' column name, and null values appear after non-null values',
-}
-
-_collect_list_doc = """
+    return col if isinstance(col, Column) else _invoke_function("lit", col)
+
+
+@since(1.3)
+def col(col):
+    """
+    Returns a :class:`Column` based on the given column name.
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def column(col):
+    """
+    Returns a :class:`Column` based on the given column name.
+    """
+    return _invoke_function("col", col)
+
+
+@since(1.3)
+def asc(col):
+    """
+    Returns a sort expression based on the ascending order of the given column name.
+    """
+    return _invoke_function("asc", col)

Review comment:
       Let's handle this in SPARK-33257
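
   To make the before/after in this diff concrete, here is a minimal, self-contained sketch of the pattern it applies: functions that used to be materialized at import time from a name-to-docstring dictionary become explicit standalone `def`s delegating to a shared invoker. The `fake_jvm` dict below is a stand-in for the real `sc._jvm.functions` gateway, and helper names like `_make` are purely illustrative, not part of the PR:

   ```python
   # Stand-in for the JVM-side function registry reached via the Py4J gateway.
   fake_jvm = {
       "sqrt": lambda c: f"sqrt({c})",
       "abs": lambda c: f"abs({c})",
   }

   def _invoke_function(name, *args):
       """Look up a 'JVM' function by name and invoke it with args."""
       return fake_jvm[name](*args)

   # Before: functions generated from a dictionary at import time. No
   # `def sqrt` appears in the module source, so linters, IDEs, and type
   # checkers cannot see these functions.
   _functions = {
       "sqrt": "Computes the square root of the specified value.",
       "abs": "Computes the absolute value.",
   }
   _generated = {}
   for _name in _functions:
       def _make(n):
           def _(col):
               return _invoke_function(n, col)
           _.__name__ = n
           _.__doc__ = _functions[n]
           return _
       _generated[_name] = _make(_name)

   # After: explicit standalone definitions that static tools can see.
   def sqrt(col):
       """Computes the square root of the specified value."""
       return _invoke_function("sqrt", col)

   def abs_(col):  # trailing underscore only to avoid shadowing the builtin in this sketch
       """Computes the absolute value."""
       return _invoke_function("abs", col)

   # Both styles produce the same behavior; only discoverability differs.
   assert _generated["sqrt"]("x") == sqrt("x") == "sqrt(x)"
   ```

   The expanded style is what lets later work (type stubs, NumPy-style docstrings) attach to concrete definitions.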






[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716356529


   **[Test build #130267 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130267/testReport)** for PR 30143 at commit [`2c99821`](https://github.com/apache/spark/commit/2c99821f3c8630ece8c2cc149ca43667a4be8b52).




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715879502


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34830/
   




[GitHub] [spark] HyukjinKwon commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716246037


   The current changes look pretty good to go. Let me know when you think it's ready. I'll take one more look and merge it in.




[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715875705


   **[Test build #130231 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130231/testReport)** for PR 30143 at commit [`79fd14e`](https://github.com/apache/spark/commit/79fd14e5b029f1526be8ba32040bfc0a1e676fbc).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-716350898








[GitHub] [spark] SparkQA commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-715851068


   **[Test build #130231 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130231/testReport)** for PR 30143 at commit [`79fd14e`](https://github.com/apache/spark/commit/79fd14e5b029f1526be8ba32040bfc0a1e676fbc).




[GitHub] [spark] zero323 commented on pull request #30143: [SPARK-32084][PYTHON][SQL] Expand dictionary functions

Posted by GitBox <gi...@apache.org>.
zero323 commented on pull request #30143:
URL: https://github.com/apache/spark/pull/30143#issuecomment-717117214


   > Merged to master.
   
   Thanks!

