Posted to issues@spark.apache.org by "Xinrong Meng (Jira)" <ji...@apache.org> on 2022/04/27 17:35:00 UTC
[jira] [Created] (SPARK-39048) Refactor GroupBy._reduce_for_stat_function on accepted data types
Xinrong Meng created SPARK-39048:
------------------------------------
Summary: Refactor GroupBy._reduce_for_stat_function on accepted data types
Key: SPARK-39048
URL: https://issues.apache.org/jira/browse/SPARK-39048
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 3.4.0
Reporter: Xinrong Meng
`GroupBy._reduce_for_stat_function` is a common helper function used by multiple statistical functions of GroupBy objects.
It defines the parameters `only_numeric` and `bool_as_numeric` to control which Spark types are accepted.
To be consistent with the pandas API, we may also have to introduce `str_as_numeric`, for example for `sum`.
Instead of introducing a separate parameter for each Spark type, the proposal is to introduce a single parameter, `accepted_spark_types`, that specifies the types of Spark columns accepted for aggregation.
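
A minimal sketch of the idea, not the actual PySpark implementation. To stay runnable without Spark, it uses stand-in classes named after those in `pyspark.sql.types` with a simplified hierarchy. The `Column` record and the `select_agg_columns` helper are hypothetical names introduced only for illustration. It shows how one tuple of accepted types could replace a growing set of per-type boolean flags:

```python
from dataclasses import dataclass
from typing import List, Tuple, Type


# Stand-ins for pyspark.sql.types classes (assumption: simplified hierarchy,
# defined here only so the sketch runs without a Spark installation).
class DataType: ...
class NumericType(DataType): ...
class LongType(NumericType): ...
class DoubleType(NumericType): ...
class BooleanType(DataType): ...
class StringType(DataType): ...


@dataclass
class Column:
    """Hypothetical record pairing a column name with its Spark type."""
    name: str
    spark_type: DataType


def select_agg_columns(
    columns: List[Column],
    accepted_spark_types: Tuple[Type[DataType], ...],
) -> List[str]:
    """Return the names of columns whose type is one of the accepted types."""
    return [c.name for c in columns if isinstance(c.spark_type, accepted_spark_types)]


columns = [
    Column("a", LongType()),
    Column("b", BooleanType()),
    Column("c", StringType()),
    Column("d", DoubleType()),
]

# Roughly what a numeric-only flag expresses.
numeric_cols = select_agg_columns(columns, (NumericType,))

# Roughly what adding a bool-as-numeric flag expresses.
numeric_bool_cols = select_agg_columns(columns, (NumericType, BooleanType))

# Roughly what a further str-as-numeric flag would express.
numeric_bool_str_cols = select_agg_columns(
    columns, (NumericType, BooleanType, StringType)
)
```

With this shape, each statistical function passes the tuple of types it accepts, so supporting another Spark type does not require a new keyword argument on the helper.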
--
This message was sent by Atlassian Jira
(v8.20.7#820007)