Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2018/08/04 07:11:00 UTC

[jira] [Updated] (SPARK-23911) High-order function: aggregate(array<T>, initialState S, inputFunction<S, T, S>, outputFunction<S, R>) → R

     [ https://issues.apache.org/jira/browse/SPARK-23911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-23911:
----------------------------
    Summary: High-order function: aggregate(array<T>, initialState S, inputFunction<S, T, S>, outputFunction<S, R>) → R  (was: High-order function: reduce(array<T>, initialState S, inputFunction<S, T, S>, outputFunction<S, R>) → R)

> High-order function: aggregate(array<T>, initialState S, inputFunction<S, T, S>, outputFunction<S, R>) → R
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23911
>                 URL: https://issues.apache.org/jira/browse/SPARK-23911
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Assignee: Herman van Hovell
>            Priority: Major
>
> Ref: https://prestodb.io/docs/current/functions/array.html
> Returns a single value reduced from array. inputFunction is invoked once for each element of array, in order. Besides the element, inputFunction takes the current state (initially initialState) and returns the new state. outputFunction is then invoked once to turn the final state into the result value; it may be the identity function (i -> i). The examples below are written with the Presto function name, reduce.
> {noformat}
> SELECT reduce(ARRAY [], 0, (s, x) -> s + x, s -> s); -- 0
> SELECT reduce(ARRAY [5, 20, 50], 0, (s, x) -> s + x, s -> s); -- 75
> SELECT reduce(ARRAY [5, 20, NULL, 50], 0, (s, x) -> s + x, s -> s); -- NULL
> SELECT reduce(ARRAY [5, 20, NULL, 50], 0, (s, x) -> s + COALESCE(x, 0), s -> s); -- 75
> SELECT reduce(ARRAY [5, 20, NULL, 50], 0, (s, x) -> IF(x IS NULL, s, s + x), s -> s); -- 75
> SELECT reduce(ARRAY [2147483647, 1], CAST (0 AS BIGINT), (s, x) -> s + x, s -> s); -- 2147483648
> SELECT reduce(ARRAY [5, 6, 10, 20], -- calculates arithmetic average: 10.25
>               CAST(ROW(0.0, 0) AS ROW(sum DOUBLE, count INTEGER)),
>               (s, x) -> CAST(ROW(x + s.sum, s.count + 1) AS ROW(sum DOUBLE, count INTEGER)),
>               s -> IF(s.count = 0, NULL, s.sum / s.count));
> {noformat}
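
The semantics described above amount to a left fold followed by a finishing step. The following is a minimal Python sketch of that behavior, not the Spark or Presto implementation. The names aggregate and null_safe_add are hypothetical helpers introduced here for illustration, and Python's None is assumed to stand in for SQL NULL.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")  # element type of the array
S = TypeVar("S")  # type of the running state
R = TypeVar("R")  # type of the final result


def aggregate(
    array: Sequence[T],
    initial_state: S,
    input_function: Callable[[S, T], S],
    output_function: Callable[[S], R],
) -> R:
    """Fold `array` left to right, then convert the final state to the result."""
    state = initial_state
    for element in array:
        # input_function receives the current state and the next element.
        state = input_function(state, element)
    # output_function runs exactly once, on the final state.
    return output_function(state)


def null_safe_add(s: Optional[int], x: Optional[int]) -> Optional[int]:
    """Hypothetical helper: SQL-style `+`, where NULL (None) propagates."""
    if s is None or x is None:
        return None
    return s + x


def identity(s):
    return s


# Plain sum, mirroring the first two SQL examples.
empty_sum = aggregate([], 0, null_safe_add, identity)
plain_sum = aggregate([5, 20, 50], 0, null_safe_add, identity)

# A NULL element poisons the sum unless the input function handles it.
null_sum = aggregate([5, 20, None, 50], 0, null_safe_add, identity)
coalesced_sum = aggregate(
    [5, 20, None, 50], 0, lambda s, x: s + (0 if x is None else x), identity
)

# Arithmetic average: the state is a (sum, count) pair, and the
# output function divides once at the end.
average = aggregate(
    [5, 6, 10, 20],
    (0.0, 0),
    lambda s, x: (s[0] + x, s[1] + 1),
    lambda s: None if s[1] == 0 else s[0] / s[1],
)
```

Keeping the state type S separate from the result type R is what allows the average example to carry a (sum, count) pair through the fold and divide only once, in the output function.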


