Posted to issues@spark.apache.org by "Takuya Ueshin (JIRA)" <ji...@apache.org> on 2018/08/05 00:00:00 UTC

[jira] [Resolved] (SPARK-23911) High-order function: aggregate(array<T>, initialState S, inputFunction<S, T, S>, outputFunction<S, R>) → R

     [ https://issues.apache.org/jira/browse/SPARK-23911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takuya Ueshin resolved SPARK-23911.
-----------------------------------
       Resolution: Fixed
         Assignee: Takuya Ueshin  (was: Herman van Hovell)
    Fix Version/s: 2.4.0

Issue resolved by pull request 21982
https://github.com/apache/spark/pull/21982

> High-order function: aggregate(array<T>, initialState S, inputFunction<S, T, S>, outputFunction<S, R>) → R
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23911
>                 URL: https://issues.apache.org/jira/browse/SPARK-23911
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Assignee: Takuya Ueshin
>            Priority: Major
>             Fix For: 2.4.0
>
>
> Ref: https://prestodb.io/docs/current/functions/array.html
> Returns a single value reduced from the array. inputFunction is invoked for each element of the array, in order. In addition to the element, inputFunction takes the current state (initially initialState) and returns the new state. outputFunction is then invoked to turn the final state into the result value; it may be the identity function (i -> i).
> {noformat}
> SELECT aggregate(ARRAY [], 0, (s, x) -> s + x, s -> s); -- 0
> SELECT aggregate(ARRAY [5, 20, 50], 0, (s, x) -> s + x, s -> s); -- 75
> SELECT aggregate(ARRAY [5, 20, NULL, 50], 0, (s, x) -> s + x, s -> s); -- NULL
> SELECT aggregate(ARRAY [5, 20, NULL, 50], 0, (s, x) -> s + COALESCE(x, 0), s -> s); -- 75
> SELECT aggregate(ARRAY [5, 20, NULL, 50], 0, (s, x) -> IF(x IS NULL, s, s + x), s -> s); -- 75
> SELECT aggregate(ARRAY [2147483647, 1], CAST (0 AS BIGINT), (s, x) -> s + x, s -> s); -- 2147483648
> SELECT aggregate(ARRAY [5, 6, 10, 20], -- calculates arithmetic average: 10.25
>               CAST(ROW(0.0, 0) AS ROW(sum DOUBLE, count INTEGER)),
>               (s, x) -> CAST(ROW(x + s.sum, s.count + 1) AS ROW(sum DOUBLE, count INTEGER)),
>               s -> IF(s.count = 0, NULL, s.sum / s.count));
> {noformat}
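
For readers who want to trace the semantics described above, the following is a minimal Python sketch of the same left fold. It is not Spark's or Presto's implementation. The helper names aggregate and sql_add are hypothetical, None stands in for SQL NULL, and a plain tuple stands in for the ROW(sum, count) state.

```python
from functools import reduce


def aggregate(array, initial_state, input_function, output_function):
    """Fold `array` left to right, then map the final state to the result.

    input_function(state, element) returns the new state;
    output_function(final_state) returns the result value.
    """
    final_state = reduce(input_function, array, initial_state)
    return output_function(final_state)


def sql_add(a, b):
    """Addition with SQL-style NULL propagation (None models NULL here)."""
    if a is None or b is None:
        return None
    return a + b


def identity(s):
    return s


if __name__ == "__main__":
    # Empty array: the initial state is returned unchanged.
    print(aggregate([], 0, sql_add, identity))                   # 0
    print(aggregate([5, 20, 50], 0, sql_add, identity))          # 75
    # A NULL element poisons the running state, as in the SQL example.
    print(aggregate([5, 20, None, 50], 0, sql_add, identity))    # None
    # Skipping NULL elements keeps the running sum intact.
    print(aggregate([5, 20, None, 50], 0,
                    lambda s, x: s if x is None else s + x,
                    identity))                                   # 75
    # Arithmetic average using a (sum, count) state.
    print(aggregate([5, 6, 10, 20], (0.0, 0),
                    lambda s, x: (x + s[0], s[1] + 1),
                    lambda s: None if s[1] == 0 else s[0] / s[1]))  # 10.25
```

The sketch does not model the BIGINT cast example, because Python integers do not overflow; in SQL the cast of the initial state is what widens the accumulator type.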



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
