You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Aljoscha Krettek (Jira)" <ji...@apache.org> on 2020/07/28 07:43:00 UTC

[jira] [Closed] (FLINK-3613) Add standard deviation, mean, variance to list of Aggregations

     [ https://issues.apache.org/jira/browse/FLINK-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aljoscha Krettek closed FLINK-3613.
-----------------------------------
    Resolution: Abandoned

I'm closing this as "Abandoned", since there is no more activity and the code base has moved on quite a bit. Please re-open this if you feel otherwise and work should continue.

> Add standard deviation, mean, variance to list of Aggregations
> --------------------------------------------------------------
>
>                 Key: FLINK-3613
>                 URL: https://issues.apache.org/jira/browse/FLINK-3613
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / DataSet
>            Reporter: Todd Lisonbee
>            Priority: Minor
>         Attachments: DataSet-Aggregation-Design-March2016-v1.txt
>
>
> Implement standard deviation, mean, variance for org.apache.flink.api.java.aggregation.Aggregations
> Ideally implementation should be single pass and numerically stable.
> References:
> "Scalable and Numerically Stable Descriptive Statistics in SystemML", Tian et al, International Conference on Data Engineering 2012
> http://dl.acm.org/citation.cfm?id=2310392
> "The Kahan summation algorithm (also known as compensated summation) reduces the numerical errors that occur when adding a sequence of finite precision floating point numbers. Numerical errors arise due to truncation and rounding. These errors can lead to numerical instability when calculating variance."
> https://en.wikipedia.org/wiki/Kahan_summation_algorithm



--
This message was sent by Atlassian Jira
(v8.3.4#803005)