Posted to issues@flink.apache.org by "Chesnay Schepler (JIRA)" <ji...@apache.org> on 2017/10/09 12:48:00 UTC

[jira] [Created] (FLINK-7781) Support simple on-demand metrics aggregation

Chesnay Schepler created FLINK-7781:
---------------------------------------

             Summary: Support simple on-demand metrics aggregation
                 Key: FLINK-7781
                 URL: https://issues.apache.org/jira/browse/FLINK-7781
             Project: Flink
          Issue Type: Improvement
          Components: Metrics, REST
    Affects Versions: 1.4.0
            Reporter: Chesnay Schepler
             Fix For: 1.4.0


We should support aggregations (min, max, avg, sum) of metrics in the REST API. This is primarily about aggregating across subtasks, for example the number of incoming records across all subtasks.

This is useful for simple use-cases where a dedicated metrics backend is overkill, and will allow us to provide better metrics in the web UI (since we can expose these aggregates there as well).

I propose to add a new query parameter "agg=[min,max,avg,sum]". To start with, this parameter should only be supported for task metrics. (This is simply the main use-case I have in mind.)

The aggregation should (naturally) only work for numeric metrics.
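
As a rough illustration of the four aggregations and the numeric-only restriction, the following minimal sketch assumes that the per-subtask values arrive as strings and that non-numeric values are simply skipped. The class {{MetricAggregationSketch}}, the enum {{Agg}} and the method {{aggregate}} are hypothetical names chosen for this example and are not part of Flink.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Objects;
import java.util.OptionalDouble;
import java.util.stream.DoubleStream;

// Hypothetical helper, not Flink code: aggregates per-subtask metric values.
final class MetricAggregationSketch {

    // Mirrors the proposed values of the "agg" query parameter.
    enum Agg { MIN, MAX, AVG, SUM }

    // Assumption: values are stored as strings; anything that does not parse
    // as a number is ignored rather than failing the whole request.
    static OptionalDouble aggregate(Collection<String> rawValues, Agg agg) {
        double[] values = rawValues.stream()
                .map(MetricAggregationSketch::parseOrNull)
                .filter(Objects::nonNull)
                .mapToDouble(Double::doubleValue)
                .toArray();
        if (values.length == 0) {
            return OptionalDouble.empty(); // no numeric metric to aggregate
        }
        DoubleStream stream = Arrays.stream(values);
        switch (agg) {
            case MIN: return stream.min();
            case MAX: return stream.max();
            case AVG: return stream.average();
            case SUM: return OptionalDouble.of(stream.sum());
            default: throw new IllegalArgumentException("Unknown aggregation: " + agg);
        }
    }

    // Returns null for null or non-numeric input.
    private static Double parseOrNull(String raw) {
        if (raw == null) {
            return null;
        }
        try {
            return Double.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Returning an empty {{OptionalDouble}} when no numeric value is present leaves it to the REST handler to decide whether to omit the metric or report an error; that choice is an assumption of this sketch.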

We will need a HashSet of the metrics that exist for the subtasks of a given task; this set has to be updated in {{MetricStore#add}}.
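
The bookkeeping could look roughly like the sketch below. It is a simplified stand-in and not the actual {{MetricStore}}: the class {{TaskMetricStoreSketch}}, the signature of {{add}} and the accessor {{aggregatableNames}} are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified per-task store; not Flink's MetricStore.
final class TaskMetricStoreSketch {

    // Keys follow the "<subtask-index>.<name>" layout.
    private final Map<String, String> metrics = new HashMap<>();

    // Names without the subtask index, i.e. the candidates for aggregation.
    private final Set<String> aggregatableNames = new HashSet<>();

    // Assumed signature: the real add method receives a metric dump instead.
    void add(int subtaskIndex, String name, String value) {
        metrics.put(subtaskIndex + "." + name, value);
        // Updated on every add, so the set never needs a full rescan.
        aggregatableNames.add(name);
    }

    Set<String> aggregatableNames() {
        return Collections.unmodifiableSet(aggregatableNames);
    }
}
```

Because the set stores names without the subtask index, adding the same metric for several subtasks yields a single entry.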

All task metrics are either stored as
# {{<subtask-index>.<metric>}} or
# {{<subtask-index>.<operator-name>.<metric>}}.

If a user sends a request {{get=mymetric,agg=sum}}, only the metrics of the first kind are to be considered. Similarly, given a request {{get=myoperator.mymetric,agg=sum}}, only metrics of the second kind are to be considered.
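
One way to satisfy this rule is to strip the leading subtask index from each key and compare the remainder with the requested name exactly. The sketch below assumes the task metrics are available as a string-to-string map; {{SubtaskMetricSelector}} and {{select}} are hypothetical names, not existing Flink classes or methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not Flink code: picks the per-subtask values for one name.
final class SubtaskMetricSelector {

    // Returns the values of all keys shaped like "<subtask-index>.<requested>".
    static List<String> select(Map<String, String> taskMetrics, String requested) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : taskMetrics.entrySet()) {
            String key = entry.getKey();
            int dot = key.indexOf('.');
            if (dot <= 0 || !isSubtaskIndex(key.substring(0, dot))) {
                continue; // key does not start with a subtask index
            }
            // Exact match on the remainder: "mymetric" does not match
            // "myoperator.mymetric", and vice versa.
            if (key.substring(dot + 1).equals(requested)) {
                matches.add(entry.getValue());
            }
        }
        return matches;
    }

    private static boolean isSubtaskIndex(String s) {
        return !s.isEmpty() && s.chars().allMatch(Character::isDigit);
    }
}
```

With keys {{0.mymetric}}, {{1.mymetric}} and {{0.myoperator.mymetric}}, a request for {{mymetric}} selects only the first two values, and a request for {{myoperator.mymetric}} selects only the third.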

Ideally, the name of the aggregated metric (i.e. the original name without subtask index) is also contained in the list of available metrics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)