Posted to dev@lucene.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2017/01/18 11:46:26 UTC

[jira] [Updated] (SOLR-9857) Collect aggregated metrics from replicas in shard leader

     [ https://issues.apache.org/jira/browse/SOLR-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated SOLR-9857:
------------------------------------
    Attachment: SOLR-9857.patch

Initial version of the reporting and aggregation of replica metrics.

The design reuses the {{SolrMetricReporter}} API: it implements a {{SolrReplicaReporter}}, which is scheduled to report a relevant subset of metrics to the shard leader every N seconds. The serialized metrics data is sent in javabin format.
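
To illustrate the reporting side, here is a minimal sketch of a periodic reporter that selects a subset of metrics and hands it to a sink on a fixed schedule. It is not the code from the patch: the class name {{PeriodicSubsetReporter}}, its constructor arguments and the use of a plain {{Map}} as the metrics source are all assumptions made for the example, and it uses neither the real {{SolrMetricReporter}} API nor javabin serialization.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Hypothetical sketch of a scheduled replica-side reporter; not the patch's SolrReplicaReporter. */
final class PeriodicSubsetReporter implements AutoCloseable {
    private final Supplier<Map<String, Number>> source;   // current metric values on this replica
    private final Predicate<String> filter;               // selects the "relevant subset" by metric name
    private final Consumer<Map<String, Number>> sink;     // stands in for "send to the shard leader"
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    PeriodicSubsetReporter(Supplier<Map<String, Number>> source,
                           Predicate<String> filter,
                           Consumer<Map<String, Number>> sink) {
        this.source = source;
        this.filter = filter;
        this.sink = sink;
    }

    /** Builds one report from the metrics accepted by the filter and passes it to the sink. */
    Map<String, Number> reportOnce() {
        Map<String, Number> subset = source.get().entrySet().stream()
                .filter(e -> filter.test(e.getKey()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
        sink.accept(subset);
        return subset;
    }

    /** Schedules reportOnce() to run every periodSeconds seconds. */
    void start(long periodSeconds) {
        scheduler.scheduleAtFixedRate(this::reportOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In this sketch the sink is just a callback; in a real setup it would serialize the subset and send it to the leader's collector handler.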

There is also a new handler at the {{CoreContainer}} level under {{/admin/metricsCollector}}, which aggregates reports sent from {{SolrReplicaReporter}}-s. It runs at the {{CoreContainer}} level rather than the core level because I hope to reuse it to also aggregate node statistics in SOLR-9858. Partial metrics from replicas are then added to a registry that has the name of the shard with a ".leader" suffix.

I spent some time thinking about how best to aggregate partial metrics. In the general case it's not possible to do this in a meaningful way, and the Metrics API doesn't offer any help here. In the end I implemented {{AggregateMetric}}, which maintains all partial numbers for a selected metric and provides only basic statistics (average, min/max, stddev), and I left it to the user to decide which statistic, if any, is most meaningful.
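
As a rough illustration of this idea (not the {{AggregateMetric}} from the patch), the sketch below keeps the latest partial value per replica and computes the basic statistics on demand. The class name {{PartialValueAggregate}}, the method names, the replica keys and the choice of the sample (n - 1) standard deviation are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a per-replica aggregate; not the class from SOLR-9857.patch. */
final class PartialValueAggregate {
    // Latest value reported by each replica, keyed by replica name.
    private final Map<String, Double> partials = new ConcurrentHashMap<>();

    /** Records (or overwrites) the latest value reported by one replica. */
    void set(String replica, double value) {
        partials.put(replica, value);
    }

    /** Takes a consistent copy so each statistic is computed over one set of values. */
    private double[] snapshot() {
        return partials.values().stream().mapToDouble(Double::doubleValue).toArray();
    }

    double getMin() {
        double[] v = snapshot();
        double min = v.length == 0 ? 0.0 : v[0];
        for (double x : v) min = Math.min(min, x);
        return min;
    }

    double getMax() {
        double[] v = snapshot();
        double max = v.length == 0 ? 0.0 : v[0];
        for (double x : v) max = Math.max(max, x);
        return max;
    }

    double getMean() {
        return mean(snapshot());
    }

    /** Sample standard deviation (n - 1 denominator); 0 when fewer than two values exist. */
    double getStdDev() {
        double[] v = snapshot();
        if (v.length < 2) return 0.0;
        double mean = mean(v);
        double sumSq = 0.0;
        for (double x : v) sumSq += (x - mean) * (x - mean);
        return Math.sqrt(sumSq / (v.length - 1));
    }

    private static double mean(double[] v) {
        if (v.length == 0) return 0.0;
        double sum = 0.0;
        for (double x : v) sum += x;
        return sum / v.length;
    }
}
```

Keeping every partial value, rather than a running total, is what lets the caller pick whichever statistic makes sense for a given metric: the mean may suit a latency, while the max may suit a replication lag.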

These aggregated metrics are kept in a regular {{MetricRegistry}} on the shard leader, so they are also reported by {{/admin/metrics}}.

Comments and suggestions are welcome :)

> Collect aggregated metrics from replicas in shard leader
> --------------------------------------------------------
>
>                 Key: SOLR-9857
>                 URL: https://issues.apache.org/jira/browse/SOLR-9857
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: metrics
>    Affects Versions: master (7.0)
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>            Priority: Minor
>         Attachments: SOLR-9857.patch
>
>
> Shard leaders can collect metrics from replicas in order to learn about their load and the progress of replication. These per-replica metrics need to be aggregated (if possible) in order to report cluster-wide per-shard metrics.


