Posted to dev@sling.apache.org by "Timothee Maret (JIRA)" <ji...@apache.org> on 2019/05/31 19:23:00 UTC
[jira] [Resolved] (SLING-8447) Provide current-retries metric for journaled distribution
[ https://issues.apache.org/jira/browse/SLING-8447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Timothee Maret resolved SLING-8447.
-----------------------------------
Resolution: Fixed
> Provide current-retries metric for journaled distribution
> ---------------------------------------------------------
>
> Key: SLING-8447
> URL: https://issues.apache.org/jira/browse/SLING-8447
> Project: Sling
> Issue Type: New Feature
> Components: Content Distribution
> Affects Versions: Content Distribution Journal Core 0.1.0
> Reporter: Christian Schneider
> Assignee: Timothee Maret
> Priority: Major
> Fix For: Content Distribution Journal Core 0.1.2
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> For operating a Sling system with content distribution, it is important to detect when a publisher is stuck.
> A good indicator is the same package being retried more than a certain number of times.
> Currently there is only an absolute metric of failed packages. Taking the derivative (rate of change) of that total makes it possible to detect a growing number of failed packages. Unfortunately, you cannot distinguish between one package being retried 10 times and 10 packages being retried once each.
> So I propose to create a new current-retries metric as a gauge. This metric reports how often the current package has been retried. It grows while the same package is retried and resets to 0 when the package is successfully applied or when the server is restarted.
> With this metric it is easy to detect a blocked publisher: you simply check whether the metric exceeds a limit.
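
For illustration only, here is a minimal sketch of how such a gauge could behave next to the existing total counter. It is plain Java and does not use the Sling metrics API. The class name RetryMetrics and all method names are hypothetical and are not taken from the actual fix. It also assumes that every failed attempt on the current package counts as one retry.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; not the Content Distribution Journal implementation. */
final class RetryMetrics {

    // Existing style of metric: monotonically growing total of failed attempts.
    private final AtomicLong failedTotal = new AtomicLong();

    // Proposed gauge: failed attempts for the package currently being processed.
    // Held in memory only, so it starts at 0 again after a restart.
    private final AtomicInteger currentRetries = new AtomicInteger();

    /** Call when applying the current package failed and it will be retried. */
    void packageFailed() {
        failedTotal.incrementAndGet();
        currentRetries.incrementAndGet();
    }

    /** Call when the current package was applied successfully. */
    void packageApplied() {
        currentRetries.set(0);
    }

    long failedTotal() {
        return failedTotal.get();
    }

    int currentRetries() {
        return currentRetries.get();
    }

    /** A publisher is considered blocked once the gauge exceeds the limit. */
    boolean isBlocked(int limit) {
        return currentRetries.get() > limit;
    }
}
```

With this sketch, one package failing 10 times and 10 packages failing once each both leave failedTotal at 10, but only the first case leaves currentRetries at 10, so a check such as isBlocked(5) separates the two situations.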
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)