Posted to dev@sling.apache.org by "José Correia (Jira)" <ji...@apache.org> on 2022/04/04 13:18:00 UTC
[jira] [Commented] (SLING-11181) Emit metrics that distinguish transient and permanent distribution failures
[ https://issues.apache.org/jira/browse/SLING-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516809#comment-17516809 ]
José Correia commented on SLING-11181:
--------------------------------------
The PR implementing these changes: [PR-105|https://github.com/apache/sling-org-apache-sling-distribution-journal/pull/105]
[~marett] can you take a look whenever possible? 🙏
> Emit metrics that distinguish transient and permanent distribution failures
> ---------------------------------------------------------------------------
>
> Key: SLING-11181
> URL: https://issues.apache.org/jira/browse/SLING-11181
> Project: Sling
> Issue Type: Improvement
> Components: Content Distribution
> Reporter: José Correia
> Priority: Major
>
> h3. Context
> Currently, our error metrics don't distinguish between distribution failures that are permanent (they fail even when retried) and failures that are transient (they succeed after being retried).
> We want to improve this so that the two scenarios can be told apart.
> h3. Solution
> The failure metric should be labeled as one of:
> * {{Transient failure}}
> * {{Permanent failure}}
> h3. Proposed approach
> We can distinguish these two scenarios using the following rationale:
> * A transient failure is recorded whenever a package is distributed successfully, but only after more than one attempt: {{retries > 0}}
>
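The rationale above could be sketched roughly as follows. This is only an illustration, not the code from the PR: {{FailureClassifier}}, {{Outcome}} and {{classify}} are hypothetical names, and treating a package that was never distributed successfully as a permanent failure is an assumption, since the description only spells out the transient case.

```java
/**
 * Hypothetical sketch of the labeling rule from the issue description.
 * None of these names come from sling-org-apache-sling-distribution-journal.
 */
final class FailureClassifier {

    /** Possible labels for a finished package; NONE means no failure metric is emitted. */
    enum Outcome { NONE, TRANSIENT_FAILURE, PERMANENT_FAILURE }

    private FailureClassifier() {
    }

    /**
     * @param distributed whether the package was eventually distributed successfully
     * @param retries     number of retries after the first attempt (0 = first attempt succeeded or was final)
     */
    static Outcome classify(boolean distributed, int retries) {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must not be negative");
        }
        if (distributed) {
            // Rule from the issue: success with retries > 0 is a transient failure.
            return retries > 0 ? Outcome.TRANSIENT_FAILURE : Outcome.NONE;
        }
        // Assumption: a package that never succeeded counts as a permanent failure.
        return Outcome.PERMANENT_FAILURE;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, 0));
        System.out.println(classify(true, 3));
        System.out.println(classify(false, 3));
    }
}
```

Keeping the rule in one pure function would make it easy to unit test independently of whichever metrics API ends up carrying the label.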