Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2018/03/22 01:10:00 UTC

[jira] [Updated] (KUDU-2287) Add replica metric tracking time since there was a valid leader

     [ https://issues.apache.org/jira/browse/KUDU-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated KUDU-2287:
------------------------------
    Component/s: ksck

> Add replica metric tracking time since there was a valid leader
> ---------------------------------------------------------------
>
>                 Key: KUDU-2287
>                 URL: https://issues.apache.org/jira/browse/KUDU-2287
>             Project: Kudu
>          Issue Type: New Feature
>          Components: ksck, metrics, supportability
>    Affects Versions: 1.7.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> Currently, monitoring systems can report that a Kudu cluster is perfectly healthy when in fact some tablet has gotten "stuck" with no leader (e.g., due to a network connectivity problem or a bug). If we exposed a numeric metric on a tablet indicating the time since a replica was healthy, the number of failed election attempts, etc., we could easily monitor for this case.
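
As a rough illustration of the kind of metric described above, here is a minimal Python sketch. It is not Kudu code and does not reflect Kudu's metrics API: the class name, method names, and the 60-second threshold are all hypothetical, chosen only to show how "time since a valid leader" and "failed election attempts" could be tracked per replica and checked by a monitoring system.

```python
import time


class LeaderHealthTracker:
    """Hypothetical per-replica tracker; NOT part of Kudu's metrics API."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the tracker can be tested deterministically.
        self._clock = clock
        # Assumption: treat replica startup as the initial baseline.
        self._last_valid_leader = clock()
        self.failed_elections = 0

    def on_valid_leader(self):
        # Call whenever the replica observes (or becomes) a valid leader.
        self._last_valid_leader = self._clock()
        self.failed_elections = 0

    def on_failed_election(self):
        # Call whenever an election attempt ends without a leader.
        self.failed_elections += 1

    def seconds_since_valid_leader(self):
        # The numeric value a monitoring system would scrape.
        return self._clock() - self._last_valid_leader


def is_stuck(tracker, threshold_seconds=60.0):
    # The threshold is an arbitrary example value, not a Kudu default.
    return tracker.seconds_since_valid_leader() > threshold_seconds
```

A monitoring system could then alert on any replica for which `is_stuck(...)` holds, rather than relying on a cluster-wide health summary that may miss a single leaderless tablet.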



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)