Posted to issues@ignite.apache.org by "Andrey Kuznetsov (JIRA)" <ji...@apache.org> on 2018/05/03 13:03:00 UTC

[jira] [Assigned] (IGNITE-6587) Ignite watchdog service

     [ https://issues.apache.org/jira/browse/IGNITE-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Kuznetsov reassigned IGNITE-6587:
----------------------------------------

    Assignee: Andrey Kuznetsov  (was: Andrey Gura)

> Ignite watchdog service
> -----------------------
>
>                 Key: IGNITE-6587
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6587
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>    Affects Versions: 2.2
>            Reporter: Alexey Goncharuk
>            Assignee: Andrey Kuznetsov
>            Priority: Major
>              Labels: IEP-5
>             Fix For: 2.6
>
>         Attachments: watchdog.sh
>
>
> As described in [1], each Ignite node has a number of system-critical threads. We should implement a periodic check that calls the failure handler when one of the following conditions is detected:
> # A critical thread is no longer alive.
> # A critical thread 'hangs' for a long time, e.g. while executing a task extracted from the task queue.
> The actual list of system-critical threads can be found at [1].
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-14+Ignite+failures+handling
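
The periodic check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the design adopted in Ignite: the class CriticalThreadWatchdog, the Failure enum, the heartbeat() convention, and the BiConsumer standing in for the failure handler are all hypothetical names introduced here. The sketch assumes that each critical thread is registered after it has been started and calls heartbeat() between tasks, so that a stale heartbeat can be treated as a hang.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

/** Hypothetical watchdog sketch; none of these names come from Ignite. */
final class CriticalThreadWatchdog {
    /** The two conditions listed in the issue description. */
    enum Failure { THREAD_DIED, THREAD_HUNG }

    /** Last heartbeat timestamp (System.nanoTime()) per registered thread. */
    private final Map<Thread, Long> lastHeartbeatNanos = new ConcurrentHashMap<>();
    private final long hangTimeoutNanos;
    /** Stand-in for the real failure handler (an assumption of this sketch). */
    private final BiConsumer<Thread, Failure> failureHandler;

    CriticalThreadWatchdog(long hangTimeoutMillis, BiConsumer<Thread, Failure> failureHandler) {
        this.hangTimeoutNanos = TimeUnit.MILLISECONDS.toNanos(hangTimeoutMillis);
        this.failureHandler = failureHandler;
    }

    /** Registers an already started critical thread. */
    void register(Thread thread) {
        lastHeartbeatNanos.put(thread, System.nanoTime());
    }

    /** Called by a critical thread itself, e.g. before taking the next task. */
    void heartbeat() {
        lastHeartbeatNanos.computeIfPresent(Thread.currentThread(), (t, old) -> System.nanoTime());
    }

    /** Runs one check; the current time is a parameter to keep this testable. */
    void checkOnce(long nowNanos) {
        lastHeartbeatNanos.forEach((thread, last) -> {
            if (!thread.isAlive()) {
                // Condition 1: the thread is gone. Report it once and forget it.
                lastHeartbeatNanos.remove(thread);
                failureHandler.accept(thread, Failure.THREAD_DIED);
            } else if (nowNanos - last > hangTimeoutNanos) {
                // Condition 2: no heartbeat for too long. Reported on every
                // check until the thread sends a heartbeat again.
                failureHandler.accept(thread, Failure.THREAD_HUNG);
            }
        });
    }

    /** Schedules checkOnce at a fixed rate; the caller shuts the scheduler down. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
            () -> checkOnce(System.nanoTime()), periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

A heartbeat-based check is only one possible way to detect a hang; it requires cooperation from the monitored threads, and a long but legitimate task would need either a generous timeout or intermediate heartbeats.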



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)