Posted to issues@ignite.apache.org by "Andrey Kuznetsov (JIRA)" <ji...@apache.org> on 2018/10/18 16:36:00 UTC
[jira] [Updated] (IGNITE-9679) Document critical workers liveness checking implementation
[ https://issues.apache.org/jira/browse/IGNITE-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrey Kuznetsov updated IGNITE-9679:
-------------------------------------
Description:
Newly implemented critical worker thread liveness checks should be mentioned in the Ignite documentation. A brief description of the functionality follows.
An Ignite node has a number of critical worker threads that must be alive and responsive; otherwise the node's health is not guaranteed. These threads monitor each other periodically and track two aspects of the thread being checked:
- whether it's alive;
- whether it updates its internal heartbeat timestamp.
Whenever at least one of the above conditions is violated, the checker thread logs an error and calls the currently configured {{FailureHandler}}.
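The two checks above can be sketched in plain Java. This is only an illustration of the idea, not Ignite's implementation: {{MonitoredWorker}}, {{LivenessChecker}} and {{WorkerState}} are hypothetical names, and the millisecond threshold is an assumed stand-in for whatever timeout the node is configured with.

```java
import java.util.function.BooleanSupplier;

/** Outcome of a single liveness check (hypothetical names, not Ignite API). */
enum WorkerState { HEALTHY, TERMINATED, BLOCKED }

/** Hypothetical stand-in for a critical worker. */
final class MonitoredWorker {
    private final String name;
    private final BooleanSupplier alive;   // e.g. thread::isAlive
    private volatile long heartbeatTs;     // last heartbeat, in milliseconds

    MonitoredWorker(String name, BooleanSupplier alive, long nowMs) {
        this.name = name;
        this.alive = alive;
        this.heartbeatTs = nowMs;
    }

    /** Called by the worker itself whenever it makes progress. */
    void updateHeartbeat(long nowMs) { heartbeatTs = nowMs; }

    String name() { return name; }
    boolean isAlive() { return alive.getAsBoolean(); }
    long heartbeatTs() { return heartbeatTs; }
}

/** Checks one worker against an assumed blocked-timeout threshold. */
final class LivenessChecker {
    private final long blockedTimeoutMs;

    LivenessChecker(long blockedTimeoutMs) { this.blockedTimeoutMs = blockedTimeoutMs; }

    WorkerState check(MonitoredWorker w, long nowMs) {
        // Check 1: the worker must still be alive.
        if (!w.isAlive())
            return WorkerState.TERMINATED;
        // Check 2: the heartbeat must have been updated recently enough.
        if (nowMs - w.heartbeatTs() > blockedTimeoutMs)
            return WorkerState.BLOCKED;
        return WorkerState.HEALTHY;
    }
}
```

In this sketch a checker thread would call {{check}} periodically for each peer, passing the current time, and report any result other than {{HEALTHY}}. Time is passed in explicitly so the logic can be exercised without real threads or sleeps.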
The {{IgniteConfiguration.SystemWorkerBlockedTimeout}} configuration property affects monitoring behavior. At runtime, monitoring settings can be changed via {{FailureHandlingMxBean}}.
By default, liveness checks are enabled, but detection of a blocked system worker does not lead to failure handler invocation; see {{FailureProcessor#getDefaultFailureHandler}}.
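One way to picture the default described above is a handler wrapper that still records every detected problem but acts only on failure kinds that are not ignored. The sketch below is an assumption made for illustration: {{FailureKind}} and {{FilteringHandler}} are hypothetical names, and this message does not state how {{FailureProcessor#getDefaultFailureHandler}} actually achieves the behavior.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical failure kinds, not Ignite's own enum. */
enum FailureKind { WORKER_TERMINATED, WORKER_BLOCKED }

/** Hypothetical handler that logs all failures but acts only on non-ignored kinds. */
final class FilteringHandler {
    private final Set<FailureKind> ignored;
    private int invocations;

    FilteringHandler(Set<FailureKind> ignored) {
        // Copy defensively; noneOf + addAll also works for an empty input set.
        EnumSet<FailureKind> copy = EnumSet.noneOf(FailureKind.class);
        copy.addAll(ignored);
        this.ignored = copy;
    }

    /** Assumed default: blocked-worker detection is logged but not acted upon. */
    static FilteringHandler withDefaults() {
        return new FilteringHandler(EnumSet.of(FailureKind.WORKER_BLOCKED));
    }

    /** Logs the failure; returns true only if the handler action was taken. */
    boolean onFailure(FailureKind kind, String workerName) {
        System.err.println("Critical worker problem: " + kind + " [worker=" + workerName + "]");
        if (ignored.contains(kind))
            return false;
        invocations++; // stands in for the real action, e.g. stopping the node
        return true;
    }

    int invocations() { return invocations; }
}
```

With {{withDefaults()}}, a blocked worker is only logged, while a terminated worker triggers the handler action. Passing an empty set to the constructor makes both kinds actionable.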
was:
Newly implemented critical worker thread liveness checks should be mentioned in the Ignite documentation. A brief description of the functionality follows.
An Ignite node has a number of critical worker threads that must be alive and responsive; otherwise the node's health is not guaranteed. These threads monitor each other periodically and track two aspects of the thread being checked:
- whether it's alive;
- whether it updates its internal heartbeat timestamp.
Both checks use the {{IgniteConfiguration.failureDetectionTimeout}} property as a threshold value.
Whenever at least one of the above conditions is violated, the checker thread logs an error and calls the currently configured {{FailureHandler}}.
Liveness checks are enabled by default, but can be disabled through the {{WorkersControlMXBean.healthMonitoringEnabled}} property.
> Document critical workers liveness checking implementation
> ----------------------------------------------------------
>
> Key: IGNITE-9679
> URL: https://issues.apache.org/jira/browse/IGNITE-9679
> Project: Ignite
> Issue Type: Task
> Components: documentation
> Reporter: Andrey Kuznetsov
> Priority: Major
> Fix For: 2.7
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)