Posted to dev@ambari.apache.org by "Andrew Onischuk (JIRA)" <ji...@apache.org> on 2015/10/15 16:40:05 UTC
[jira] [Created] (AMBARI-13434) Expose Alert Grace Period Setting in Agents
Andrew Onischuk created AMBARI-13434:
----------------------------------------
Summary: Expose Alert Grace Period Setting in Agents
Key: AMBARI-13434
URL: https://issues.apache.org/jira/browse/AMBARI-13434
Project: Ambari
Issue Type: Bug
Reporter: Andrew Onischuk
Assignee: Andrew Onischuk
Fix For: 2.1.3
On some deployments, a host may need to run many alerts, depending on the
number of components installed. If that number is large, alert jobs may miss
their scheduled run times. The default grace period set by APS (APScheduler)
is 1 second, which is rather aggressive. The agent log then fills with
warnings such as:
WARNING 2015-07-29 20:59:50,733 scheduler.py:496 - Run time of job "947770c6-424a-4ef8-9a46-19eca8fd080b (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.309353)" was missed by 0:00:01.423766
WARNING 2015-07-29 20:59:50,734 scheduler.py:496 - Run time of job "005b1d50-2aca-4af2-a3b4-bc39e6f65ede (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.309646)" was missed by 0:00:01.424313
WARNING 2015-07-29 20:59:50,734 scheduler.py:496 - Run time of job "6950ff19-c26c-46b7-8bac-1869773f1380 (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.309840)" was missed by 0:00:01.424364
WARNING 2015-07-29 20:59:50,735 scheduler.py:496 - Run time of job "d986b9eb-bfd4-400f-b107-5640495eeece (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.310025)" was missed by 0:00:01.425144
WARNING 2015-07-29 20:59:50,735 scheduler.py:496 - Run time of job "3589154e-a8e3-441d-b3cb-a93fd49e1dfe (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.310204)" was missed by 0:00:01.425600
WARNING 2015-07-29 20:59:50,736 scheduler.py:496 - Run time of job "04a7f393-800b-4728-95be-28c2ca091ade (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.310380)" was missed by 0:00:01.425769
WARNING 2015-07-29 20:59:50,737 scheduler.py:496 - Run time of job "f0e2a065-af36-476c-b6b9-b662471c3f22 (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.310759)" was missed by 0:00:01.426607
WARNING 2015-07-29 20:59:50,738 scheduler.py:496 - Run time of job "76accffd-e390-4aaa-8b35-8219ef4b3057 (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.311118)" was missed by 0:00:01.427039
WARNING 2015-07-29 20:59:50,738 scheduler.py:496 - Run time of job "e0ce4088-2f0c-4f6d-8642-26ba94b3c66a (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.311297)" was missed by 0:00:01.426953
WARNING 2015-07-29 20:59:50,739 scheduler.py:496 - Run time of job "9cb39eb2-8ce4-408e-8030-a36362d5b5af (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.311501)" was missed by 0:00:01.427677
WARNING 2015-07-29 20:59:50,740 scheduler.py:496 - Run time of job "c299b3ab-ced6-4423-8f39-e16427157d98 (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.312033)" was missed by 0:00:01.427972
WARNING 2015-07-29 20:59:50,740 scheduler.py:496 - Run time of job "cd444594-7859-482d-ae04-348ee7653da2 (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.312208)" was missed by 0:00:01.428285
WARNING 2015-07-29 20:59:50,741 scheduler.py:496 - Run time of job "9afd8b3e-8850-4f2d-9ce7-a130be6b933b (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.312385)" was missed by 0:00:01.428689
WARNING 2015-07-29 20:59:50,741 scheduler.py:496 - Run time of job "be140827-a21f-4782-a109-bde8bcbc35c2 (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.312574)" was missed by 0:00:01.429298
WARNING 2015-07-29 20:59:50,742 scheduler.py:496 - Run time of job "e009b685-717f-4552-8dfb-35a4d9d3d658 (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.312751)" was missed by 0:00:01.429906
WARNING 2015-07-29 20:59:50,743 scheduler.py:496 - Run time of job "f42e635f-ce2d-47b6-8da3-10c7bfef7c3c (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.312927)" was missed by 0:00:01.430541
WARNING 2015-07-29 20:59:50,744 scheduler.py:496 - Run time of job "ace91b40-28e2-472a-ac97-8b01dc3bd976 (trigger: interval[0:01:00], next run at: 2015-07-29 20:59:49.313280)" was missed by 0:00:01.430793
WARNING 2015-07-29 20:59:50,744 scheduler.py:496 - Run time of job "77ea324a-a836-4f32-a751-1a596417bc11 (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.313461)" was missed by 0:00:01.431357
WARNING 2015-07-29 20:59:50,745 scheduler.py:496 - Run time of job "e74f63b0-4143-4ebb-9adc-8e124eae1f99 (trigger: interval[0:02:00], next run at: 2015-07-29 20:59:49.313642)" was missed by 0:00:01.431588
WARNING 2015-07-29 20:59:50,746 scheduler.py:496 - Run time of job "3640c1eb-e7a2-4783-9480-e7f2129a4093 (trigger: interval[0:02:00], next run at: 2015-07-29 21:01:49.313817)" was missed by 0:00:01.432356
WARNING 2015-07-29 20:59:50,746 scheduler.py:496 - Run time of job "5b1fb2e8-8488-429b-9310-ca882b775c25 (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.314182)" was missed by 0:00:01.432292
WARNING 2015-07-29 20:59:50,746 scheduler.py:496 - Run time of job "509bb649-e065-492a-a258-9a8e48e5d79c (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.314359)" was missed by 0:00:01.432485
WARNING 2015-07-29 20:59:50,747 scheduler.py:496 - Run time of job "211e7885-368e-415d-8875-a5abb66071c3 (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.314546)" was missed by 0:00:01.432553
WARNING 2015-07-29 20:59:50,747 scheduler.py:496 - Run time of job "239e8d13-1f31-4b2d-ac6f-b66294700814 (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.314722)" was missed by 0:00:01.432682
WARNING 2015-07-29 20:59:50,747 scheduler.py:496 - Run time of job "bc300bfc-7f4f-4015-84a6-4bfe761f4167 (trigger: interval[0:02:00], next run at: 2015-07-29 21:01:49.314897)" was missed by 0:00:01.432882
WARNING 2015-07-29 20:59:50,748 scheduler.py:496 - Run time of job "0e800a78-48fa-4738-8bab-dc0b57ecc6fa (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.315072)" was missed by 0:00:01.433000
WARNING 2015-07-29 20:59:50,748 scheduler.py:496 - Run time of job "19190cfd-d9b4-4869-81ec-0bdce227540e (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.315246)" was missed by 0:00:01.433040
WARNING 2015-07-29 20:59:50,748 scheduler.py:496 - Run time of job "7f102c1d-3e4e-4b46-b89d-f6df4c231591 (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.315782)" was missed by 0:00:01.432642
WARNING 2015-07-29 20:59:50,749 scheduler.py:496 - Run time of job "8ef15a08-698b-429f-8925-4d6e5c49c01d (trigger: interval[0:01:00], next run at: 2015-07-29 21:00:49.315959)" was missed by 0:00:01.433006
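To make the arithmetic behind these warnings concrete, here is a minimal sketch of the misfire check. It assumes the scheduler treats a run as missed when the delay exceeds `misfire_grace_time` seconds; the helper name `is_misfire` is hypothetical and is not part of APScheduler or the agent. The timestamps are taken from the first warning above.

```python
from datetime import datetime, timedelta


def is_misfire(scheduled_run, now, misfire_grace_time):
    """Return True when the delay exceeds the grace period (in seconds).

    Hypothetical helper for illustration only; it mirrors the comparison
    the scheduler is assumed to make before logging "was missed by".
    """
    return (now - scheduled_run) > timedelta(seconds=misfire_grace_time)


# Scheduled run time and delay from the first warning in the log.
scheduled_run = datetime(2015, 7, 29, 20, 59, 49, 309353)
now = scheduled_run + timedelta(seconds=1, microseconds=423766)

missed_with_1s = is_misfire(scheduled_run, now, 1)  # current default
missed_with_5s = is_misfire(scheduled_run, now, 5)  # proposed default
```

Under that assumption, a delay of about 1.42 seconds is a misfire with a 1-second grace period but not with a 5-second one.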
The setting can be exposed in
[AlertSchedulerHandler.py](https://github.com/apache/ambari/blob/trunk/ambari-agent/src/main/python/ambari_agent/AlertSchedulerHandler.py#L46)
by adding `misfire_grace_time`:
    APS_CONFIG = {
      'threadpool.core_threads': 3,
      'coalesce': True,
      'standalone': False,
      'misfire_grace_time': 5
    }
* Expose the ability to set this grace period via the agent's configuration file
* Increase the default amount from 1 second to 5 seconds
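
One possible shape for the two items above, as a rough sketch rather than the actual patch: read the grace period from the agent's configuration and fall back to 5 seconds when it is not set. The section name `agent`, the option name `alert_grace_period`, and the function `build_aps_config` are assumptions made for illustration; the real names would be decided in the fix. The sketch uses Python 3's `configparser` (the module is named `ConfigParser` in Python 2).

```python
import configparser

# Proposed new default, in seconds (up from APScheduler's 1 second).
DEFAULT_GRACE_PERIOD = 5


def build_aps_config(config, section="agent", option="alert_grace_period"):
    """Build the scheduler config, taking the grace period from `config`.

    `section` and `option` are hypothetical names, not existing agent
    settings. Falls back to DEFAULT_GRACE_PERIOD when the option is absent.
    """
    grace_period = DEFAULT_GRACE_PERIOD
    if config.has_option(section, option):
        grace_period = config.getint(section, option)
    return {
        'threadpool.core_threads': 3,
        'coalesce': True,
        'standalone': False,
        'misfire_grace_time': grace_period,
    }


# Usage: an explicit value overrides the default.
config = configparser.ConfigParser()
config.read_string("[agent]\nalert_grace_period = 10\n")
aps_config = build_aps_config(config)

# Usage: an empty configuration yields the 5-second default.
default_config = build_aps_config(configparser.ConfigParser())
```

Keeping the default in one constant means hosts without the new option still get the longer grace period.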