Posted to notifications@skywalking.apache.org by wu...@apache.org on 2018/11/30 14:11:50 UTC

[incubator-skywalking] branch endpoint-and-instance-alarm updated: Fix alarm default settings and document.

This is an automated email from the ASF dual-hosted git repository.

wusheng pushed a commit to branch endpoint-and-instance-alarm
in repository https://gitbox.apache.org/repos/asf/incubator-skywalking.git


The following commit(s) were added to refs/heads/endpoint-and-instance-alarm by this push:
     new f6e7de8  Fix alarm default settings and document.
f6e7de8 is described below

commit f6e7de86c4fd19e4717bee0d9cae96803d4523a9
Author: Wu Sheng <wu...@foxmail.com>
AuthorDate: Fri Nov 30 22:11:32 2018 +0800

    Fix alarm default settings and document.
---
 docs/en/setup/backend/backend-alarm.md                 | 11 ++++++++++-
 .../src/main/assembly/alarm-settings.yml               | 18 +++++++++---------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/docs/en/setup/backend/backend-alarm.md b/docs/en/setup/backend/backend-alarm.md
index c4ff6f5..aa22b9f 100644
--- a/docs/en/setup/backend/backend-alarm.md
+++ b/docs/en/setup/backend/backend-alarm.md
@@ -49,9 +49,18 @@ rules:
     count: 4
 ```
 
+## Default alarm rules
+We provide a default `alarm-settings.yml` in our distribution for convenience only. It includes the following rules:
+1. Service average response time over 1s in the last 3 minutes.
+1. Service success rate lower than 80% in the last 2 minutes.
+1. Service 90% response time is lower than 1000ms in the last 3 minutes.
+1. Service Instance average response time over 1s in the last 2 minutes.
+1. Endpoint average response time over 1s in the last 2 minutes.
+ 
+
 
 ## List of all potential metric name
 The metric names are defined in official [OAL scripts](../../guides/backend-oal-scripts.md), right now 
-only metric from **Service** scope could be used in Alarm, we will extend in further versions. 
+metrics from the **Service**, **Service Instance**, and **Endpoint** scopes can be used in alarms; we will extend this in future versions.
 
 Submit issue or pull request if you want to support any other scope in alarm.
diff --git a/oap-server/server-starter/src/main/assembly/alarm-settings.yml b/oap-server/server-starter/src/main/assembly/alarm-settings.yml
index 5fdaa7b..3cd65a9 100644
--- a/oap-server/server-starter/src/main/assembly/alarm-settings.yml
+++ b/oap-server/server-starter/src/main/assembly/alarm-settings.yml
@@ -36,7 +36,7 @@ rules:
     count: 2
     # How many times of checks, the alarm keeps silence after alarm triggered, default as same as period.
     silence-period: 3
-    message: Successful rate of service {name} is lower than 80% in last 2 minuts.
+    message: Successful rate of service {name} is lower than 80% in last 2 minutes.
   service_p90_sla_rule:
     # Indicator value need to be long, double or int
     indicator-name: service_p90
@@ -50,17 +50,17 @@ rules:
     indicator-name: service_instance_resp_time
     op: ">"
     period: 10
-    count: 3
+    count: 2
     silence-period: 5
-    message: Response time of service instance {name} is more than 1000ms in last 3 minutes.
-  endpoint_sla_rule:
-    indicator-name: endpoint_sla
-    op: "<"
-    threshold: 8000
+    message: Response time of service instance {name} is more than 1000ms in last 2 minutes.
+  endpoint_avg_rule:
+    indicator-name: endpoint_avg
+    op: ">"
+    threshold: 1000
     period: 10
     count: 2
-    silence-period: 3
-    message: Successful rate of endpoint {name} is lower than 80% in last 2 minuts.
+    silence-period: 5
+    message: Response time of endpoint {name} is more than 1000ms in last 2 minutes.
 
 webhooks:
 #  - http://127.0.0.1/notify/
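
For reference, applying the last hunk above should leave the endpoint rule reading roughly as follows. This is a sketch reassembled only from the context and added lines of that hunk, with the parent `rules:` key taken from the hunk header; it is not copied from the repository, and any comments or keys outside the hunk are not shown.

```yaml
rules:
  # Reassembled from the "+" and context lines of the hunk above (assumed complete for this rule).
  endpoint_avg_rule:
    indicator-name: endpoint_avg
    op: ">"
    threshold: 1000
    period: 10
    count: 2
    silence-period: 5
    message: Response time of endpoint {name} is more than 1000ms in last 2 minutes.
```

The change replaces the earlier `endpoint_sla_rule` (an `endpoint_sla` indicator with `op: "<"` and `threshold: 8000`) with an average-response-time rule, which matches the fifth entry in the default rule list added to `backend-alarm.md`.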