You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2022/10/14 09:52:52 UTC

[GitHub] [skywalking] yangyiweigege opened a new issue, #9788: [Bug] Alarm Rule Warn is not accurate

yangyiweigege opened a new issue, #9788:
URL: https://github.com/apache/skywalking/issues/9788

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Apache SkyWalking Component
   
   OAP server (apache/skywalking)
   
   ### What happened
   
   when i use the Alarm Rule to monitor  metrics,i find the alarm rule is not accurately.
   
   ### What you expected to happen
   
   hope zhe alarm rule is  accurately.
   
   ### How to reproduce
   
   all 
   
   ### Anything else
   
   when i find the alarm rule is not accurately, i read the source code and add logs in org.apache.skywalking.oap.server.core.alarm.provider.RunningRule.Window#isMatch like this
   ![image](https://user-images.githubusercontent.com/23202824/195814238-9ab2fef0-ca27-42fc-9748-51b50d166f82.png)
   
    i find the same bucket metrics in the windows.
   
   so i analysis  the code of org.apache.skywalking.oap.server.core.alarm.provider.RunningRule.Window#add,
   ![image](https://user-images.githubusercontent.com/23202824/195815975-95dddc2b-8ac4-4a93-9a62-10af3d20326e.png)
   
   the endtime in this code  will have seconds data, such as:20221014174301,Assuming that:
   
   endtime  is 20221014174301
   two metrics  bucket  is    [202210141743, 202210141744]
   the  code minutes will  all return   0, and set the tow metrics  in the same index.
   ```
    int minutes = Minutes.minutesBetween(timeBucket, this.endTime).getMinutes();
    this.values.set(values.size() - minutes - 1, metrics);
   ```
   so  it will cause the windows data error.
   
   
   
   
   
   
   
   
   
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] yangyiweigege commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1278992963

    
   
   > 
   
   yes, i add the logs in org.apache.skywalking.oap.server.core.alarm.provider.RunningRule.Window#isMatch, print all metrics buckets in windows, sometimes i can see the log like this:
   ![75F43DB2-52F7-455B-8990-6E6C378D1D68](https://user-images.githubusercontent.com/23202824/195855593-d0de7e77-25ec-4089-898c-ad8fb59d1985.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1278897079

   All time buckets for alerting and moving windows should be in minutes actually. 
   I am not sure why this happens, could you explain more?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1279182456

   OK, sorry for misleading, I just wanted to mention using withMinute  for better readability. You could check whether we have with millisecond.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1279007730

   I would like to recommend you change in another way but with similar logic,
   
   `AlarmCore#start`(L58), set `LocalDateTime checkTime = LocalDateTime.now();`'s second to `0`. Because generally, Line 75(`lastExecuteTime = checkTime.minusSeconds(checkTime.getSecondOfMinute());`) has executed the same logic.
   And we could keep this in the same place.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1278892063

   I am not 100% following, when would the `endTime` be in the second?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] yangyiweigege commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1279033475

   > Are you going to submit a pull request to fix?
   yes。  i will subimit a pull request  tommorrow.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] yangyiweigege commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1279159577

   > yes , this is better for read. but one point should be noticed,we must remove the second and millisSecond both
   > 
   > ```java
   >     final LocalDateTime target =     targetTime.withSecondOfMinute(0).withMillisOfSecond(0);
   >      windows.values().forEach(window -> window.moveTo(target));
   > ```
   > 
   > <img alt="image" width="995" src="https://user-images.githubusercontent.com/23202824/195883672-c458a7cd-9504-4eb0-b09c-09d46e9baaa5.png">
   
   if  only remove seconds,   this code also have bug:
   ```java
   int minutes = Minutes.minutesBetween(endTime, current).getMinutes();
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1279077772

   > I would like to recommend you change in another way but with similar logic,
   > 
   > `AlarmCore#start`(L58), set `LocalDateTime checkTime = LocalDateTime.now();`'s second to `0`. Because generally, Line 75(`lastExecuteTime = checkTime.minusSeconds(checkTime.getSecondOfMinute());`) has executed the same logic. And we could keep this in the same place.
   
   Sorry, this fix is wrong. I think yours is better, but I think using `withSecondOfMinute` is better for read?
   
   ```java
       public void moveTo(LocalDateTime targetTime) {
           final LocalDateTime target = targetTime.withSecondOfMinute(0);
           windows.values().forEach(window -> window.moveTo(target));
       }
   ```
   
   Meanwhile, AlarmCore#start(L75) should use the `withSecondOfMinute(0)` too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] yangyiweigege commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1278968310

   > All time buckets for alerting and moving windows should be in minutes actually. I am not sure why this happens, could you explain more?
   
   endTime is changed in org.apache.skywalking.oap.server.core.alarm.provider.RunningRule.Window#moveTo ,and moveTo  is called by two method,one is org.apache.skywalking.oap.server.core.alarm.provider.RunningRule.Window#add, another is 
   org.apache.skywalking.oap.server.core.alarm.provider.RunningRule#moveTo.
   when  org.apache.skywalking.oap.server.core.alarm.provider.RunningRule#moveTo is called by  schedule task, the endTime will be updated,and the value contains seconds,like  20221014174301, code  in org.apache.skywalking.oap.server.core.alarm.provider.AlarmCore#start:
   <img width="1248" alt="image" src="https://user-images.githubusercontent.com/23202824/195851375-f6c1d0a1-4b59-4593-a9af-12c3a34026e9.png">
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1278976211

   OK, I think I know the point. The timestamp includes the second, but the bucket slot doesn't. 
   Due to our calculation(`int minutes = Minutes.minutesBetween(endTime, current).getMinutes();`) is involved the timestamp(in second), the duration(in minute) could be mis-calculated to get a wrong slot to save the value, such as 
   From`11:40:14` to `11:44:01`, we expected `4` minutes, but due to the second involved, the value should be `3`, then we got the wrong slot.
   
   Could you confirm the case? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng closed issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #9788: [Bug] Alarm Rule Warn is not  accurate
URL: https://github.com/apache/skywalking/issues/9788


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] yangyiweigege commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1278998600

   > I think we just need to set the second to `0` before `moveTo`.
   
   yes i have change 
   
   > I think we just need to set the second to `0` before `moveTo`.
   
   yes ,i have changed the code in org.apache.skywalking.oap.server.core.alarm.provider.RunningRule#moveTo in my local code,like this:
   ```
       public void moveTo(LocalDateTime targetTime) {
           // 去掉秒,去掉毫秒,时间取整到分钟
           LocalDateTime targetTimeMinute = targetTime.minusSeconds(targetTime.getSecondOfMinute()).minusMillis(targetTime.getMillisOfSecond());
           windows.values().forEach(window -> window.moveTo(targetTimeMinute));
       }
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1278994163

   I think we just need to set the second to `0` before `moveTo`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1279028958

   Are you going to submit a pull request to fix?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] yangyiweigege commented on issue #9788: [Bug] Alarm Rule Warn is not accurate

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #9788:
URL: https://github.com/apache/skywalking/issues/9788#issuecomment-1279156166

   
   yes , this is better for read. but one point should be noticed,we must remove the second and millisSecond both
   ```java
       final LocalDateTime target =     targetTime.withSecondOfMinute(0).withMillisOfSecond(0);
        windows.values().forEach(window -> window.moveTo(target));
   ```
   
   
   <img width="995" alt="image" src="https://user-images.githubusercontent.com/23202824/195883672-c458a7cd-9504-4eb0-b09c-09d46e9baaa5.png">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org