Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2022/11/29 16:31:17 UTC

[GitHub] [skywalking] wu-sheng opened a new issue, #10051: [Feature] [OAP] Consider removing initial load in persistent at the minute dimensionality

wu-sheng opened a new issue, #10051:
URL: https://github.com/apache/skywalking/issues/10051

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Description
   
   In 9.3.0, we enhanced the cache mechanism in #10021 and verified it through #10025. About 60% of the `ID` reads issued by the persistent worker at the minute dimensionality are now served from the cache. Only the first bulk of every minute requires an `ID` read.
   
   When we look deeper into the logic, we find that, **at the minute dimensionality, this `ID` read is not actually required**.
   The read exists because a metric missing from the cache does not mean it is missing from the database, especially when
   1. The clocks are not synced across the cluster, so telemetry does not arrive in time-series order.
   2. OAP is booting/rebooting, and the cache is cold.
   
   About <1>, we don't expect this anymore. Our TTL and metric/topology analysis all rely on timestamps being generally synced. They don't have to be synced at the `ms` level, but the gap should be no more than 3-5s.
   
   About <2>, given the time-sync assumption in <1>, we should only need to load metrics from the DB during the first minute after rebooting. Only then could new metrics overlap existing metrics generated in a previous booting period.
   There is little chance of data conflicts. Even if they happen, only the metrics of the booting minute would be inaccurate. As a best practice, we could keep loading metrics from the database when the metric timestamps are before the **OAP started timestamp** as a fail-safe.
   
   Regarding the hour and day dimensionalities, nothing is different. We should just keep `loading metrics from database` when the hour/day time bucket is before the **OAP started hour / day**.
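
The rule described above can be sketched in Java as follows. This is a minimal illustration, not code from the SkyWalking repository: the class `ReloadPolicy`, its method `shouldLoadFromDatabase`, and the `Dimensionality` enum are hypothetical names. It assumes time buckets are `long` values shaped like `yyyyMMddHHmm`, `yyyyMMddHH`, and `yyyyMMdd`, and that they are computed in UTC. It also treats the boot bucket itself as "load from database" (`<=` rather than strictly before), a conservative choice made here because that bucket may already hold data written before the restart.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for illustration; not a class in the SkyWalking code base. */
final class ReloadPolicy {
    enum Dimensionality { MINUTE, HOUR, DAY }

    // Time buckets that contain the moment this OAP instance started.
    private final long startMinute;
    private final long startHour;
    private final long startDay;

    ReloadPolicy(Instant oapStartedAt) {
        // Assumption: buckets are numeric timestamps truncated to the dimensionality.
        startMinute = toBucket(oapStartedAt, "yyyyMMddHHmm");
        startHour = toBucket(oapStartedAt, "yyyyMMddHH");
        startDay = toBucket(oapStartedAt, "yyyyMMdd");
    }

    private static long toBucket(Instant instant, String pattern) {
        // Assumption: UTC is used here only to keep the sketch deterministic.
        return Long.parseLong(
            DateTimeFormatter.ofPattern(pattern).withZone(ZoneOffset.UTC).format(instant));
    }

    /**
     * Returns true when a cache miss should still be checked against the database,
     * i.e. the bucket is not later than the bucket in which OAP started.
     * Buckets after the boot bucket are assumed to be fully covered by the cache.
     */
    boolean shouldLoadFromDatabase(Dimensionality dimensionality, long timeBucket) {
        switch (dimensionality) {
            case MINUTE:
                return timeBucket <= startMinute;
            case HOUR:
                return timeBucket <= startHour;
            case DAY:
                return timeBucket <= startDay;
            default:
                throw new IllegalArgumentException("Unknown dimensionality: " + dimensionality);
        }
    }
}
```

With an instance created at OAP start, a persistent worker could call `shouldLoadFromDatabase` on a cache miss and skip the `ID` read whenever it returns `false`. For the hour and day dimensionalities the read would therefore stay enabled until the boot hour or boot day has passed.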
   
   @hanahmily @kezhenxu94 @wankai123 PTAL. Considering 9.3.0 is releasing soon, I don't want to take the risk of changing this for now.
   But the theory should be correct, so please help recheck it.
   
   ### Use case
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wankai123 commented on issue #10051: [Feature] [OAP] Consider removing metrics reload in persistent

Posted by GitBox <gi...@apache.org>.
wankai123 commented on issue #10051:
URL: https://github.com/apache/skywalking/issues/10051#issuecomment-1332104268

   This is a great idea to reduce the load on the storage side! I think we can start with the minute dimensionality, since it's the most frequent. For the hour/day dimensionality, it's difficult to estimate the expiration time of the cache.




[GitHub] [skywalking] wu-sheng commented on issue #10051: [Feature] [OAP] Consider removing metrics reload in persistent

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #10051:
URL: https://github.com/apache/skywalking/issues/10051#issuecomment-1339034602

   I hope #10111 provides the final fix for metrics expiring unexpectedly.
   
   Some metrics were still reloaded from the database (before #10111):
   <img width="425" alt="image" src="https://user-images.githubusercontent.com/5441976/205874135-df4788cc-f51e-4b63-bc2c-d53c4fa5bfee.png">
   




[GitHub] [skywalking] hanahmily commented on issue #10051: [Feature] [OAP] Consider removing metrics reload in persistent

Posted by GitBox <gi...@apache.org>.
hanahmily commented on issue #10051:
URL: https://github.com/apache/skywalking/issues/10051#issuecomment-1331530846

   I'm delighted to see this will happen. BanyanDB is based on the idea that we read less than we write. Reducing read operations should improve interactive performance.
   
   The timeline also suits me. Based on my recent work on the showcase env, the BanyanDB storage plugin of OAP needs more enhancements. We could make it happen in the next iteration.




[GitHub] [skywalking] wu-sheng closed issue #10051: [Feature] [OAP] Consider removing metrics reload in persistent

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #10051: [Feature] [OAP] Consider removing metrics reload in persistent
URL: https://github.com/apache/skywalking/issues/10051




[GitHub] [skywalking] wu-sheng commented on issue #10051: [Feature] [OAP] Consider removing metrics reload in persistent

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #10051:
URL: https://github.com/apache/skywalking/issues/10051#issuecomment-1330920915

   FYI @apache/skywalking-committers, in case you are interested in the SkyWalking OAP kernel logic.




[GitHub] [skywalking] wu-sheng commented on issue #10051: [Feature] [OAP] Consider removing metrics reload in persistent

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #10051:
URL: https://github.com/apache/skywalking/issues/10051#issuecomment-1330927169

   Ideally, with this enhancement, the ratio of cached metrics should approach 95%+, greatly reducing the read IOPS of the database.
   The OAP would then only read from the database when the CLI/UI queries metrics/traces/logs.

