You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by "AnonHxy (via GitHub)" <gi...@apache.org> on 2023/04/03 10:12:33 UTC

[GitHub] [pulsar] AnonHxy opened a new pull request, #20001: [improve][broker] Read cache misses metric for topic

AnonHxy opened a new pull request, #20001:
URL: https://github.com/apache/pulsar/pull/20001

   ### Motivation
   
   When threre are too many read request from bookie server,  we need find which topic is catch up reading.  So here we add  a `storage_read_cache_misses_rate` for topic level.
   
   ### Modifications
   
   Add a metric `storage_read_cache_misses_rate` for topic level.
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   - Add UT `org.apache.pulsar.broker.stats.PrometheusMetricsTest#testStorageReadCacheMissesRate` to verify it
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   
   ### Matching PR in forked repository
   
   PR in forked repository: <!-- ENTER URL HERE -->
   
   <!--
   After opening this PR, the build in apache/pulsar will fail and instructions will
   be provided for opening a PR in the PR author's forked repository.
   
   apache/pulsar pull requests should be first tested in your own fork since the 
   apache/pulsar CI based on GitHub Actions has constrained resources and quota.
   GitHub Actions provides separate quota for pull requests that are executed in 
   a forked repository.
   
   The tests will be run in the forked repository until all PR review comments have
   been handled, the tests pass and the PR is approved by a reviewer.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on a diff in pull request #20001: [improve][ml] Add Read cache misses metric for ledger

Posted by "michaeljmarshall (via GitHub)" <gi...@apache.org>.
michaeljmarshall commented on code in PR #20001:
URL: https://github.com/apache/pulsar/pull/20001#discussion_r1160725228


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/metrics/ManagedLedgerMetrics.java:
##########
@@ -108,6 +108,8 @@ private List<Metrics> aggregate(Map<Metrics, List<ManagedLedgerImpl>> ledgersByD
                         (double) lStats.getReadEntriesErrors());
                 populateAggregationMapWithSum(tempAggregatedMetricsMap, "brk_ml_ReadEntriesRate",
                         lStats.getReadEntriesRate());
+                populateAggregationMapWithSum(tempAggregatedMetricsMap, "brk_ml_ReadEntriesOpsCacheMissesRate",
+                        lStats.getReadEntriesOpsCacheMissesRate());

Review Comment:
   Note that these cache metrics already have a well known format.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy merged pull request #20001: [improve][ml] Add Read cache misses metric for ledger

Posted by "AnonHxy (via GitHub)" <gi...@apache.org>.
AnonHxy merged PR #20001:
URL: https://github.com/apache/pulsar/pull/20001


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on pull request #20001: [improve][ml] Add Read cache misses metric for ledger

Posted by "michaeljmarshall (via GitHub)" <gi...@apache.org>.
michaeljmarshall commented on PR #20001:
URL: https://github.com/apache/pulsar/pull/20001#issuecomment-1500320329

   > When there are too many read request to bookie server, we need find out which ledger is reading without cache. 
   
   @AnonHxy - was there not a bookie metric for this? Also, can you clarify why the existing cache metrics are insufficient?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on a diff in pull request #20001: [improve][ml] Add Read cache misses metric for ledger

Posted by "michaeljmarshall (via GitHub)" <gi...@apache.org>.
michaeljmarshall commented on code in PR #20001:
URL: https://github.com/apache/pulsar/pull/20001#discussion_r1160723377


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/metrics/ManagedLedgerMetrics.java:
##########
@@ -108,6 +108,8 @@ private List<Metrics> aggregate(Map<Metrics, List<ManagedLedgerImpl>> ledgersByD
                         (double) lStats.getReadEntriesErrors());
                 populateAggregationMapWithSum(tempAggregatedMetricsMap, "brk_ml_ReadEntriesRate",
                         lStats.getReadEntriesRate());
+                populateAggregationMapWithSum(tempAggregatedMetricsMap, "brk_ml_ReadEntriesOpsCacheMissesRate",
+                        lStats.getReadEntriesOpsCacheMissesRate());

Review Comment:
   Why didn't this go with the other cache metrics?
   
   https://github.com/apache/pulsar/blob/82237d3684fe506bcb6426b3b23f413422e6e4fb/pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/metrics/ManagedLedgerCacheMetrics.java#L49-L58



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on pull request #20001: [improve][ml] Add Read cache misses metric for ledger

Posted by "AnonHxy (via GitHub)" <gi...@apache.org>.
AnonHxy commented on PR #20001:
URL: https://github.com/apache/pulsar/pull/20001#issuecomment-1507987446

   > > When there are too many read request to bookie server, we need find out which ledger is reading without cache.
   > 
   > @AnonHxy - was there not a bookie metric for this? Also, can you clarify why the existing cache metrics are insufficient?
   
   Sorry for the late reply @michaeljmarshall .  It seems that  we couldn't find out which topic or ledgerId is reading without cache from the bookie metric.  Because there is no ledger tag in the booke metric.
   
   So This added metric will be useful to find out which topic is reading without cache.  For example, a catch-up reading  suddenly happens in our pulsar cluster. And too many reading requests will slow down the bookie cluster.  From this added metric we could find out the topic and do something to deal with it
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] codecov-commenter commented on pull request #20001: [improve][ml] Add Read cache misses metric for ledger

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #20001:
URL: https://github.com/apache/pulsar/pull/20001#issuecomment-1499935476

   ## [Codecov](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#20001](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1bab3a3) into [master](https://codecov.io/gh/apache/pulsar/commit/68c10eed7604aa3dcc3a6d8b548575e99b94dca2?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (68c10ee) will **decrease** coverage by `0.06%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pulsar/pull/20001/graphs/tree.svg?width=650&height=150&src=pr&token=acYqCpsK9J&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #20001      +/-   ##
   ============================================
   - Coverage     72.89%   72.84%   -0.06%     
   + Complexity    31619    31604      -15     
   ============================================
     Files          1861     1861              
     Lines        137356   137383      +27     
     Branches      15117    15118       +1     
   ============================================
   - Hits         100131   100075      -56     
   - Misses        29271    29356      +85     
   + Partials       7954     7952       -2     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | inttests | `24.37% <27.27%> (-0.07%)` | :arrow_down: |
   | systests | `24.96% <27.27%> (-0.08%)` | :arrow_down: |
   | unittests | `72.13% <100.00%> (-0.05%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ookkeeper/mledger/impl/ManagedLedgerMBeanImpl.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-bWFuYWdlZC1sZWRnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2Jvb2trZWVwZXIvbWxlZGdlci9pbXBsL01hbmFnZWRMZWRnZXJNQmVhbkltcGwuamF2YQ==) | `89.31% <100.00%> (+0.42%)` | :arrow_up: |
   | [...kkeeper/mledger/impl/cache/EntryCacheDisabled.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-bWFuYWdlZC1sZWRnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2Jvb2trZWVwZXIvbWxlZGdlci9pbXBsL2NhY2hlL0VudHJ5Q2FjaGVEaXNhYmxlZC5qYXZh) | `70.83% <100.00%> (+1.26%)` | :arrow_up: |
   | [...keeper/mledger/impl/cache/RangeEntryCacheImpl.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-bWFuYWdlZC1sZWRnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2Jvb2trZWVwZXIvbWxlZGdlci9pbXBsL2NhY2hlL1JhbmdlRW50cnlDYWNoZUltcGwuamF2YQ==) | `58.75% <100.00%> (+0.34%)` | :arrow_up: |
   | [...sar/broker/stats/metrics/ManagedLedgerMetrics.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHVsc2FyLWJyb2tlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcHVsc2FyL2Jyb2tlci9zdGF0cy9tZXRyaWNzL01hbmFnZWRMZWRnZXJNZXRyaWNzLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [...broker/stats/prometheus/AggregatedBrokerStats.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHVsc2FyLWJyb2tlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcHVsc2FyL2Jyb2tlci9zdGF0cy9wcm9tZXRoZXVzL0FnZ3JlZ2F0ZWRCcm9rZXJTdGF0cy5qYXZh) | `100.00% <100.00%> (ø)` | |
   | [...ker/stats/prometheus/AggregatedNamespaceStats.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHVsc2FyLWJyb2tlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcHVsc2FyL2Jyb2tlci9zdGF0cy9wcm9tZXRoZXVzL0FnZ3JlZ2F0ZWROYW1lc3BhY2VTdGF0cy5qYXZh) | `100.00% <100.00%> (ø)` | |
   | [...ar/broker/stats/prometheus/ManagedLedgerStats.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHVsc2FyLWJyb2tlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcHVsc2FyL2Jyb2tlci9zdGF0cy9wcm9tZXRoZXVzL01hbmFnZWRMZWRnZXJTdGF0cy5qYXZh) | `100.00% <100.00%> (ø)` | |
   | [...ker/stats/prometheus/NamespaceStatsAggregator.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHVsc2FyLWJyb2tlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcHVsc2FyL2Jyb2tlci9zdGF0cy9wcm9tZXRoZXVzL05hbWVzcGFjZVN0YXRzQWdncmVnYXRvci5qYXZh) | `89.78% <100.00%> (+0.16%)` | :arrow_up: |
   | [...che/pulsar/broker/stats/prometheus/TopicStats.java](https://codecov.io/gh/apache/pulsar/pull/20001?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHVsc2FyLWJyb2tlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcHVsc2FyL2Jyb2tlci9zdGF0cy9wcm9tZXRoZXVzL1RvcGljU3RhdHMuamF2YQ==) | `93.97% <100.00%> (+0.04%)` | :arrow_up: |
   
   ... and [74 files with indirect coverage changes](https://codecov.io/gh/apache/pulsar/pull/20001/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on pull request #20001: [improve][ml] Add Read cache misses metric for ledger

Posted by "AnonHxy (via GitHub)" <gi...@apache.org>.
AnonHxy commented on PR #20001:
URL: https://github.com/apache/pulsar/pull/20001#issuecomment-1498378661

   @codelipenghui @congbobo184 @Technoboy-  PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on a diff in pull request #20001: [improve][ml] Add Read cache misses metric for ledger

Posted by "AnonHxy (via GitHub)" <gi...@apache.org>.
AnonHxy commented on code in PR #20001:
URL: https://github.com/apache/pulsar/pull/20001#discussion_r1161046362


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/metrics/ManagedLedgerMetrics.java:
##########
@@ -108,6 +108,8 @@ private List<Metrics> aggregate(Map<Metrics, List<ManagedLedgerImpl>> ledgersByD
                         (double) lStats.getReadEntriesErrors());
                 populateAggregationMapWithSum(tempAggregatedMetricsMap, "brk_ml_ReadEntriesRate",
                         lStats.getReadEntriesRate());
+                populateAggregationMapWithSum(tempAggregatedMetricsMap, "brk_ml_ReadEntriesOpsCacheMissesRate",
+                        lStats.getReadEntriesOpsCacheMissesRate());

Review Comment:
   > Why didn't this go with the other cache metrics?
   > 
   
   I think that it's more reasonable that here It follows the style of other `ManagedLedgerMetrics`, like `brk_ml_ReadEntriesRate` and so on.  They all are aggregation by namespace, not aggregation by ledger.
   
   In other place we go with the well know format, like `pulsar_storage_read_cache_misses_rate` and `pulsar_broker_storage_read_cache_misses_rate`
   
   https://github.com/apache/pulsar/pull/20001/files#diff-79bde2a6feed80b6b6b4b6950bc7d4ccd93946a8ec0b40cc1bef2e7b3e498f21R155-R157
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org