Posted to commits@pulsar.apache.org by "teabot (via GitHub)" <gi...@apache.org> on 2023/02/15 12:41:53 UTC

[GitHub] [pulsar-site] teabot opened a new pull request, #412: [fix][doc] Describe approximate behavior of time-based quotas

teabot opened a new pull request, #412:
URL: https://github.com/apache/pulsar-site/pull/412

   Time-based quotas are applied approximately, yet the documentation implies they are strict. It would be useful to set proper user expectations.
   
   ### Documentation
   
   ![image](https://user-images.githubusercontent.com/228950/219029726-87c1c209-35e3-46ff-8545-0bc78129f3ac.png)
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [x] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `./preview.sh` at root path) to your PR description, or else your PR might not get merged. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
   - [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
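
The approximate behavior this PR describes can be illustrated with a toy model. This is a sketch under stated assumptions, not Pulsar's actual implementation: it assumes only that the quota is evaluated at fixed intervals rather than on every publish, and the function name and numbers are hypothetical.

```python
def overshoot_at_detection(limit_s: int, check_interval_s: int, horizon_s: int) -> int:
    """Toy model of periodic quota enforcement (hypothetical, not Pulsar code).

    The oldest unacknowledged message is published at t=0 and the quota is
    only evaluated every `check_interval_s` seconds. Returns how many seconds
    past the limit the backlog age is when the violation is first noticed.
    """
    for now in range(0, horizon_s + 1, check_interval_s):
        age = now  # age of the oldest message at this check tick
        if age > limit_s:
            return age - limit_s
    raise ValueError("no violation detected within the horizon")


# A 100-second limit checked every 60 seconds is first seen as violated at t=120.
coarse = overshoot_at_detection(limit_s=100, check_interval_s=60, horizon_s=600)
# Checking every second notices the violation at t=101.
fine = overshoot_at_detection(limit_s=100, check_interval_s=1, horizon_s=600)
```

In this model a shorter check interval narrows the window in which the limit can be exceeded unnoticed, at the price of more frequent checks.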
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar-site] Anonymitaet commented on a diff in pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "Anonymitaet (via GitHub)" <gi...@apache.org>.
Anonymitaet commented on code in PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#discussion_r1128935530


##########
docs/cookbooks-retention-expiry.md:
##########
@@ -473,3 +479,6 @@ The diagram below illustrates one of the cases that the consumed storage size is
 ![](/assets/retention-storage-size.svg)
 
 If you do not have any retention period and you never have much of a backlog, the upper limit for retained messages, which are acknowledged, equals the Pulsar segment rollover period + entry log rollover period + (garbage collection interval * garbage collection ratios).
+
+[backlogquotacheckintervalinseconds]: https://pulsar.apache.org/reference/#/next/config/reference-configuration-broker?id=backlogquotacheckintervalinseconds
+[precisetimebasedbacklogquotacheck]: https://pulsar.apache.org/reference/#/next/config/reference-configuration-broker?id=precisetimebasedbacklogquotacheck

Review Comment:
   [This comment](https://github.com/apache/pulsar-site/pull/412#discussion_r1113751246) was left two weeks ago. The reason for my suggestion (as a temporary workaround) at that time was:
   
   For example, if we want to publish a new next code release (2.12.0) with docs, the whole 2.12.x doc set is copied from `master`. If we use `https://pulsar.apache.org/reference/#/next/xxxxxxxx` in links, then the links in the old doc set will always point to the latest Reference site, which is incorrect.
   
    < < < < < < < < < < < < < < < <
   
   But now you can write it as `[precisetimebasedbacklogquotacheck]: https://pulsar.apache.org/reference/#/@pulsar:version_origin@/config/reference-configuration-broker?id=precisetimebasedbacklogquotacheck` since https://github.com/apache/pulsar-site/pull/456 was merged yesterday.
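
The effect of the placeholder can be sketched as a plain string substitution. This is only an illustration: how the site build actually resolves `@pulsar:version_origin@` is not shown in this thread, and the helper name and the version strings `next` and `2.12.x` used below are assumptions.

```python
PLACEHOLDER = "@pulsar:version_origin@"

# Link template taken from the comment above.
TEMPLATE = (
    "https://pulsar.apache.org/reference/#/@pulsar:version_origin@"
    "/config/reference-configuration-broker?id=precisetimebasedbacklogquotacheck"
)


def resolve_reference_link(template: str, version: str) -> str:
    """Replace the version placeholder with a concrete version (hypothetical helper)."""
    if PLACEHOLDER not in template:
        raise ValueError("template has no version placeholder")
    return template.replace(PLACEHOLDER, version)


# The same source line yields a different target for each doc set.
next_link = resolve_reference_link(TEMPLATE, "next")
released_link = resolve_reference_link(TEMPLATE, "2.12.x")
```

With a hard-coded `/next/` segment, both doc sets would share the first link, which is the problem described above.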
   





[GitHub] [pulsar-site] teabot commented on a diff in pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "teabot (via GitHub)" <gi...@apache.org>.
teabot commented on code in PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#discussion_r1126226304


##########
docs/cookbooks-retention-expiry.md:
##########
@@ -214,6 +214,8 @@ Backlog quotas are handled at the namespace level. They can be managed via:
 
 You can set a size and/or time threshold and backlog retention policy for all of the topics in a [namespace](reference-terminology.md#namespace) by specifying the namespace, a size limit and/or a time limit in second, and a policy by name.
 
+Note that by default, time-based backlogs are enforced periodically using an approximate method. This avoids a potentially costly scan of the backlog each time a message is produced. However, it does mean that in some cases you may observe a lack of strict enforcement. To tune this behavior you should consider using the [`backlogQuotaCheckIntervalInSeconds`][backlogquotacheckintervalinseconds] and [`preciseTimeBasedBacklogQuotaCheck`][precisetimebasedbacklogquotacheck] broker options.

Review Comment:
   sure





[GitHub] [pulsar-site] tisonkun commented on pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "tisonkun (via GitHub)" <gi...@apache.org>.
tisonkun commented on PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#issuecomment-1562174190

   Note that it doesn't mean this PR is rejected - it's good to have. Just a notification to roll up the stale PR.




[GitHub] [pulsar-site] Anonymitaet commented on a diff in pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "Anonymitaet (via GitHub)" <gi...@apache.org>.
Anonymitaet commented on code in PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#discussion_r1113744231


##########
docs/cookbooks-retention-expiry.md:
##########
@@ -214,6 +214,8 @@ Backlog quotas are handled at the namespace level. They can be managed via:
 
 You can set a size and/or time threshold and backlog retention policy for all of the topics in a [namespace](reference-terminology.md#namespace) by specifying the namespace, a size limit and/or a time limit in second, and a policy by name.
 
+Note that by default, time-based backlogs are enforced periodically using an approximate method. This avoids a potentially costly scan of the backlog each time a message is produced. However, it does mean that in some cases you may observe a lack of strict enforcement. To tune this behavior you should consider using the [`backlogQuotaCheckIntervalInSeconds`][backlogquotacheckintervalinseconds] and [`preciseTimeBasedBacklogQuotaCheck`][precisetimebasedbacklogquotacheck] broker options.

Review Comment:
   These suggestions apply to other md files as well. Can you update them all?





[GitHub] [pulsar-site] Anonymitaet commented on a diff in pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "Anonymitaet (via GitHub)" <gi...@apache.org>.
Anonymitaet commented on code in PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#discussion_r1113751246


##########
docs/cookbooks-retention-expiry.md:
##########
@@ -473,3 +479,6 @@ The diagram below illustrates one of the cases that the consumed storage size is
 ![](/assets/retention-storage-size.svg)
 
 If you do not have any retention period and you never have much of a backlog, the upper limit for retained messages, which are acknowledged, equals the Pulsar segment rollover period + entry log rollover period + (garbage collection interval * garbage collection ratios).
+
+[backlogquotacheckintervalinseconds]: https://pulsar.apache.org/reference/#/next/config/reference-configuration-broker?id=backlogquotacheckintervalinseconds
+[precisetimebasedbacklogquotacheck]: https://pulsar.apache.org/reference/#/next/config/reference-configuration-broker?id=precisetimebasedbacklogquotacheck

Review Comment:
   1. What's the context of these two paragraphs? Do you mean this?
   
   ```suggestion
   To configure checks, you can use the `backlogquotacheckintervalinseconds` and
   `precisetimebasedbacklogquotacheck` parameters. For details, see **Configuration > Pulsar > Broker** on the [Pulsar Reference Site](https://pulsar.apache.org/reference).
   ```
   
   2. I suggest using a general link (https://pulsar.apache.org/reference) instead of specific links to reduce maintenance costs and potential errors. We've applied this strategy across all docs.
   



##########
docs/cookbooks-retention-expiry.md:
##########
@@ -214,6 +214,12 @@ Backlog quotas are handled at the namespace level. They can be managed via:
 
 You can set a size and/or time threshold and backlog retention policy for all of the topics in a [namespace](reference-terminology.md#namespace) by specifying the namespace, a size limit and/or a time limit in second, and a policy by name.
 
+::: note 
+
+By default, time-based backlogs are enforced periodically using an approximate method. This avoids a potentially costly scan of the backlog each time a message is produced. However, it does mean that in some cases you may observe a lack of strict enforcement. To tune this behavior, you should consider using the [`backlogQuotaCheckIntervalInSeconds`][backlogquotacheckintervalinseconds] and [`preciseTimeBasedBacklogQuotaCheck`][precisetimebasedbacklogquotacheck] broker options.

Review Comment:
   What does this mean?
   
   > [`xxx`][xxx] 
   
   <img width="733" alt="image" src="https://user-images.githubusercontent.com/50226895/220500831-bbca54e4-1568-4416-970f-e4986148aaec.png">
   
   Do you want to highlight the para? You can use `xxx` instead.







[GitHub] [pulsar-site] tisonkun commented on pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "tisonkun (via GitHub)" <gi...@apache.org>.
tisonkun commented on PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#issuecomment-1562173280

   This PR has code conflicts and has been stale for a while. I'll close it; feel free to recreate a PR based on the latest master and ping me for review.




[GitHub] [pulsar-site] tisonkun closed pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "tisonkun (via GitHub)" <gi...@apache.org>.
tisonkun closed pull request #412: [fix][doc] Describe approximate behavior of time-based quotas
URL: https://github.com/apache/pulsar-site/pull/412




[GitHub] [pulsar-site] Anonymitaet commented on a diff in pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "Anonymitaet (via GitHub)" <gi...@apache.org>.
Anonymitaet commented on code in PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#discussion_r1111410824


##########
docs/cookbooks-retention-expiry.md:
##########
@@ -214,6 +214,8 @@ Backlog quotas are handled at the namespace level. They can be managed via:
 
 You can set a size and/or time threshold and backlog retention policy for all of the topics in a [namespace](reference-terminology.md#namespace) by specifying the namespace, a size limit and/or a time limit in second, and a policy by name.
 
+Note that by default, time-based backlogs are enforced periodically using an approximate method. This avoids a potentially costly scan of the backlog each time a message is produced. However, it does mean that in some cases you may observe a lack of strict enforcement. To tune this behavior you should consider using the [`backlogQuotaCheckIntervalInSeconds`][backlogquotacheckintervalinseconds] and [`preciseTimeBasedBacklogQuotaCheck`][precisetimebasedbacklogquotacheck] broker options.

Review Comment:
   ```suggestion
   ::: note 
   
   By default, time-based backlogs are enforced periodically using an approximate method. This avoids a potentially costly scan of the backlog each time a message is produced. However, it does mean that in some cases you may observe a lack of strict enforcement. To tune this behavior, you should consider using the [`backlogQuotaCheckIntervalInSeconds`][backlogquotacheckintervalinseconds] and [`preciseTimeBasedBacklogQuotaCheck`][precisetimebasedbacklogquotacheck] broker options.
   
   :::
   ```





[GitHub] [pulsar-site] teabot commented on a diff in pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "teabot (via GitHub)" <gi...@apache.org>.
teabot commented on code in PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#discussion_r1126239862


##########
docs/cookbooks-retention-expiry.md:
##########
@@ -473,3 +479,6 @@ The diagram below illustrates one of the cases that the consumed storage size is
 ![](/assets/retention-storage-size.svg)
 
 If you do not have any retention period and you never have much of a backlog, the upper limit for retained messages, which are acknowledged, equals the Pulsar segment rollover period + entry log rollover period + (garbage collection interval * garbage collection ratios).
+
+[backlogquotacheckintervalinseconds]: https://pulsar.apache.org/reference/#/next/config/reference-configuration-broker?id=backlogquotacheckintervalinseconds
+[precisetimebasedbacklogquotacheck]: https://pulsar.apache.org/reference/#/next/config/reference-configuration-broker?id=precisetimebasedbacklogquotacheck

Review Comment:
   I disagree with this practice. It may be easier for authors, but it's really difficult for readers to find what they need. The target page has hundreds of elements, is nested, is not searchable, and is not sorted.
   
   I see in your suggestion that you still provide context to the reader: "see **Configuration > Pulsar > Broker**". Doesn't this suffer from the same maintenance and error issues, while also being less useful because it isn't encoded in the link?





[GitHub] [pulsar-site] teabot commented on a diff in pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "teabot (via GitHub)" <gi...@apache.org>.
teabot commented on code in PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#discussion_r1126227911


##########
docs/cookbooks-retention-expiry.md:
##########
@@ -214,6 +214,12 @@ Backlog quotas are handled at the namespace level. They can be managed via:
 
 You can set a size and/or time threshold and backlog retention policy for all of the topics in a [namespace](reference-terminology.md#namespace) by specifying the namespace, a size limit and/or a time limit in second, and a policy by name.
 
+::: note 
+
+By default, time-based backlogs are enforced periodically using an approximate method. This avoids a potentially costly scan of the backlog each time a message is produced. However, it does mean that in some cases you may observe a lack of strict enforcement. To tune this behavior, you should consider using the [`backlogQuotaCheckIntervalInSeconds`][backlogquotacheckintervalinseconds] and [`preciseTimeBasedBacklogQuotaCheck`][precisetimebasedbacklogquotacheck] broker options.

Review Comment:
   Those are links:
   https://www.markdownguide.org/basic-syntax/#reference-style-links
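
How a reference-style link pairs with its definition can be sketched with a small resolver. This is a simplified illustration based on two regular expressions, not a full Markdown parser, and the function name is hypothetical.

```python
import re

# A definition line looks like: [label]: https://example/target
DEFINITION = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$", re.MULTILINE)
# A reference looks like: [link text][label]
REFERENCE = re.compile(r"\[([^\]]+)\]\[([^\]]+)\]")


def inline_reference_links(markdown: str) -> str:
    """Rewrite [text][label] as [text](url) using the [label]: url definitions."""
    # Labels are matched case-insensitively.
    targets = {label.lower(): url for label, url in DEFINITION.findall(markdown)}
    body = DEFINITION.sub("", markdown)

    def to_inline(match: re.Match) -> str:
        text, label = match.group(1), match.group(2)
        url = targets.get(label.lower())
        # Leave the reference untouched when no definition exists.
        return match.group(0) if url is None else f"[{text}]({url})"

    return REFERENCE.sub(to_inline, body).strip()


doc = (
    "Use [`preciseTimeBasedBacklogQuotaCheck`][precisetimebasedbacklogquotacheck].\n"
    "\n"
    "[precisetimebasedbacklogquotacheck]: https://pulsar.apache.org/reference/#/next/"
    "config/reference-configuration-broker?id=precisetimebasedbacklogquotacheck\n"
)
resolved = inline_reference_links(doc)
```

The definition can sit anywhere in the file, which is why the diff adds it at the end of the document while the link text appears in the note.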







[GitHub] [pulsar-site] momo-jun commented on a diff in pull request #412: [fix][doc] Describe approximate behavior of time-based quotas

Posted by "momo-jun (via GitHub)" <gi...@apache.org>.
momo-jun commented on code in PR #412:
URL: https://github.com/apache/pulsar-site/pull/412#discussion_r1129084110


##########
docs/cookbooks-retention-expiry.md:
##########
@@ -214,6 +214,12 @@ Backlog quotas are handled at the namespace level. They can be managed via:
 
 You can set a size and/or time threshold and backlog retention policy for all of the topics in a [namespace](reference-terminology.md#namespace) by specifying the namespace, a size limit and/or a time limit in second, and a policy by name.
 
+::: note 

Review Comment:
   ```suggestion
   :::note 
   ```


