Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/03/14 02:15:26 UTC

[GitHub] [pulsar] momo-jun commented on a change in pull request #14546: [Doc] Add more explanations and images for retention, backlog and TTL

momo-jun commented on a change in pull request #14546:
URL: https://github.com/apache/pulsar/pull/14546#discussion_r825549537



##########
File path: site2/docs/cookbooks-retention-expiry.md
##########
@@ -358,11 +371,22 @@ admin.namespaces().removeNamespaceMessageTTL(namespace)
 
 ## Delete messages from namespaces
 
-If you do not have any retention period and that you never have much of a backlog, the upper limit for retaining messages, which are acknowledged, equals to the Pulsar segment rollover period + entry log rollover period + (garbage collection interval * garbage collection ratios).
+In terms of physical storage size, message expiry and retention are two sides of the same coin.
+* The backlog quota and TTL parameters prevent disk usage from growing indefinitely, as Pulsar’s default behaviour is to persist unacknowledged messages.
+* The retention policy reserves storage space for acknowledged messages that Pulsar would otherwise delete by default.
+
+In conclusion, the size of your physical storage should accommodate the sum of the backlog quota and the retention size.
+
+The message deletion rate (the rate at which disk space is released) is determined by multiple factors:
 
 - **Segment rollover period**: how often a new segment is created. Once a new segment is created, the old segment can be deleted. By default, a rollover happens either when 50,000 entries (messages) have been written or after 240 minutes. You can tune this in your broker.
 
 - **Entry log rollover period**: multiple ledgers in BookKeeper are interleaved into an [entry log](https://bookkeeper.apache.org/docs/4.11.1/getting-started/concepts/#entry-logs). For the disk space of a deleted ledger to be reclaimed, every entry log that contains its entries must be rolled over.
 The entry log rollover period is configurable, but it is based purely on the entry log size. For details, see [here](https://bookkeeper.apache.org/docs/4.11.1/reference/config/#entry-log-settings). Once an entry log is rolled over, it can be garbage collected.
 
 - **Garbage collection interval**: because entry logs contain interleaved ledgers, they need to be rewritten to free up space. The garbage collection interval is how often BookKeeper performs garbage collection, which is related to minor compaction and major compaction of entry logs. For details, see [here](https://bookkeeper.apache.org/docs/4.11.1/reference/config/#entry-log-compaction-settings).
+
+The diagram below illustrates one case in which the consumed storage size is larger than the given limits for backlog and retention: messages over the retention limit are kept because other messages in the same segment are still within the retention period.

Review comment:
       Updated. Thanks, Yu.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
