Posted to commits@pulsar.apache.org by hj...@apache.org on 2020/10/26 01:01:32 UTC

[pulsar] branch master updated: [Issue 8345][Documentation] Improve retention policy documentation (#8356)

This is an automated email from the ASF dual-hosted git repository.

hjf pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git


The following commit(s) were added to refs/heads/master by this push:
     new a9633b5  [Issue 8345][Documentation] Improve retention policy documentation (#8356)
a9633b5 is described below

commit a9633b5b95ebe37c333c8eac59634a123ae2f2e3
Author: Lari Hotari <lh...@users.noreply.github.com>
AuthorDate: Mon Oct 26 03:01:18 2020 +0200

    [Issue 8345][Documentation] Improve retention policy documentation (#8356)
    
    * Clarify javadoc documentation of ManagedLedger retention time and retention size
    
    - explain the settings in other words to clarify the meaning
    
    * Clarify retention policy documentation
    
    Fixes #8345
    
    - the retention policy is based on both size and time
      - documentation was misleading and not accurate
    - setting either limit to 0 disables retention policy
      - add this also to the documentation explicitly
    
    * Fix checkstyle violation
---
 .../bookkeeper/mledger/ManagedLedgerConfig.java    | 23 +++++++-------
 .../common/policies/data/RetentionPolicies.java    |  6 ++++
 site2/docs/cookbooks-retention-expiry.md           | 35 +++++++++++++++++-----
 site2/docs/reference-terminology.md                |  2 +-
 4 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedLedgerConfig.java b/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedLedgerConfig.java
index 7a6dcbf..23dff99 100644
--- a/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedLedgerConfig.java
+++ b/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedLedgerConfig.java
@@ -378,15 +378,16 @@ public class ManagedLedgerConfig {
     }
 
     /**
-     * Set the retention time for the ManagedLedger
+     * Set the retention time for the ManagedLedger.
      * <p>
-     * Retention time will prevent data from being deleted for at least the specified amount of time, even if no cursors
-     * are created, or if all the cursors have marked the data for deletion.
+     * Retention time and retention size ({@link #setRetentionSizeInMB(long)}) are together used to retain the
+     * ledger data when there are no cursors or when all the cursors have marked the data for deletion.
+     * Data will be deleted in this case when both retention time and retention size settings don't prevent deleting
+     * the data marked for deletion.
      * <p>
-     * A retention time of 0 (the default), will to have no time based retention.
+     * A retention time of 0 (default) causes data to be deleted immediately.
      * <p>
-     * Specifying a negative retention time will make the data to be retained indefinitely, based on the
-     * {@link #setRetentionSizeInMB(long)} value.
+     * A retention time of -1 means an unlimited retention time.
      *
      * @param retentionTime
      *            duration for which messages should be retained
@@ -409,12 +410,14 @@ public class ManagedLedgerConfig {
     /**
      * The retention size is used to set a maximum retention size quota on the ManagedLedger.
      * <p>
-     * This setting works in conjuction with {@link #setRetentionSizeInMB(long)} and places a max size for retention,
-     * after which the data is deleted.
+     * Retention size and retention time ({@link #setRetentionTime(int, TimeUnit)}) are together used to retain the
+     * ledger data when there are no cursors or when all the cursors have marked the data for deletion.
+     * Data will be deleted in this case when both retention time and retention size settings don't prevent deleting
+     * the data marked for deletion.
      * <p>
-     * A retention size of 0, will make data to be deleted immediately.
+     * A retention size of 0 (default) causes data to be deleted immediately.
      * <p>
-     * A retention size of -1, means to have an unlimited retention size.
+     * A retention size of -1 means an unlimited retention size.
      *
      * @param retentionSizeInMB
      *            quota for message retention
diff --git a/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/RetentionPolicies.java b/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/RetentionPolicies.java
index c4b8688..4049708 100644
--- a/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/RetentionPolicies.java
+++ b/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/RetentionPolicies.java
@@ -20,6 +20,12 @@ package org.apache.pulsar.common.policies.data;
 
 /**
  * Definition of the retention policy.
+ *
+ * <p>When you set a retention policy you must set **both** a *size limit* and a *time limit*.
+ * In the case where you don't want to limit by either time or size, the value must be set to `-1`.
+ * Retention policy will be effectively disabled and it won't prevent the deletion of acknowledged
+ * messages when either size or time limit is set to `0`.
+ * Infinite retention can be achieved by setting both time and size limits to `-1`.
  */
 public class RetentionPolicies {
     private int retentionTimeInMinutes;
diff --git a/site2/docs/cookbooks-retention-expiry.md b/site2/docs/cookbooks-retention-expiry.md
index ce40e65..286f0ff 100644
--- a/site2/docs/cookbooks-retention-expiry.md
+++ b/site2/docs/cookbooks-retention-expiry.md
@@ -27,15 +27,17 @@ Pulsar's [admin interface](admin-api-overview.md) enables you to manage both ret
 
 ## Retention policies
 
-By default, when a Pulsar message arrives at a broker it will be stored until it has been acknowledged on all subscriptions, at which point it will be marked for deletion. You can override this behavior and retain even messages that have already been acknowledged on all subscriptions by setting a *retention policy* for all topics in a given namespace. Retention policies are either a *size limit* or a *time limit*.
+By default, when a Pulsar message arrives at a broker it will be stored until it has been acknowledged on all subscriptions, at which point it will be marked for deletion. You can override this behavior and retain even messages that have already been acknowledged on all subscriptions by setting a *retention policy* for all topics in a given namespace. Retention is based on both a *size limit* and a *time limit*.
 
 Retention policies are particularly useful if you intend to exclusively use the Reader interface. Because the Reader interface does not use acknowledgements, messages will never exist within backlogs. Most realistic Reader-only use cases require that retention be configured.
 
-When you set a size limit of, say, 10 gigabytes, then acknowledged messages in all topics in the namespace will be retained until the size limit for the topic is reached; if you set a time limit of, say, 1 day, then acknowledged messages for all topics in the namespace will be retained for 24 hours. The retention settings apply to all messages on topics that do not have any subscriptions, or if there are subscriptions, to messages that have been acked by all subscriptions. The retention  [...]
+When you set a retention policy you must set **both** a *size limit* and a *time limit*. In the case where you don't want to limit by either time or size, the value must be set to `-1`. The retention policy is effectively disabled, and won't prevent the deletion of acknowledged messages, when either the size limit or the time limit is set to `0`. Infinite retention can be achieved by setting both the time limit and the size limit to `-1`.
 
-When a retention limit is exceeded, the oldest message is marked for deletion until the set of retained messages falls within the specified limits again.
+When you set a size limit of, say, 10 gigabytes, and the time limit to `-1`, then acknowledged messages in all topics in the namespace will be retained until the size limit for the topic is reached; if you set a time limit of, say, 1 day, and the size limit to `-1`, then acknowledged messages for all topics in the namespace will be retained for 24 hours.
+
+The retention settings apply to all messages on topics that do not have any subscriptions, or if there are subscriptions, to messages that have been acked by all subscriptions. The retention policy settings do not affect unacknowledged messages on topics with subscriptions -- these are instead controlled by the backlog quota (see below).
 
-It is also possible to set *unlimited* retention time or size by setting `-1` for either time or size retention.
+When a retention limit is exceeded, the oldest message is marked for deletion until the set of retained messages falls within the specified limits again.
 
 ### Defaults
 
@@ -49,7 +51,9 @@ You can set a retention policy for a namespace by specifying the namespace as we
 
 #### pulsar-admin
 
-Use the [`set-retention`](reference-pulsar-admin.md#namespaces-set-retention) subcommand and specify a namespace, a size limit using the `-s`/`--size` flag, and a time limit using the `-t`/`--time` flag.
+Use the [`set-retention`](reference-pulsar-admin.md#namespaces-set-retention) subcommand and specify a namespace, a size limit using the `-s`/`--size` flag, and a time limit using the `-t`/`--time` flag. 
+
+You must set **both** a *size limit* and a *time limit*. In the case where you don't want to limit by either time or size, the value must be set to `-1`. The retention policy is effectively disabled, and won't prevent the deletion of acknowledged messages, when either the size limit or the time limit is set to `0`.
 
 ##### Examples
 
@@ -61,7 +65,7 @@ $ pulsar-admin namespaces set-retention my-tenant/my-ns \
   --time 3h
 ```
 
-To set retention with a size limit but without a time limit:
+To set retention where the time limit is ignored and the size limit of 1 terabyte determines retention:
 
 ```shell
 $ pulsar-admin namespaces set-retention my-tenant/my-ns \
@@ -69,7 +73,15 @@ $ pulsar-admin namespaces set-retention my-tenant/my-ns \
   --time -1
 ```
 
-Retention can be configured to be unlimited both in size and time:
+To set retention where the size limit is ignored and the time limit of 3 hours determines retention:
+
+```shell
+$ pulsar-admin namespaces set-retention my-tenant/my-ns \
+  --size -1 \
+  --time 3h
+```
+
+To set infinite retention:
 
 ```shell
 $ pulsar-admin namespaces set-retention my-tenant/my-ns \
@@ -77,6 +89,13 @@ $ pulsar-admin namespaces set-retention my-tenant/my-ns \
   --time -1
 ```
 
+To disable the retention policy:
+
+```shell
+$ pulsar-admin namespaces set-retention my-tenant/my-ns \
+  --size 0 \
+  --time 0
+```
 
 
 #### REST API
@@ -106,7 +125,7 @@ Use the [`get-retention`](reference-pulsar-admin.md#namespaces) subcommand and s
 $ pulsar-admin namespaces get-retention my-tenant/my-ns
 {
   "retentionTimeInMinutes": 10,
-  "retentionSizeInMB": 0
+  "retentionSizeInMB": 500
 }
 ```
 
diff --git a/site2/docs/reference-terminology.md b/site2/docs/reference-terminology.md
index 6b4845c..c6d40aa 100644
--- a/site2/docs/reference-terminology.md
+++ b/site2/docs/reference-terminology.md
@@ -88,7 +88,7 @@ A message that has been delivered to a consumer for processing but not yet confi
 
 #### Retention Policy
 
-Size and/or time limits that you can set on a [namespace](#namespace) to configure retention of [messages](#message)
+Size and time limits that you can set on a [namespace](#namespace) to configure retention of [messages](#message)
 that have already been [acknowledged](#acknowledgement-ack).
 
 #### Multi-Tenancy
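
The rules this commit documents (either limit set to `0` disables retention, both limits set to `-1` gives infinite retention) can be illustrated with a small sketch in plain Java. This is not code from the Pulsar repository: the class `RetentionSketch`, the enum `Mode`, and the method `classify` are hypothetical names used only for illustration, and the sketch only classifies a pair of settings; it does not model how the broker actually trims ledgers.

```java
public class RetentionSketch {

    /** Hypothetical classification of a (time limit, size limit) pair. */
    public enum Mode { DISABLED, INFINITE, LIMITED }

    /**
     * Classifies retention settings according to the documented rules.
     * Assumption: any value other than 0 and -1 is treated as a finite
     * limit; input validation is omitted in this sketch.
     */
    public static Mode classify(int retentionTimeInMinutes, long retentionSizeInMB) {
        // Either limit set to 0 effectively disables the retention policy.
        if (retentionTimeInMinutes == 0 || retentionSizeInMB == 0) {
            return Mode.DISABLED;
        }
        // Both limits set to -1 means infinite retention.
        if (retentionTimeInMinutes == -1 && retentionSizeInMB == -1) {
            return Mode.INFINITE;
        }
        // Otherwise at least one finite limit determines retention.
        return Mode.LIMITED;
    }

    public static void main(String[] args) {
        System.out.println(classify(180, -1)); // 3 hours, size limit ignored
        System.out.println(classify(-1, -1));  // no time limit, no size limit
        System.out.println(classify(10, 0));   // size limit of 0
    }
}
```

Note that a pair such as `(10, 0)`, like the `get-retention` output shown before this change, falls into the disabled case, which is presumably why the example output was updated to a non-zero size.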