Posted to commits@geode.apache.org by db...@apache.org on 2020/10/01 15:23:48 UTC

[geode] branch support/1.13 updated: GEODE-8533: Docs - compaction-threshold mechanism description are wrong (#5549)

This is an automated email from the ASF dual-hosted git repository.

dbarnes pushed a commit to branch support/1.13
in repository https://gitbox.apache.org/repos/asf/geode.git


The following commit(s) were added to refs/heads/support/1.13 by this push:
     new 09bccd5  GEODE-8533: Docs - compaction-threshold mechanism description are wrong (#5549)
09bccd5 is described below

commit 09bccd5cc70a874c479f353724bbd01feaaa7015
Author: Dave Barnes <db...@apache.org>
AuthorDate: Thu Oct 1 08:19:05 2020 -0700

    GEODE-8533: Docs - compaction-threshold mechanism description are wrong (#5549)
    
    * GEODE-8533: User Guide - compaction-threshold is properly described as percentage of live data, below which an OpLog is marked for compaction
---
 .../disk_storage/compacting_disk_stores.html.md.erb     | 17 +++++++++++++----
 .../disk_store_configuration_params.html.md.erb         |  7 +++++--
 .../managing/disk_storage/using_disk_stores.html.md.erb |  6 +++---
 .../gfsh/command-pages/compact.html.md.erb              | 12 ++++--------
 .../tools_modules/gfsh/command-pages/create.html.md.erb |  2 +-
 5 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/geode-docs/managing/disk_storage/compacting_disk_stores.html.md.erb b/geode-docs/managing/disk_storage/compacting_disk_stores.html.md.erb
index 518aabd..eed516b 100644
--- a/geode-docs/managing/disk_storage/compacting_disk_stores.html.md.erb
+++ b/geode-docs/managing/disk_storage/compacting_disk_stores.html.md.erb
@@ -20,23 +20,32 @@ limitations under the License.
 -->
 
 <a id="compacting_disk_stores__section_64BA304595364E38A28098EB09494531"></a>
-When a cache operation is added to a disk store, any preexisting operation record for the same entry becomes obsolete, and <%=vars.product_name_long%> marks it as garbage. For example, when you create an entry, the create operation is added to the store. If you update the entry later, the update operation is added and the create operation becomes garbage. <%=vars.product_name%> does not remove garbage records as it goes, but it tracks the percentage of garbage in each operation log, and  [...]
+When a cache operation is added to a disk store, any preexisting operation record for the same entry
+becomes obsolete, and <%=vars.product_name_long%> marks it as garbage. For example, when you create
+an entry, the create operation is added to the store. If you update the entry later, the update
+operation is added and the create operation becomes garbage. <%=vars.product_name%> does not remove
+garbage records as it goes, but it tracks the percentage of non-garbage (live data) remaining in each operation log, and
+provides mechanisms for removing garbage to compact your log files.
 
 <%=vars.product_name%> compacts an old operation log by copying all non-garbage records into the current log and discarding the old files. As with logging, oplogs are rolled as needed during compaction to stay within the max oplog setting.
 
-You can configure the system to automatically compact any closed operation log when its garbage content reaches a certain percentage. You can also manually request compaction for online and offline disk stores. For the online disk store, the current operation log is not available for compaction, no matter how much garbage it contains.
+The system is configured by default to automatically compact any closed operation log when its non-garbage
+content drops below a certain percentage. This automatic compaction is well suited to most <%=vars.product_name%> implementations.
+In some circumstances, you may choose to  manually initiate compaction for online and
+offline disk stores. For the online disk store, the current operation log is not available for
+compaction, no matter how much garbage it contains.
 
 ## <a id="compacting_disk_stores__section_98C6B6F48E4F4F0CB7749E426AF4D647" class="no-quick-link"></a>Log File Compaction for the Online Disk Store
 
 <img src="../../images/diskStores-3.gif" id="compacting_disk_stores__image_7E34CC58B13548B196DAA15F5B0A0ECA" class="image" />
 
-Offline compaction runs essentially in the same way, but without the incoming cache operations. Also, because there is no current open log, the compaction creates a new one to get started.
+Offline compaction runs essentially in the same way, but without the incoming cache operations. Also, because there is no currently open log, the compaction creates a new one to get started.
 
 ## <a id="compacting_disk_stores__section_96E774B5502648458E7742B37CA235FF" class="no-quick-link"></a>Run Online Compaction
 
 Old log files become eligible for online compaction when their garbage content surpasses a configured percentage of the total file. A record is garbage when its operation is superseded by a more recent operation for the same object. During compaction, the non-garbage records are added to the current log along with new cache operations. Online compaction does not block current system operations.
 
--   **Automatic compaction**. When `auto-compact` is true, <%=vars.product_name%> automatically compacts each oplog when its garbage content surpasses the `compaction-threshold`. This takes cycles from your other operations, so you may want to disable this and only do manual compaction, to control the timing.
+-   **Automatic compaction**. When `auto-compact` is true, <%=vars.product_name%> automatically compacts each oplog when its non-garbage (live data) content drops below the `compaction-threshold`. This takes cycles from your other operations, so you may want to disable this and only do manual compaction, to control the timing.
 -   **Manual compaction**. To run manual compaction:
     -   Set the disk store attribute `allow-force-compaction` to true. This causes <%=vars.product_name%> to maintain extra data about the files so it can compact on demand. This is disabled by default to save space. You can run manual online compaction at any time while the system is running. Oplogs eligible for compaction based on the `compaction-threshold` are compacted into the current oplog.
     -   Run manual compaction as needed. <%=vars.product_name%> has two types of manual compaction:
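
For illustration, a minimal gfsh sketch of the two manual-compaction paths described above. The store name `exampleDiskStore` and directory `exampleDiskStoreDir` are placeholders, not values from this commit, and the online form assumes the store was created with `allow-force-compaction=true`:

``` pre
compact disk-store --name=exampleDiskStore
compact offline-disk-store --name=exampleDiskStore --disk-dirs=exampleDiskStoreDir
```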
diff --git a/geode-docs/managing/disk_storage/disk_store_configuration_params.html.md.erb b/geode-docs/managing/disk_storage/disk_store_configuration_params.html.md.erb
index bfb4d55..cdecd63 100644
--- a/geode-docs/managing/disk_storage/disk_store_configuration_params.html.md.erb
+++ b/geode-docs/managing/disk_storage/disk_store_configuration_params.html.md.erb
@@ -51,12 +51,15 @@ These `<disk-store>` attributes and subelements have corresponding `gfsh create
 </tr>
 <tr>
 <td><code class="ph codeph">auto-compact</code></td>
-<td>Boolean indicating whether to automatically compact a file when it reaches the <code class="ph codeph">compaction-threshold</code>.</td>
+<td>Boolean indicating whether to automatically compact a file when its live data content percentage drops below the <code class="ph codeph">compaction-threshold</code>.</td>
 <td>true</td>
 </tr>
 <tr>
 <td><code class="ph codeph">compaction-threshold</code></td>
-<td>Percentage of garbage allowed in the file before it is eligible for compaction. Garbage is created by entry destroys, entry updates, and region destroys and creates. Surpassing this percentage does not make compaction occur—it makes the file eligible to be compacted when a compaction is done.</td>
+<td>Percentage (0..100) of live data (non-garbage content) remaining in the operation log, below which it is eligible for
+compaction. As garbage is created (by entry destroys, entry updates, and region destroys and
+creates) the percentage of remaining live data declines. Falling below this percentage initiates compaction
+if auto-compaction is turned on. If not, the file will be eligible for manual compaction at a later time.</td>
 <td>50</td>
 </tr>
 <tr>
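
For illustration, a minimal cache.xml sketch of how the attributes in the table above are declared; the store name, directory, and the threshold value of 40 are placeholders rather than values from this commit:

``` pre
<disk-store name="exampleDiskStore"
            auto-compact="true"
            compaction-threshold="40"
            allow-force-compaction="false">
  <!-- placeholder directory; oplogs here become eligible for compaction
       once their live data drops below 40% -->
  <disk-dirs>
    <disk-dir>exampleDiskStoreDir</disk-dir>
  </disk-dirs>
</disk-store>
```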
diff --git a/geode-docs/managing/disk_storage/using_disk_stores.html.md.erb b/geode-docs/managing/disk_storage/using_disk_stores.html.md.erb
index dcce9fb..6fa31f2 100644
--- a/geode-docs/managing/disk_storage/using_disk_stores.html.md.erb
+++ b/geode-docs/managing/disk_storage/using_disk_stores.html.md.erb
@@ -35,13 +35,13 @@ Before you begin, you should understand <%=vars.product_name%> [Basic Configurat
 
 1.  Work with your system designers and developers to plan for anticipated disk storage requirements in your testing and production caching systems. Take into account space and functional requirements.
     -   For efficiency, separate data that is only overflowed in separate disk stores from data that is persisted or persisted and overflowed. Regions can be overflowed, persisted, or both. Server subscription queues are only overflowed.
-    -   When calculating your disk requirements, figure in your data modification patterns and your compaction strategy. <%=vars.product_name%> creates each oplog file at the max-oplog-size, which defaults to 1 GB. Obsolete operations are only removed from the oplogs during compaction, so you need enough space to store all operations that are done between compactions. For regions where you are doing a mix of updates and deletes, if you use automatic compaction, a good upper bound for the [...]
+    -   When calculating your disk requirements, figure in your data modification patterns and your compaction strategy. <%=vars.product_name%> creates each oplog file at the max-oplog-size, which defaults to 1 GB. Obsolete operations are removed from the oplogs only during compaction, so you need enough space to store all operations that are done between compactions. For regions where you are doing a mix of updates and deletes, if you use automatic compaction, a good upper bound for the [...]
 
         ``` pre
-        (1 / (1 - (compaction_threshold/100)) ) * data size
+        (1 / (compaction_threshold/100) ) * data size
         ```
 
-        where data size is the total size of all the data you store in the disk store. So, for the default compaction-threshold of 50, the disk space is roughly twice your data size. Note that the compaction thread could lag behind other operations, causing disk use to rise above the threshold temporarily. If you disable automatic compaction, the amount of disk required depends on how many obsolete operations accumulate between manual compactions.
+        where data size is the total size of all the data you store in the disk store. So, for the default compaction-threshold of 50, the disk space is roughly twice your data size. Note that the compaction thread could lag behind other operations, causing disk use to rise temporarily above the upper bound. If you disable automatic compaction, the amount of disk required depends on how many obsolete operations accumulate between manual compactions.
 
 2.  Work with your host system administrators to determine where to place your disk store directories, based on your anticipated disk storage requirements and the available disks on your host systems.
     -   Make sure the new storage does not interfere with other processes that use disk on your systems. If possible, store your files to disks that are not used by other processes, including virtual memory or swap space. If you have multiple disks available, for the best performance, place one directory on each disk.
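
Worked with placeholder numbers, assuming 10 GB of stored data: the default compaction-threshold of 50 gives an upper bound of roughly twice the data size, while a lower threshold of 25 lets more garbage accumulate before compaction and raises the bound to four times the data size:

``` pre
(1 / (50/100)) * 10 GB  =  2 * 10 GB  =  20 GB
(1 / (25/100)) * 10 GB  =  4 * 10 GB  =  40 GB
```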
diff --git a/geode-docs/tools_modules/gfsh/command-pages/compact.html.md.erb b/geode-docs/tools_modules/gfsh/command-pages/compact.html.md.erb
index 400103d..870307d 100644
--- a/geode-docs/tools_modules/gfsh/command-pages/compact.html.md.erb
+++ b/geode-docs/tools_modules/gfsh/command-pages/compact.html.md.erb
@@ -37,7 +37,7 @@ Compact a disk store on all members with that disk store.
 
 This command uses the compaction threshold that each member has configured for its disk stores. The disk store must have the `allow-force-compaction` property set to `true`.
 
-See [Running Compaction on Disk Store Log Files](../../../managing/disk_storage/compacting_disk_stores.html#compacting_disk_stores) for more information.
+See [Running Compaction on Disk Store Log Files](../../../managing/disk_storage/compacting_disk_stores.html) for more information.
 
 **Availability:** Online. You must be connected in `gfsh` to a JMX Manager member to use this command.
 
@@ -47,15 +47,13 @@ See [Running Compaction on Disk Store Log Files](../../../managing/disk_storage/
 compact disk-store --name=value [--groups=value(,value)*]
 ```
 
-<a id="topic_F113C95C076F424E9AA8AC4F1F6324CC__table_7039256EA2014AE5BFAB63697FF35AB6"></a>
+**Parameters, compact disk-store**
 
 | Name                                          | Description                                                                                                                  |
 |-----------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
 | <span class="keyword parmname">\\-\\-name</span>  | *Required.* Name of the disk store to be compacted.                                                                          |
 | <span class="keyword parmname">\\-\\-groups</span> | Group(s) of members that perform disk compaction. If no group is specified, then the disk store is compacted by all members. |
 
-<span class="tablecap">Table 1. Compact Disk-Store Parameters</span>
-
 **Example Commands:**
 
 ``` pre
@@ -79,7 +77,7 @@ Compact an offline disk store.
 
 If the disk store is large, you may need to allocate additional memory to the process by using the `--J=-XmxNNNm` parameter.
 
-See [Running Compaction on Disk Store Log Files](../../../managing/disk_storage/compacting_disk_stores.html#compacting_disk_stores) for more information.
+See [Running Compaction on Disk Store Log Files](../../../managing/disk_storage/compacting_disk_stores.html) for more information.
 
 **Note:**
 Do not perform offline compaction on the baseline directory of an incremental backup.
@@ -93,7 +91,7 @@ compact offline-disk-store --name=value --disk-dirs=value(,value)*
 [--max-oplog-size=value] [--J=value(,value)*]
 ```
 
-<a id="topic_9CCFCB2FA2154E16BD775439C8ABC8FB__table_BDB9B26709D841F08BCD75087AF596D8"></a>
+**Parameters, compact offline-disk-store**
 
 | Name                                                   | Description                                                                                                                   | Default Value |
 |--------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------|---------------|
@@ -102,8 +100,6 @@ compact offline-disk-store --name=value --disk-dirs=value(,value)*
 | <span class="keyword parmname">\\-\\-max-oplog-size</span> | Maximum size (in megabytes) of the oplogs created by compaction.                                                              | -1            |
 | <span class="keyword parmname">\\-\\-J</span>              | Arguments passed to the Java Virtual Machine performing the compact operation on the disk store. For example: `-J=-Xmx1024m`. |               |
 
-<span class="tablecap">Table 2. Compact Offline-Disk-Store Parameters</span>
-
 **Example Commands:**
 
 ``` pre
diff --git a/geode-docs/tools_modules/gfsh/command-pages/create.html.md.erb b/geode-docs/tools_modules/gfsh/command-pages/create.html.md.erb
index 0c56b00..622421a 100644
--- a/geode-docs/tools_modules/gfsh/command-pages/create.html.md.erb
+++ b/geode-docs/tools_modules/gfsh/command-pages/create.html.md.erb
@@ -300,7 +300,7 @@ If the specified directory does not exist, the command will create the directory
 </tr>
 <tr>
 <td><span class="keyword parmname">\-\-compaction-threshold</span></td>
-<td>Percentage of garbage allowed before the disk store is eligible for compaction.</td>
+<td>Percentage of non-garbage remaining, below which the disk store is eligible for compaction.</td>
 <td>50</td>
 </tr>
 <tr>
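
For illustration, a minimal gfsh sketch of the `--compaction-threshold` parameter described above; the store name, directory, and the value 40 are placeholders:

``` pre
create disk-store --name=exampleDiskStore --dir=exampleDiskStoreDir --compaction-threshold=40 --auto-compact=true
```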