Posted to commits@couchdb.apache.org by ko...@apache.org on 2019/09/11 17:48:39 UTC
[couchdb-documentation] 01/02: Add documentation for smoosh, including migration

This is an automated email from the ASF dual-hosted git repository.

kocolosk pushed a commit to branch smoosh-documentation
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git

commit 7d55002b46eda75d432edf3c9e96a00d5fd9db9a
Author: Adam Kocoloski <ko...@apache.org>
AuthorDate: Thu Sep 5 15:55:21 2019 -0400

    Add documentation for smoosh, including migration
---
 src/config/compaction.rst      | 197 +++++++++++++++++------------------------
 src/maintenance/compaction.rst | 158 ++++++++++++++++++++++++++++++---
 2 files changed, 231 insertions(+), 124 deletions(-)

diff --git a/src/config/compaction.rst b/src/config/compaction.rst
index e9f2623..e2f147c 100644
--- a/src/config/compaction.rst
+++ b/src/config/compaction.rst
@@ -39,160 +39,129 @@ Database Compaction Options
             [database_compaction]
             checkpoint_after = 5242880
 
-.. _config/compactions:
-
-Compaction Daemon Rules
-=======================
-
-.. config:section:: compactions :: Compaction Daemon Rules
-
-    A list of rules to determine when to run automatic compaction. The
-    :option:`daemons/compaction_daemon` compacts databases and their respective
-    view groups when all the condition parameters are satisfied. Configuration
-    can be per-database or global, and it has the following format::
-
-        [compactions]
-        database_name = [ {ParamName, ParamValue}, {ParamName, ParamValue}, ... ]
-        _default = [ {ParamName, ParamValue}, {ParamName, ParamValue}, ... ]
-
-    For example::
-
-      [compactions]
-      _default = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "23:00"}, {to, "04:00"}]
-
-    - ``db_fragmentation``: If the ratio of legacy data, including metadata, to
-      current data in the database file size is equal to or greater than this
-      value, this condition is satisfied. The percentage is expressed as an
-      integer percentage. This value is computed as:
-
-      .. code-block:: none
+.. _config/view_compaction:
 
-          (sizes.disk - sizes.active) / sizes.disk * 100
+View Compaction Options
+========================
 
-      The sizes.active and sizes.disk values can be obtained when
-      querying :http:get:`/{db}`.
+.. config:section:: view_compaction :: Views Compaction Options
 
-    - ``view_fragmentation``: If the ratio of legacy data, including metadata,
-      to current data in a view index file size is equal to or greater then
-      this value, this database compaction condition is satisfied. The
-      percentage is expressed as an integer percentage. This value is computed
-      as:
+    .. config:option:: keyvalue_buffer_size :: Key-Values buffer size
 
-      .. code-block:: none
+        Specifies maximum copy buffer size in bytes used during compaction::
 
-          (sizes.disk - sizes.active) / sizes.disk * 100
+            [view_compaction]
+            keyvalue_buffer_size = 2097152
 
-      The sizes.active and sizes.disk values can be obtained when querying a
-      :ref:`view group's information URI <api/ddoc/info>`.
+.. _config/compactions:
 
-    - ``from`` and ``to``: The period for which a database (and its view group)
-      compaction is allowed. The value for these parameters must obey the
-      format:
+Compaction Daemon
+=================
 
-      .. code-block:: none
+CouchDB ships with an automated, event-driven daemon that continuously
+re-prioritizes the database and secondary index files on each node and
+automatically compacts the files that will recover the most free space according
+to the following parameters.
 
-          HH:MM - HH:MM  (HH in [0..23], MM in [0..59])
+.. config:section:: smoosh :: Compaction Daemon Rules
 
-    - ``strict_window``: If a compaction is still running after the end of the
-      allowed period, it will be canceled if this parameter is set to `true`.
-      It defaults to `false` and is meaningful only if the *period* parameter
-      is also specified.
+    .. config:option:: db_channels :: Active database channels
 
-    - ``parallel_view_compaction``: If set to `true`, the database and its
-      views are compacted in parallel. This is only useful on certain setups,
-      like for example when the database and view index directories point to
-      different disks. It defaults to `false`.
+        A comma-delimited list of channels that are sent the names of database
+        files when those files are updated. Each channel can choose whether to
+        enqueue the database for compaction; once a channel has enqueued the
+        database, no additional channel in the list will be given the
+        opportunity to do so.
 
-    Before a compaction is triggered, an estimation of how much free disk space
-    is needed is computed. This estimation corresponds to two times the data
-    size of the database or view index. When there's not enough free disk space
-    to compact a particular database or view index, a warning message is
-    logged.
+    .. config:option:: view_channels :: Active secondary index channels
 
-    Examples:
+        A comma-delimited list of channels that are sent the names of secondary
+        index files when those files are updated. Each channel can choose
+        whether to enqueue the index for compaction; once a channel has enqueued
+        the index, no additional channel in the list will be given the
+        opportunity to do so.
 
-    #.
-        ::
+    .. config:option:: staleness :: Minimum time between priority calculations
 
-            [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}]
+        The number of minutes that the (expensive) priority calculation on an
+        individual file can be stale before it is recalculated. Defaults to 5.
 
-       The `foo` database is compacted if its fragmentation is 70% or more. Any
-       view index of this database is compacted only if its fragmentation is
-       60% or more.
+    .. config:option:: cleanup_index_files :: Automatically delete orphaned index files
 
-    #.
-        ::
+        If set to true, the compaction daemon will delete the files for indexes
+        that are no longer associated with any design document. Defaults to
+        `false` and probably shouldn't be changed unless the node is running low
+        on disk space, and only after considering the ramifications.
 
-            [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "00:00"}, {to, "04:00"}]
+    .. config:option:: wait_secs :: Warmup period before triggering first compaction
 
-       Similar to the preceding example but a compaction (database or view
-       index) is only triggered if the current time is between midnight and 4
-       AM.
+        The time a channel waits before starting compactions to allow time to
+        observe the system and make a smarter decision about what to compact
+        first. Hardly ever changed from the default of 30 (seconds).
 
-    #.
-        ::
+.. config:section:: smoosh.<channel> :: Per-channel configuration
 
-            [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "00:00"}, {to, "04:00"}, {strict_window, true}]
+The following settings control the resource allocation for a given compaction
+channel.
 
-       Similar to the preceding example - a compaction (database or view index)
-       is only triggered if the current time is between midnight and 4 AM. If
-       at 4 AM the database or one of its views is still compacting, the
-       compaction process will be canceled.
+    .. config:option:: capacity
 
-    #.
-        ::
+        The maximum number of items the channel can hold (lowest priority item
+        is removed to make room for new items). Defaults to 9999.
 
-            [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "00:00"}, {to, "04:00"}, {strict_window, true}, {parallel_view_compaction, true}]
+    .. config:option:: concurrency
 
-       Similar to the preceding example, but a database and its views can be
-       compacted in parallel.
+        The maximum number of jobs that can run concurrently in this channel.
+        Defaults to 1.
 
-.. _config/compaction_daemon:
+    .. config:option:: from
 
-Configuration of Compaction Daemon
-==================================
+    .. config:option:: to
 
-.. config:section:: compaction_daemon :: Configuration of Compaction Daemon
+        The time period during which this channel is allowed to execute
+        compactions. The value for each of these parameters must obey the format
+        `HH:MM` with HH in [0..23] and MM in [0..59]. Each channel listed in the
+        top-level daemon configuration continuously builds its priority queue
+        regardless of the period defined here. The default is to allow the
+        channel to execute compactions all the time.
 
-    .. config:option:: check_interval
+    .. config:option:: strict_window
 
-        The delay, in seconds, between each check for which database and view
-        indexes need to be compacted. In other words, this delay will occur
-        after *all* databases and views are compacted (or at least checked)::
+        If set to `true`, any compaction that is still running after the end of
+        the allowed period will be suspended, and then resumed during the next
+        window. It defaults to `false`, in which case any running compactions
+        will be allowed to finish, but no new ones will be started.
 
-            [compaction_daemon]
-            check_interval = 3600
+There are also several settings that collectively control whether a channel will
+enqueue a file for compaction and how it prioritizes files within its queue:
 
-    .. config:option:: min_file_size
+    .. config:option:: max_priority
 
-        If a database or view index file is smaller than this value (in bytes),
-        compaction will not happen. Very small files always have high
-        fragmentation, so compacting them is inefficient. ::
+        Each item must have a priority lower than this to be enqueued. Defaults
+        to infinity.
 
-            [compaction_daemon]
-            min_file_size = 131072
+    .. config:option:: max_size
 
-    .. config:option:: snooze_period_ms
+        The item must be no larger than this many bytes in length to be
+        enqueued. Defaults to infinity.
 
-        With lots of databases and/or with lots of design docs in one or more
-        databases, the compaction_daemon can create significant CPU load when
-        checking whether databases and view indexes need compacting. The
-        ``snooze_period_ms`` setting ensures a smoother CPU load. Defaults to
-        3000 milliseconds wait.
+    .. config:option:: min_priority
 
-            [compaction_daemon]
-            snooze_period_ms = 3000
+        The item must have a priority at least this high to be enqueued.
+        Defaults to 5.0 for ratio and 16 MB for slack.
 
-.. _config/view_compaction:
+    .. config:option:: min_changes
 
-Views Compaction Options
-========================
+        The minimum number of changes since last compaction before the item will
+        be enqueued. Defaults to 0. Currently only works for databases.
 
-.. config:section:: view_compaction :: Views Compaction Options
+    .. config:option:: min_size
 
-    .. config:option:: keyvalue_buffer_size :: Key-Values buffer size
+        The item must be at least this many bytes in length to be enqueued.
+        Defaults to 1 MB (1048576 bytes).
 
-        Specifies maximum copy buffer size in bytes used during compaction::
+    .. config:option:: priority
 
-            [view_compaction]
-            keyvalue_buffer_size = 2097152
+        The method used to calculate priority. Can be ratio (calculated as
+        `sizes.file/sizes.active`) or slack (calculated as `sizes.file -
+        sizes.active`). Defaults to ratio.
diff --git a/src/maintenance/compaction.rst b/src/maintenance/compaction.rst
index cd89fe5..7692385 100644
--- a/src/maintenance/compaction.rst
+++ b/src/maintenance/compaction.rst
@@ -43,7 +43,7 @@ resolution during replication. The number of stored revisions
 (and their `tombstones`) can be configured by using the :get:`_revs_limit
 </{db}/_revs_limit>` URL endpoint.
 
-Compaction is manually triggered operation per database and runs as a background
+Compaction can be manually triggered per database and runs as a background
 task. To start it for specific database there is need to send HTTP
 :post:`/{db}/_compact` sub-resource of the target database::
 
@@ -180,13 +180,151 @@ exist anymore) you can trigger a :ref:`view cleanup <api/db/view_cleanup>`::
 Automatic Compaction
 ====================
 
-While both :ref:`database <compact/db>` and :ref:`views <compact/views>`
-compactions are required be manually triggered, it is also possible to configure
-automatic compaction, so that compaction of databases and views is automatically
-triggered based on various criteria. Automatic compaction is configured in
-CouchDB's :ref:`configuration files <config/intro>`.
+CouchDB's automatic compaction daemon, internally known as "smoosh", will
+trigger compaction jobs for both databases and views based on configurable
+thresholds for the sparseness of a file and the total amount of space that can
+be recovered.
 
-The :config:option:`daemons/compaction_daemon` is responsible for triggering
-the compaction. It is enabled by default and automatically started.
-The criteria for triggering the compactions is configured in the
-:config:section:`compactions` section.
+Channels
+--------
+
+Smoosh works using the concept of channels. A channel is essentially a queue of
+pending compactions. There are separate sets of active channels for databases
+and views. Each channel is assigned a configuration which defines whether a
+compaction ends up in the channel's queue and how compactions are prioritized
+within that queue.
+
+Smoosh takes each channel and works through the compactions queued in each in
+priority order. Each channel is processed concurrently, so the priority levels
+only matter within a given channel. Each channel has an assigned number of
+active compactions, which defines how many compactions happen for that channel
+in parallel. For example, a cluster with a lot of database churn but few views
+might require more active compactions in the database channel(s).
+
+It's important to remember that a channel is local to a CouchDB node; that is,
+each node maintains and processes an independent set of compactions. Channels
+are defined as either "ratio" channels or "slack" channels, depending on the
+type of algorithm used for prioritization:
+
+-   Ratio: uses the ratio of sizes.file / sizes.active as its driving
+    calculation. The result X must be greater than some configurable value Y for
+    a compaction to be added to the queue. Compactions with higher values of X
+    are prioritized first.
+
+-   Slack: uses the difference of sizes.file - sizes.active as its driving
+    calculation. The result X must be greater than some configurable value Y for
+    a compaction to be added to the queue. Compactions with higher values of X
+    are prioritized first.
+
+In both cases, Y is set using the `min_priority` configuration variable. CouchDB
+ships with four such channels pre-configured: one channel of each type for
+databases, and one channel of each type for views.
+
+Channel Configuration
+---------------------
+
+Channels are defined using `[smoosh.<channel_name>]` configuration blocks, and
+activated by naming the channel in the `db_channels` or `view_channels`
+configuration setting in the `[smoosh]` block. The default configuration is:
+
+.. code-block:: ini
+
+    [smoosh]
+    db_channels = upgrade_dbs,ratio_dbs,slack_dbs
+    view_channels = upgrade_views,ratio_views,slack_views
+
+    [smoosh.ratio_dbs]
+    priority = ratio
+    min_priority = 5.0
+
+    [smoosh.ratio_views]
+    priority = ratio
+    min_priority = 5.0
+
+    [smoosh.slack_dbs]
+    priority = slack
+    min_priority = 16777216
+
+    [smoosh.slack_views]
+    priority = slack
+    min_priority = 16777216
+
+The "upgrade" channels are a special pair of channels that only check whether
+the `disk_format_version` for the file matches the current version, and enqueue
+the file for compaction (which has the side effect of upgrading the file format)
+if that's not the case. There are several additional properties that can be
+configured for each channel; these are documented in the :ref:`configuration API
+<config/compactions>`.
+
+Scheduling Windows
+------------------
+
+Each compaction channel can be configured to run only during certain hours of
+the day. The channel-specific `from`, `to`, and `strict_window` configuration
+settings control this behavior. For example:
+
+.. code-block:: ini
+
+    [smoosh.overnight_channel]
+    from = 20:00
+    to = 06:00
+    strict_window = true
+
+The `strict_window` setting will cause the compaction daemon to suspend all
+active compactions in this channel when exiting the window, and resume them when
+re-entering. If `strict_window` is left at its default of false, the active
+compactions will be allowed to complete but no new compactions will be started.
+
+Migration Guide
+---------------
+
+Previous versions of CouchDB shipped with a simpler compaction daemon. The
+configuration system for the new daemon is not backwards-compatible with the old
+one, so users with customized compaction configurations will need to port them
+to the new setup. The old daemon's compaction rules configuration looked like:
+
+.. code-block:: ini
+
+    [compaction_daemon]
+    min_file_size = 131072
+    check_interval = 3600
+    snooze_period_ms = 3000
+
+    [compactions]
+    mydb = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {parallel_view_compaction, true}]
+    _default = [{db_fragmentation, "50%"}, {view_fragmentation, "55%"}, {from, "20:00"}, {to, "06:00"}, {strict_window, true}]
+
+Many of the elements of this configuration can be ported over to the new system.
+Examining each in detail:
+
+*   ``min_file_size`` is now configured on a per-channel basis using the
+    min_size config setting.
+
+*   ``db_fragmentation`` is equivalent to configuring a priority = ratio
+    channel with min_priority set to 1.0 / (1 - db_fragmentation/100)
+    and then listing that channel in the [smoosh] db_channels config
+    setting.
+
+*   ``view_fragmentation`` is likewise equivalent to configuring a priority = ratio
+    channel with min_priority set to 1.0 / (1 - view_fragmentation/100)
+    and then listing that channel in the [smoosh] view_channels config
+    setting.
+
+*   ``from`` / ``to`` / ``strict_window``: each of these settings can be applied
+    on a per-channel basis in the new daemon. The one behavior change is that
+    the new daemon will suspend compactions upon exiting the allowed window
+    instead of canceling them outright, and resume them when re-entering.
+
+*   ``parallel_view_compaction``: each compaction channel has a concurrency
+    setting that controls how many compactions will execute in parallel in that
+    channel. The total parallelism is the sum of the concurrency settings of all
+    active channels. This is a departure from the previous behavior, in which
+    the daemon would only focus on one database and/or its views (depending on
+    the value of this flag) at a time.
+
+The ``check_interval`` and ``snooze_period_ms`` settings are obsolete in the
+event-driven design of the new daemon. The new daemon does not support setting
+database-specific thresholds as in the ``mydb`` setting above. Rather, channels
+can be configured to focus on specific classes of files: large databases, small
+view indexes, and so on. Most cases of named database compaction rules can be
+expressed using properties of those databases and/or their associated views.
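
Note on the threshold conversion (not part of the commit): the migration guide
above maps an old ``db_fragmentation`` or ``view_fragmentation`` percentage to a
ratio channel's ``min_priority`` as 1.0 / (1 - fragmentation/100). The Python
sketch below works that arithmetic for the percentages used in the old example
configuration. The function name is hypothetical, and the sketch assumes the old
daemon's ``sizes.disk`` corresponds to ``sizes.file`` in the new ratio
calculation.

```python
def fragmentation_to_min_priority(fragmentation_pct: float) -> float:
    """Convert an old-style fragmentation threshold (in percent) into the
    min_priority of a `priority = ratio` channel.

    Old daemon: fragmentation = (disk - active) / disk * 100
    New daemon: ratio         = file / active
    Assuming disk == file, ratio = 1 / (1 - fragmentation / 100).
    """
    if not 0 <= fragmentation_pct < 100:
        # 100% would mean zero active data and an unbounded ratio.
        raise ValueError("fragmentation must be in the range [0, 100)")
    return 1.0 / (1.0 - fragmentation_pct / 100.0)


if __name__ == "__main__":
    # Percentages taken from the old [compactions] example in the guide.
    for pct in (50, 55, 60, 70):
        ratio = fragmentation_to_min_priority(pct)
        print(f"{pct}% fragmentation -> min_priority = {ratio:.2f}")
```

Under these assumptions, the old ``_default`` rule of 50% for databases would
correspond to ``min_priority = 2.0`` on a ratio channel listed in
``db_channels``, and 70% to roughly 3.33. Both are lower than the shipped
default of 5.0, which is equivalent to 80% fragmentation.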