Posted to dev@couchdb.apache.org by "Alexander Shorin (JIRA)" <ji...@apache.org> on 2013/06/19 11:30:21 UTC
[jira] [Resolved] (COUCHDB-1153) Database and view index compaction daemon
[ https://issues.apache.org/jira/browse/COUCHDB-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Shorin resolved COUCHDB-1153.
---------------------------------------
Resolution: Fixed
Fix Version/s: 1.2
Fixed in [ac0946a|https://git-wip-us.apache.org/repos/asf?p=couchdb.git;a=commit;h=ac0946a]
> Database and view index compaction daemon
> -----------------------------------------
>
> Key: COUCHDB-1153
> URL: https://issues.apache.org/jira/browse/COUCHDB-1153
> Project: CouchDB
> Issue Type: New Feature
> Environment: trunk
> Reporter: Filipe Manana
> Assignee: Filipe Manana
> Priority: Minor
> Labels: compaction
> Fix For: 1.2
>
>
> I've recently written an Erlang process to automatically compact databases and their views based on some configurable parameters. These parameters can be global or per database and are: minimum database fragmentation, minimum view fragmentation, allowed period and "strict_window" (whether an ongoing compaction should be canceled if it doesn't finish within the allowed period). The fragmentation values are based on the "data_size" parameter recently added to the database and view group information URIs (COUCHDB-1132).
> I've documented the .ini configuration, as a comment in default.ini, which I paste here:
> [compaction_daemon]
> ; The delay, in seconds, between each check for which databases and view indexes
> ; need to be compacted.
> check_interval = 60
> ; If a database or view index file is smaller than this value (in bytes),
> ; compaction will not happen. Very small files always have a very high
> ; fragmentation, therefore it is not worth compacting them.
> min_file_size = 131072
> [compactions]
> ; List of compaction rules for the compaction daemon.
> ; The daemon compacts databases and their respective view groups when all the
> ; condition parameters are satisfied. Configuration can be per database or
> ; global, and it has the following format:
> ;
> ; database_name = parameter=value [, parameter=value]*
> ; _default = parameter=value [, parameter=value]*
> ;
> ; Possible parameters:
> ;
> ; * db_fragmentation - If the ratio (as an integer percentage) of the amount
> ; of old data (and its supporting metadata) over the database
> ; file size is equal to or greater than this value, this
> ; database compaction condition is satisfied.
> ; This value is computed as:
> ;
> ; (file_size - data_size) / file_size * 100
> ;
> ; The data_size and file_size values can be obtained when
> ; querying a database's information URI (GET /dbname/).
> ;
> ; * view_fragmentation - If the ratio (as an integer percentage) of the amount
> ; of old data (and its supporting metadata) over the view
> ; index (view group) file size is equal to or greater than
> ; this value, then this view index compaction condition is
> ; satisfied. This value is computed as:
> ;
> ; (file_size - data_size) / file_size * 100
> ;
> ; The data_size and file_size values can be obtained when
> ; querying a view group's information URI
> ; (GET /dbname/_design/groupname/_info).
> ;
> ; * period - The period for which a database (and its view groups) compaction
> ; is allowed. This value must obey the following format:
> ;
> ; HH:MM - HH:MM (HH in [0..23], MM in [0..59])
> ;
> ; * strict_window - If a compaction is still running after the end of the allowed
> ; period, it will be canceled if this parameter is set to "yes".
> ; It defaults to "no" and it's meaningful only if the *period*
> ; parameter is also specified.
> ;
> ; * parallel_view_compaction - If set to "yes", the database and its views are
> ; compacted in parallel. This is only useful on
> ; certain setups, like for example when the database
> ; and view index directories point to different
> ; disks. It defaults to "no".
> ;
> ; Before a compaction is triggered, an estimation of how much free disk space is
> ; needed is computed. This estimation corresponds to 2 times the data size of
> ; the database or view index. When there's not enough free disk space to compact
> ; a particular database or view index, a warning message is logged.
> ;
> ; Examples:
> ;
> ; 1) foo = db_fragmentation = 70%, view_fragmentation = 60%
> ; The `foo` database is compacted if its fragmentation is 70% or more.
> ; Any view index of this database is compacted only if its fragmentation
> ; is 60% or more.
> ;
> ; 2) foo = db_fragmentation = 70%, view_fragmentation = 60%, period = 00:00-04:00
> ; Similar to the preceding example but a compaction (database or view index)
> ; is only triggered if the current time is between midnight and 4 AM.
> ;
> ; 3) foo = db_fragmentation = 70%, view_fragmentation = 60%, period = 00:00-04:00, strict_window = yes
> ; Similar to the preceding example - a compaction (database or view index)
> ; is only triggered if the current time is between midnight and 4 AM. If at
> ; 4 AM the database or one of its views is still compacting, the compaction
> ; process will be canceled.
> ;
> ;_default = db_fragmentation = 70%, view_fragmentation = 60%, period = 23:00 - 04:00
> (from https://github.com/fdmanana/couchdb/compare/compaction_daemon#L0R195)
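As a rough illustration of the rules quoted above, here is a minimal Python sketch (not the daemon's Erlang code) of the fragmentation formula, the min_file_size cutoff, and the "2 times the data size" free-space estimate. The function names, the use of round() to obtain an integer percentage, and the dict standing in for the GET /dbname/ response are assumptions made for this example.

```python
def fragmentation(file_size, data_size):
    """(file_size - data_size) / file_size * 100, as an integer percentage.

    Rounding with round() is an assumption; the quoted text only says
    "integer percentage".
    """
    if file_size <= 0:
        return 0
    return round((file_size - data_size) / file_size * 100)


def parse_rule(rule):
    """Split 'param = value, param = value' into a dict of strings."""
    params = {}
    for part in rule.split(","):
        key, _, value = part.partition("=")
        params[key.strip()] = value.strip()
    return params


def db_condition_met(info, rule, min_file_size=131072):
    """True if the db_fragmentation condition of a parsed rule is satisfied.

    `info` is a hypothetical dict holding the file_size and data_size
    fields of the database information document.
    """
    if info["file_size"] < min_file_size:
        # Files below min_file_size are never compacted.
        return False
    threshold = int(rule["db_fragmentation"].rstrip("%"))
    return fragmentation(info["file_size"], info["data_size"]) >= threshold


def has_enough_space(data_size, free_bytes):
    """Free-space estimate from the quoted text: 2 times the data size."""
    return free_bytes >= 2 * data_size


rule = parse_rule("db_fragmentation = 70%, view_fragmentation = 60%")
info = {"file_size": 1000000, "data_size": 250000}
print(fragmentation(info["file_size"], info["data_size"]))
print(db_condition_met(info, rule))
```

The view_fragmentation condition would follow the same shape, using the file_size and data_size values from the view group's _info URI.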
> The full patch is mostly a new module, but it also makes some minimal changes and a small refactoring to the view compaction code, without changing the current behaviour.
> Patch is at:
> https://github.com/fdmanana/couchdb/compare/compaction_daemon.patch
> By default the daemon is idle, without any configuration enabled. I'm open to suggestions on additional parameters and a better configuration system.
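For the period and strict_window parameters described in the quoted configuration, a minimal Python sketch of how such a window might be evaluated follows. It is not the daemon's Erlang code: the function names are hypothetical, and treating the start as inclusive, the end as exclusive, and a period such as 23:00 - 04:00 as wrapping past midnight are assumptions made for this example.

```python
from datetime import time


def parse_period(period):
    """Parse 'HH:MM - HH:MM' (spaces optional) into two datetime.time values."""
    start_text, end_text = (p.strip() for p in period.split("-"))

    def to_time(text):
        hours, minutes = text.split(":")
        return time(int(hours), int(minutes))

    return to_time(start_text), to_time(end_text)


def in_period(now, period):
    """True if `now` falls inside the allowed period.

    Assumption: start is inclusive and end is exclusive. A period whose
    start is later than its end is treated as wrapping past midnight.
    """
    start, end = parse_period(period)
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_cancel(now, period, strict_window="no"):
    """With strict_window = yes, a running compaction is canceled once
    the current time is outside the allowed period."""
    return strict_window == "yes" and not in_period(now, period)


print(in_period(time(2, 30), "00:00-04:00"))
print(in_period(time(23, 30), "23:00 - 04:00"))
print(should_cancel(time(4, 5), "00:00-04:00", "yes"))
```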
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira