Posted to dev@couchdb.apache.org by "Filipe Manana (JIRA)" <ji...@apache.org> on 2011/08/16 03:30:29 UTC

[jira] [Updated] (COUCHDB-1153) Database and view index compaction daemon

     [ https://issues.apache.org/jira/browse/COUCHDB-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Filipe Manana updated COUCHDB-1153:
-----------------------------------

    Description: 
I've recently written an Erlang process that automatically compacts databases and their views based on some configurable parameters. These parameters can be global or per database and are: minimum database fragmentation, minimum view fragmentation, allowed period, and "strict_window" (whether an ongoing compaction should be canceled if it doesn't finish within the allowed period). The fragmentation values are based on the "data_size" parameter recently added to the database and view group information URIs (COUCHDB-1132).

I've documented the .ini configuration as a comment in default.ini, which I paste here:

[compaction_daemon]
; The delay, in seconds, between checks for which databases and view indexes
; need to be compacted.
check_interval = 60
; If a database or view index file is smaller than this value (in bytes),
; compaction will not happen. Very small files always have very high
; fragmentation, so compacting them is not worthwhile.
min_file_size = 131072

[compactions]
; List of compaction rules for the compaction daemon.
; The daemon compacts databases and their respective view groups when all the
; condition parameters are satisfied. Configuration can be per database or
; global, and it has the following format:
;
; database_name = parameter=value [, parameter=value]*
; _default = parameter=value [, parameter=value]*
;
; Possible parameters:
;
; * db_fragmentation - If the ratio (as an integer percentage) of the amount
;                      of old data (and its supporting metadata) over the database
;                      file size is equal to or greater than this value, the
;                      database compaction condition is satisfied.
;                      This value is computed as:
;
;                           (file_size - data_size) / file_size * 100
;
;                      The data_size and file_size values can be obtained when
;                      querying a database's information URI (GET /dbname/).
;
; * view_fragmentation - If the ratio (as an integer percentage) of the amount
;                        of old data (and its supporting metadata) over the view
;                        index (view group) file size is equal to or greater than
;                        this value, the view index compaction condition is
;                        satisfied. This value is computed as:
;
;                            (file_size - data_size) / file_size * 100
;
;                        The data_size and file_size values can be obtained when
;                        querying a view group's information URI
;                        (GET /dbname/_design/groupname/_info).
;
; * period - The period during which compaction of a database (and its view
;            groups) is allowed. This value must have the following format:
;
;                HH:MM - HH:MM  (HH in [0..23], MM in [0..59])
;
; * strict_window - If a compaction is still running after the end of the allowed
;                   period, it will be canceled if this parameter is set to "yes".
;                   It defaults to "no" and it's meaningful only if the *period*
;                   parameter is also specified.
;
; * parallel_view_compaction - If set to "yes", the database and its views are
;                              compacted in parallel. This is only useful on
;                              certain setups, for example when the database
;                              and view index directories point to different
;                              disks. It defaults to "no".
;
; Before a compaction is triggered, an estimate of how much free disk space is
; needed is computed. This estimate corresponds to 2 times the data size of
; the database or view index. When there's not enough free disk space to compact
; a particular database or view index, a warning message is logged.
;
; Examples:
;
; 1) foo = db_fragmentation = 70%, view_fragmentation = 60%
;    The `foo` database is compacted if its fragmentation is 70% or more.
;    Any view index of this database is compacted only if its fragmentation
;    is 60% or more.
;
; 2) foo = db_fragmentation = 70%, view_fragmentation = 60%, period = 00:00-04:00
;    Similar to the preceding example but a compaction (database or view index)
;    is only triggered if the current time is between midnight and 4 AM.
;
; 3) foo = db_fragmentation = 70%, view_fragmentation = 60%, period = 00:00-04:00, strict_window = yes
;    Similar to the preceding example - a compaction (database or view index)
;    is only triggered if the current time is between midnight and 4 AM. If at
;    4 AM the database or one of its views is still compacting, the compaction
;    process will be canceled.
;
;_default = db_fragmentation = 70%, view_fragmentation = 60%, period = 23:00 - 04:00


(from https://github.com/fdmanana/couchdb/compare/compaction_daemon#L0R195)
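
As a rough illustration of the fragmentation conditions, the min_file_size cutoff, and the free-space estimate described in the comment above, here is a minimal Python sketch. It is not taken from the patch (which is Erlang). The function names are hypothetical, and truncating the percentage to an integer is an assumption, since the comment only says the ratio is "an integer percentage".

```python
# Hypothetical helpers illustrating the rules in the default.ini comment.
# They are NOT the daemon's code; names and rounding are assumptions.

DEFAULT_MIN_FILE_SIZE = 131072  # bytes, the min_file_size value shown above


def fragmentation(file_size: int, data_size: int) -> int:
    """(file_size - data_size) / file_size * 100, truncated to an integer."""
    if file_size <= 0:
        return 0
    return (file_size - data_size) * 100 // file_size


def condition_satisfied(file_size: int, data_size: int, threshold: int,
                        min_file_size: int = DEFAULT_MIN_FILE_SIZE) -> bool:
    """True if the file is large enough and fragmentation >= threshold."""
    if file_size < min_file_size:
        # Files smaller than min_file_size are never compacted.
        return False
    return fragmentation(file_size, data_size) >= threshold


def enough_disk_space(data_size: int, free_bytes: int) -> bool:
    """The comment estimates the space needed as 2 times the data size."""
    return free_bytes >= 2 * data_size
```

With these helpers, a 1,000,000-byte database file holding 300,000 bytes of live data has a fragmentation of 70, so a rule with db_fragmentation = 70% would be satisfied, while the same ratio on a 1,000-byte file would be skipped because of min_file_size.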

The full patch is mostly a new module, but it also makes a few minimal changes and a small refactoring to the view compaction code, without changing the current behaviour.
Patch is at:

https://github.com/fdmanana/couchdb/compare/compaction_daemon.patch

By default the daemon is idle, without any configuration enabled. I'm open to suggestions on additional parameters and a better configuration system.
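
For anyone weighing the configuration format, the following Python sketch shows one way the rule syntax from the [compactions] comment could be parsed and the *period* window evaluated. It is an illustration only, not the patch's Erlang implementation. The helper names are hypothetical, and two behaviours are assumptions: the end of the period is treated as exclusive, and a period whose start is later than its end (such as 23:00 - 04:00) is treated as wrapping past midnight.

```python
import re
from datetime import time

# HH:MM - HH:MM, with optional spaces around the dash.
PERIOD_RE = re.compile(r"^(\d{1,2}):(\d{2})\s*-\s*(\d{1,2}):(\d{2})$")


def parse_rule(value: str) -> dict:
    """Parse 'parameter = value, parameter = value' into a dict (hypothetical)."""
    rule = {}
    for part in value.split(","):
        key, _, raw = part.partition("=")
        key, raw = key.strip(), raw.strip()
        if key in ("db_fragmentation", "view_fragmentation"):
            rule[key] = int(raw.rstrip("%"))
        elif key == "period":
            match = PERIOD_RE.match(raw)
            if match is None:
                raise ValueError(f"bad period: {raw!r}")
            h1, m1, h2, m2 = map(int, match.groups())
            # time() itself rejects HH outside 0..23 and MM outside 0..59.
            rule[key] = (time(h1, m1), time(h2, m2))
        elif key in ("strict_window", "parallel_view_compaction"):
            rule[key] = raw == "yes"
        else:
            raise ValueError(f"unknown parameter: {key!r}")
    return rule


def in_period(now: time, period: tuple) -> bool:
    """True if `now` falls inside the allowed window (end exclusive, assumed)."""
    start, end = period
    if start <= end:
        return start <= now < end
    # Assumed wrap-around case, e.g. 23:00 - 04:00.
    return now >= start or now < end
```

For example, parsing "db_fragmentation = 70%, view_fragmentation = 60%, period = 23:00 - 04:00" yields thresholds of 70 and 60 plus a window that, under the assumptions above, contains 23:30 and 03:59 but not 04:00 or noon.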

  was:
I've recently written an Erlang process to automatically compact databases and their views based on some configurable parameters. These parameters can be global or per database and are: minimum database fragmentation, minimum view fragmentation, allowed period and "abortion" (whether an ongoing compaction should be stopped if it doesn't finish within the allowed period). These fragmentation values are based on the recently added "data_size" parameter to the database and view group information URIs (COUCHDB-1132).

I've documented the .ini configuration here:  https://github.com/fdmanana/couchdb/compare/compaction_daemon#diff-0

The full patch is mostly a new module but also does some minimal changes and a small refactoring to the view compaction code, not changing the current behaviour.
Patch is at:

https://github.com/fdmanana/couchdb/compare/compaction_daemon

By default the daemon is idle, without any configuration enabled. I'm open to suggestions on additional parameters and a better configuration system.


> Database and view index compaction daemon
> -----------------------------------------
>
>                 Key: COUCHDB-1153
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1153
>             Project: CouchDB
>          Issue Type: New Feature
>         Environment: trunk
>            Reporter: Filipe Manana
>            Assignee: Filipe Manana
>            Priority: Minor
>              Labels: compaction

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira