Posted to notifications@couchdb.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/03/31 18:12:25 UTC

[jira] [Commented] (COUCHDB-2979) Replicator manager attempts to checkpoint too frequently

    [ https://issues.apache.org/jira/browse/COUCHDB-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220104#comment-15220104 ] 

ASF GitHub Bot commented on COUCHDB-2979:
-----------------------------------------

GitHub user nickva opened a pull request:

    https://github.com/apache/couchdb-couch-replicator/pull/34

    Better handling of multiple concurrent replications

    * Reduce checkpoint frequency from 5 to 30 seconds
    
    * Avoid calling replicator manager for ownership checks. During each attempted checkpoint, every replication made a gen_server call to the replicator manager to decide whether it should still be running on that node. Call the ownership function directly instead.
    
    
    COUCHDB-2979

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-replicator couch-2979

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-replicator/pull/34.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #34
    
----
commit 43f3c720811be0ad757cd9b9b823b30104b6913c
Author: Nick Vatamaniuc <va...@gmail.com>
Date:   2016-03-31T15:50:53Z

    Reduce checkpoint frequency from 5 to 30 seconds
    
    Use a macro to avoid hard-coding magic number
    in two places.
    
    COUCHDB-2979

commit 1a2eb18f0ed1a2dbca6dd6b346ae7e0f96deed95
Author: Nick Vatamaniuc <va...@gmail.com>
Date:   2016-03-31T15:58:01Z

    Avoid calling replicator manager for ownership checks
    
    During each attempted checkpoint, every replication
    made a gen_server call to the replicator manager to
    decide if it should still be running on that node.
    
    The call is generally fast (10s to 100s of microseconds),
    however if replication manager is blocked,
    for example, fetching large filter functions,
    none of the replications could checkpoint &
    make progress.
    
    Now replications call the ownership function
    without going through a gen_server call.
    
    COUCHDB-2979

----
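
The effect of the second commit can be illustrated with a minimal Python sketch. This is not the Erlang code from the pull request: `Manager`, `call_owner`, `owner` and the hash-based ownership rule are hypothetical stand-ins, and a lock is used to model a single process that serves one request at a time. It only shows why a synchronous call stalls every caller when the serving process is busy, while a plain function does not.

```python
import hashlib
import threading


def owner(rep_id, nodes):
    """Pure ownership function: deterministically pick one node.

    The hashing rule is an assumption made for this sketch only.
    """
    ordered = sorted(nodes)
    digest = hashlib.sha256(rep_id.encode("utf-8")).digest()
    return ordered[int.from_bytes(digest[:8], "big") % len(ordered)]


class Manager:
    """Hypothetical stand-in for a process that serves one request at a time."""

    def __init__(self, nodes):
        self.nodes = nodes
        # Held while the manager is doing slow work (e.g. a large fetch).
        self.busy = threading.Lock()

    def call_owner(self, rep_id, timeout):
        # Models a synchronous call: the caller waits for its turn
        # and gives up after `timeout` seconds.
        if not self.busy.acquire(timeout=timeout):
            raise TimeoutError("manager busy")
        try:
            return owner(rep_id, self.nodes)
        finally:
            self.busy.release()


def should_run_here(rep_id, nodes, this_node):
    # Direct check: no dependency on the manager being responsive.
    return owner(rep_id, nodes) == this_node
```

In this sketch, while `Manager.busy` is held, `call_owner` raises `TimeoutError` for every caller, whereas `should_run_here` still answers immediately because it depends only on its arguments.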


> Replicator manager attempts to checkpoint too frequently
> --------------------------------------------------------
>
>                 Key: COUCHDB-2979
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2979
>             Project: CouchDB
>          Issue Type: Bug
>            Reporter: Nick Vatamaniuc
>
> The current checkpoint interval is 5 seconds. That works well for a few replications, but with thousands of them it amounts to a checkpoint attempt every few milliseconds or so.
> Moreover, to decide on ownership (in order to keep one replication running per cluster), each replication makes a gen_server call to the replicator manager during an attempted checkpoint. Those calls are usually fast (I benchmarked them at 100-200 usec); however, if the replicator manager is busy (say, stuck fetching large filter documents when computing replication ids), none of the replications would be able to checkpoint and make progress.
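
The "every few milliseconds" figure in the issue description follows from simple arithmetic, sketched below in Python. The helper name `mean_gap_ms` and the count of 2000 replications are assumptions chosen for illustration, and the calculation assumes checkpoint attempts are spread evenly across the interval.

```python
def mean_gap_ms(interval_s, replications):
    """Average time between checkpoint attempts across all replications,
    in milliseconds, assuming attempts are evenly spread over the interval."""
    return interval_s * 1000.0 / replications


# 2000 replications is a hypothetical load, not a figure from the issue.
gap_at_5s = mean_gap_ms(5, 2000)    # 2.5 ms between attempts
gap_at_30s = mean_gap_ms(30, 2000)  # 15.0 ms between attempts
```

Under these assumptions, moving from a 5 second to a 30 second interval makes the attempts six times less frequent for the same number of replications.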



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)