Posted to notifications@couchdb.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/01/24 19:33:26 UTC

[jira] [Commented] (COUCHDB-3277) Replication manager when it finds _replicator db shards which are not part of a mem3 db

    [ https://issues.apache.org/jira/browse/COUCHDB-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836496#comment-15836496 ] 

ASF subversion and git services commented on COUCHDB-3277:
----------------------------------------------------------

Commit b281d2bb320ed6e6d8226765315a40637ba91a46 in couchdb-couch-replicator's branch refs/heads/master from [~vatamane]
[ https://git-wip-us.apache.org/repos/asf?p=couchdb-couch-replicator.git;h=b281d2b ]

Use mem3 to discover all _replicator shards in replicator manager

Previously this was done via recursive db directory traversal, looking for
shard names ending in `_replicator`. However, if there are orphaned shard
files (not associated with a clustered db), replicator manager crashes. It
restarts eventually, but as long as an orphaned shard file without an entry
in the dbs db is present on the file system, replicator manager will keep
crashing and never reach replication documents in shards which would be
traversed after the problematic shard. The user-visible effect is that some
replication documents are never triggered.

To fix, use mem3 to traverse and discover `_replicator` shards. This approach
has been used in Cloudant's production code for many years, so it is
battle-tested, and it doesn't suffer from file system vs mem3 inconsistency.
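
As a rough illustration of the difference between the two discovery strategies,
here is a minimal Python sketch (not the Erlang code from the commit). The
directory layout, the `registry` dict standing in for the dbs db, and all
function names are assumptions made up for this example; the real change goes
through mem3.

```python
import os
import tempfile

SUFFIX = "_replicator"


def db_name_of(shard_file):
    # Simplified model: "_replicator.1484000000.couch" -> "_replicator"
    return shard_file.split(".")[0]


def discover_by_traversal(root):
    """Old approach: walk the data directory and match on file names."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".couch") and db_name_of(name).endswith(SUFFIX):
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, root))
    return sorted(found)


def discover_by_registry(registry):
    """New approach: list only shards recorded in the db registry."""
    found = []
    for db_name, shard_paths in registry.items():
        if db_name.endswith(SUFFIX):
            found.extend(shard_paths)
    return sorted(found)


def demo():
    """Create one registered shard and one orphan, then run both strategies."""
    with tempfile.TemporaryDirectory() as root:
        registered = os.path.join(
            "shards", "00000000-ffffffff", "_replicator.1484000000.couch")
        orphan = os.path.join(
            "shards", "00000000-ffffffff", "old", "_replicator.1480000000.couch")
        for rel in (registered, orphan):
            full = os.path.join(root, rel)
            os.makedirs(os.path.dirname(full), exist_ok=True)
            open(full, "w").close()
        # Hypothetical registry: only the first shard has an entry.
        registry = {"_replicator": [registered]}
        return discover_by_traversal(root), discover_by_registry(registry)


if __name__ == "__main__":
    by_walk, by_registry = demo()
    print("traversal:", by_walk)
    print("registry: ", by_registry)
```

In this toy model the traversal returns both files, including the orphan that
has no registry entry, while the registry-based lookup returns only the shard
the cluster knows about.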

Local `_replicator` db is a special case. Since it is not clustered, it will
not appear in the clustered db list. However, it is already handled as a special
case in `init(_)`, so that behavior is not affected by this change.

COUCHDB-3277


> Replication manager when it finds _replicator db shards which are not part of a mem3 db
> ---------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-3277
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-3277
>             Project: CouchDB
>          Issue Type: Bug
>            Reporter: Nick Vatamaniuc
>
> Currently, replication manager scans the file system for shards which have a {{_replicator}} suffix when it starts up and discovers all replicator dbs.
> However, if there is a {{_replicator}} shard without a corresponding mem3 dbs db entry, replicator manager crashes.
> These "orphan" replicator shards could be created during db creation, as shards are created first and then an entry in the {{dbs}} db is added. A move or backup process might also leave some db shards around.
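
The two-step creation order described in the issue can be modeled with a small
Python sketch. This is a toy model only: the `Cluster` class, its fields, and
the `fail_before_registry` flag are hypothetical names invented for
illustration and do not correspond to CouchDB or mem3 APIs.

```python
class Cluster:
    """Toy model: shard files on disk plus a registry of known dbs."""

    def __init__(self):
        self.shard_files = set()  # stands in for files on the file system
        self.dbs = {}             # stands in for entries in the dbs db

    def create_db(self, name, ranges, fail_before_registry=False):
        # Step 1: create the shard files.
        shards = {f"shards/{r}/{name}.couch" for r in ranges}
        self.shard_files |= shards
        if fail_before_registry:
            # Simulate an interruption between the two steps.
            raise RuntimeError("interrupted before the dbs entry was written")
        # Step 2: record the db in the registry.
        self.dbs[name] = shards

    def orphans(self):
        """Shard files that no registry entry refers to."""
        registered = set().union(*self.dbs.values())
        return self.shard_files - registered


cluster = Cluster()
cluster.create_db("_replicator", ["00000000-7fffffff", "80000000-ffffffff"])
try:
    cluster.create_db("backup/_replicator", ["00000000-ffffffff"],
                      fail_before_registry=True)
except RuntimeError:
    pass

print(sorted(cluster.orphans()))
```

Because the shard files are written before the registry entry, an interruption
between the two steps leaves a file on disk that a file system scan would find
but a registry lookup would not.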


