Posted to notifications@couchdb.apache.org by "Nick Vatamaniuc (JIRA)" <ji...@apache.org> on 2016/03/07 17:32:40 UTC

[jira] [Created] (COUCHDB-2965) Race condition in replicator rescan logic

Nick Vatamaniuc created COUCHDB-2965:
----------------------------------------

             Summary: Race condition in replicator rescan logic
                 Key: COUCHDB-2965
                 URL: https://issues.apache.org/jira/browse/COUCHDB-2965
             Project: CouchDB
          Issue Type: Bug
          Components: Replication
            Reporter: Nick Vatamaniuc


There is a race condition between the full rescan and the regular change feed processing in the couch_replicator_manager code.

This race condition can leave replication documents in an untriggered state after a rescan of all the docs is performed. A rescan can happen when nodes connect and disconnect. The likelihood of the race condition appearing goes up when a lot of documents are updated and there is a backlog of messages in the replicator manager's mailbox.

The race condition happens in the following way:

* A full rescan is initiated here:

https://github.com/apache/couchdb-couch-replicator/blob/master/src/couch_replicator_manager.erl#L424

It clears the db_to_seq ets table, which holds the latest change sequence for each replicator database, and then launches a scan_all_dbs process.

 * scan_all_dbs finds every database that looks like a replicator database and, for each one, sends a {resume_scan, DbName} message to the main couch_replicator_manager process.

 * The {resume_scan, DbName} message is handled here:

https://github.com/apache/couchdb-couch-replicator/blob/master/src/couch_replicator_manager.erl#L233

The expectation is that, because db_to_seq was reset, no sequence checkpoint will be found in db_to_seq, so the handler starts from sequence 0 and spawns a new change feed that rescans all documents (since ownership needs to be re-determined for each of them).

But the race condition occurs because, when change feeds stop, they call the replicator manager with a {rep_db_checkpoint, DbName} message. That updates the db_to_seq ets table with the latest change sequence (a simplified sketch of both handlers follows the link below).

https://github.com/apache/couchdb-couch-replicator/blob/master/src/couch_replicator_manager.erl#L225
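To make the interaction between the two handlers concrete, here is a minimal, self-contained sketch of the logic described above. It is not the actual couch_replicator_manager code: the module name, the cast-based API, the stubbed start of the changes feed, and the explicit Seq argument in the checkpoint message are stand-ins for illustration only (the real {rep_db_checkpoint, DbName} message carries only the database name, with the sequence tracked elsewhere).

%% Illustrative sketch only; not the real couch_replicator_manager.
-module(rep_mgr_sketch).
-behaviour(gen_server).

-export([start_link/0, rescan/0, rep_db_checkpoint/2, resume_scan/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Full rescan: clear db_to_seq, then (conceptually) spawn scan_all_dbs.
rescan() ->
    gen_server:cast(?MODULE, rescan).

%% In the real code the message is {rep_db_checkpoint, DbName}; the sequence
%% is passed explicitly here only to keep the sketch small.
rep_db_checkpoint(DbName, Seq) ->
    gen_server:cast(?MODULE, {rep_db_checkpoint, DbName, Seq}).

%% Sent by scan_all_dbs for every replicator-like database it finds.
resume_scan(DbName) ->
    gen_server:cast(?MODULE, {resume_scan, DbName}).

init([]) ->
    %% db_to_seq maps each replicator database to its latest change sequence.
    ets:new(db_to_seq, [named_table, set, protected]),
    {ok, nostate}.

handle_cast(rescan, State) ->
    ets:delete_all_objects(db_to_seq),
    %% scan_all_dbs would be spawned here; it later sends resume_scan messages.
    {noreply, State};
handle_cast({rep_db_checkpoint, DbName, Seq}, State) ->
    %% If this lands after the rescan cleared db_to_seq, it re-introduces a
    %% non-zero checkpoint that the upcoming resume_scan will pick up.
    ets:insert(db_to_seq, {DbName, Seq}),
    {noreply, State};
handle_cast({resume_scan, DbName}, State) ->
    %% The rescan expects a miss here, i.e. Since = 0 and a full re-read of
    %% the replicator db. A stale checkpoint makes the feed start too late.
    Since = case ets:lookup(db_to_seq, DbName) of
                [] -> 0;
                [{DbName, Seq}] -> Seq
            end,
    io:format("starting changes feed for ~s since ~p~n", [DbName, Since]),
    {noreply, State}.

handle_call(_Req, _From, State) -> {reply, ok, State}.
handle_info(_Msg, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_Old, State, _Extra) -> {ok, State}.

The point is only that resume_scan trusts whatever it finds in db_to_seq, so anything that repopulates the table between the reset and the arrival of resume_scan defeats the rescan.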

This means the following sequence of operations can happen:

 * db_to_seq is cleared (so every database should start from sequence 0), and scan_all_dbs is spawned

 * a change feed stops at sequence 1042 and calls the replicator manager with {rep_db_checkpoint, <<"_replicator">>}

 * the {rep_db_checkpoint, <<"_replicator">>} call is handled, so db_to_seq now maps _replicator to sequence 1042

 * {resume_scan, <<"_replicator">>} is sent from the scan_all_dbs process

 * {resume_scan, <<"_replicator">>} is received by the replicator manager. It sees that db_to_seq has _replicator at sequence 1042, so it starts the new change feed from there instead of from 0, skipping all updates between 0 and 1042 (a shell-level sketch of this timeline follows the list).
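Here is a standalone timeline of that interleaving, using a bare ets table in place of the manager state. Again, just a sketch of the bookkeeping, pasteable into an Erlang shell:

%% Step 1: the full rescan clears db_to_seq and (conceptually) spawns scan_all_dbs.
Tab = ets:new(db_to_seq, [set]).
ets:delete_all_objects(Tab).

%% Steps 2-3: a change feed stops at 1042 and its checkpoint message is
%% processed first, repopulating the table that was just cleared.
ets:insert(Tab, {<<"_replicator">>, 1042}).

%% Steps 4-5: the resume_scan message from scan_all_dbs arrives afterwards
%% and finds the stale checkpoint instead of a miss.
Since = case ets:lookup(Tab, <<"_replicator">>) of
            [] -> 0;
            [{_, Seq}] -> Seq
        end.
%% Since is now 1042, so the new changes feed skips sequences 0..1042 and any
%% replication documents updated in that range are never (re)triggered.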

This was observed by running an experiment in which 1000 replication documents were being updated. Around document 700 or so, node1 was killed (pkill -f node1). node2 hit the race condition on rescan and never picked up a number of documents that should have belonged to it.


