Posted to user@couchdb.apache.org by "Christopher D. Malon" <ma...@groupring.net> on 2017/03/09 22:08:30 UTC

incomplete replication under 2.0.0

I replicated a database (continuously), but ended up with fewer
documents in the target than in the source.  Even if I wait,
the remaining documents don't appear.

1. Here's the DB entry on the source machine, showing 12 documents:

{"db_name":"library","update_seq":"61-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHjE-dA0hdPFgdIz51CSB19WB1BnjU5bEASYYGIAVUOh-_mRC1CyBq9-P3D0TtAYja-1mJbATVPoCoBbqXKQsA-0Fvaw","sizes":{"file":181716,"external":11524,"active":60098},"purge_seq":0,"other":{"data_size":11524},"doc_del_count":0,"doc_count":12,"disk_size":181716,"disk_format_version":6,"data_size":60098,"compact_running":false,"instance_start_time":"0"}

2. Here's the DB entry on the target machine, showing 6 documents:

{"db_name":"library","update_seq":"6-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHhE-dA0hdPFgdIz51CSB19QTV5bEASYYGIAVUOh-_GyFqF0DU7idG7QGI2vvEqH0AUQvyfxYA1_dvNA","sizes":{"file":82337,"external":2282,"active":5874},"purge_seq":0,"other":{"data_size":2282},"doc_del_count":0,"doc_count":6,"disk_size":82337,"disk_format_version":6,"data_size":5874,"compact_running":false,"instance_start_time":"0"}

3. Here's _active_tasks for the task, converted to YAML for readability:

- changes_pending: 0
  checkpoint_interval: 30000
  checkpointed_source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
  continuous: !!perl/scalar:JSON::PP::Boolean 1
  database: shards/00000000-1fffffff/_replicator.1489086006
  doc_id: 172.16.100.222_library
  doc_write_failures: 0
  docs_read: 12
  docs_written: 12
  missing_revisions_found: 12
  node: couchdb@localhost
  pid: <0.5521.0>
  replication_id: c60427215125bd97559d069f6fb3ddb4+continuous+create_target
  revisions_checked: 12
  source: http://172.16.100.222:5984/library/
  source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
  started_on: 1489086008
  target: http://localhost:5984/library/
  through_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
  type: replication
  updated_on: 1489096815
  user: peer

4. Here's the _replicator record for the task:

{"_id":"172.16.100.222_library","_rev":"2-8e6cf63bc167c7c7e4bd38242218572c","schema":1,"storejson":null,"source":"http://172.16.100.222:5984/library","target":"http://localhost:5984/library","create_target":true,"dont_storejson":1,"wholejson":{},"user_ctx":{"roles":["_admin"],"name":"peer"},"continuous":true,"owner":null,"_replication_state":"triggered","_replication_state_time":"2017-03-09T19:00:08+00:00","_replication_id":"c60427215125bd97559d069f6fb3ddb4"}

There should have been no conflicting transactions on the target host.
The appearance of "61-*" in through_seq of the _active_tasks entry
gives me a false sense of security; I only noticed the missing documents
by chance.

A fresh replication to a different target succeeded without any
missing documents.

Is there anything here that would tip me off that the target wasn't
in sync with the source?  Is there a good way to resolve the condition?

Thanks,
Christopher

Re: incomplete replication under 2.0.0

Posted by Robert Samuel Newson <rn...@apache.org>.
Sorry for the late reply.

That's very curious. Can you file a JIRA for this? If the replicator says it replicated to the target, that should always be true. I can't immediately think why an emfile error would break that (I'd expect the writes to either fail or succeed, and the replicator's counts to agree either way).

B.


> On 21 Mar 2017, at 16:26, Christopher D. Malon <ma...@groupring.net> wrote:
> 
> These problems appear to be due to the replicator crashing
> with {error,{conn_failed,{error,emfile}}}, which apparently
> means that I surpassed an open file limit.
> 
> The replications were successful if I executed
> 
> ulimit -Sn 4096
> 
> prior to launching CouchDB, in the same shell.
> 
> I'm a bit surprised the replication can't recover after some
> files are closed; regular DB gets and puts still worked.
> 
> 
> On Wed, 15 Mar 2017 19:43:27 -0400
> "Christopher D. Malon" <ma...@groupring.net> wrote:
> 
>> Those both return 
>> 
>> {"error":"not_found","reason":"missing"}
>> 
>> In the latest example, I have a database where the source has
>> doc_count 226, the target gets doc_count 222, and the task reports
>> 
>>  docs_read: 230
>>  docs_written: 230
>>  missing_revisions_found: 230
>>  revisions_checked: 231
>> 
>> but the missing documents don't show up as deleted.
>> 
>> 
>> On Wed, 15 Mar 2017 23:13:57 +0000
>> Robert Samuel Newson <rn...@apache.org> wrote:
>> 
>>> Hi,
>>> 
>>> the presence of;
>>> 
>>>>>> docs_read: 12
>>>>>> docs_written: 12
>>> 
>>> Is what struck me here. the replicator claims to have replicated 12 docs, which is your expectation and mine, and yet you say they don't appear in the target.
>>> 
>>> Do you know the doc ids of these missing documents? if so, try GET /dbname/docid?deleted=true and GET /dbname/docid?open_revs=all
>>> 
>>> B.
>>> 
>>>> On 15 Mar 2017, at 18:45, Christopher D. Malon <ma...@groupring.net> wrote:
>>>> 
>>>> Could you explain the meaning of source_seq, checkpointed_source_seq,
>>>> and through_seq in more detail?  This problem has happened several times,
>>>> with slightly different statuses in _active_tasks, and slightly different
>>>> numbers of documents successfully copied.  On the most recent attempt,
>>>> checkpointed_source_seq and through_seq are 61-* (matching the source's
>>>> update_seq), but source_seq is 0, and just 9 of the 12 documents are copied.
>>>> 
>>>> When a replication task is in _replicator but is not listed in _active_tasks
>>>> within two minutes, a script of mine deletes the job from _replicator
>>>> and re-submits it.  In CouchDB 1.6, this seemed to resolve some kinds
>>>> of stalled replications.  Now I wonder if the replication is not resuming
>>>> properly after the deletion and resubmission.
>>>> 
>>>> Christopher
>>>> 
>>>> 
>>>> On Fri, 10 Mar 2017 06:40:49 +0000
>>>> Robert Newson <rn...@apache.org> wrote:
>>>> 
>>>>> Were the six missing documents newer on the target? That is, did you delete them on the target and expect another replication to restore them?
>>>>> 
>>>>> Sent from my iPhone
>>>>> 
>>>>>> On 9 Mar 2017, at 22:08, Christopher D. Malon <ma...@groupring.net> wrote:
>>>>>> 
>>>>>> I replicated a database (continuously), but ended up with fewer
>>>>>> documents in the target than in the source.  Even if I wait,
>>>>>> the remaining documents don't appear.
>>>>>> 
>>>>>> 1. Here's the DB entry on the source machine, showing 12 documents:
>>>>>> 
>>>>>> {"db_name":"library","update_seq":"61-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHjE-dA0hdPFgdIz51CSB19WB1BnjU5bEASYYGIAVUOh-_mRC1CyBq9-P3D0TtAYja-1mJbATVPoCoBbqXKQsA-0Fvaw","sizes":{"file":181716,"external":11524,"active":60098},"purge_seq":0,"other":{"data_size":11524},"doc_del_count":0,"doc_count":12,"disk_size":181716,"disk_format_version":6,"data_size":60098,"compact_running":false,"instance_start_time":"0"}
>>>>>> 
>>>>>> 2. Here's the DB entry on the target machine, showing 6 documents:
>>>>>> 
>>>>>> {"db_name":"library","update_seq":"6-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHhE-dA0hdPFgdIz51CSB19QTV5bEASYYGIAVUOh-_GyFqF0DU7idG7QGI2vvEqH0AUQvyfxYA1_dvNA","sizes":{"file":82337,"external":2282,"active":5874},"purge_seq":0,"other":{"data_size":2282},"doc_del_count":0,"doc_count":6,"disk_size":82337,"disk_format_version":6,"data_size":5874,"compact_running":false,"instance_start_time":"0"}
>>>>>> 
>>>>>> 3. Here's _active_tasks for the task, converted to YAML for readability:
>>>>>> 
>>>>>> - changes_pending: 0
>>>>>> checkpoint_interval: 30000
>>>>>> checkpointed_source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWyl
>>>>>> pvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkW
>>>>>> RV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu
>>>>>> 1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>>>>>> continuous: !!perl/scalar:JSON::PP::Boolean 1
>>>>>> database: shards/00000000-1fffffff/_replicator.1489086006
>>>>>> doc_id: 172.16.100.222_library
>>>>>> doc_write_failures: 0
>>>>>> docs_read: 12
>>>>>> docs_written: 12
>>>>>> missing_revisions_found: 12
>>>>>> node: couchdb@localhost
>>>>>> pid: <0.5521.0>
>>>>>> replication_id: c60427215125bd97559d069f6fb3ddb4+continuous+create_target
>>>>>> revisions_checked: 12
>>>>>> source: http://172.16.100.222:5984/library/
>>>>>> source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>>>>>> started_on: 1489086008
>>>>>> target: http://localhost:5984/library/
>>>>>> through_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>>>>>> type: replication
>>>>>> updated_on: 1489096815
>>>>>> user: peer
>>>>>> 
>>>>>> 4. Here's the _replicator record for the task:
>>>>>> 
>>>>>> {"_id":"172.16.100.222_library","_rev":"2-8e6cf63bc167c7c7e4bd38242218572c","schema":1,"storejson":null,"source":"http://172.16.100.222:5984/library","target":"http://localhost:5984/library","create_target":true,"dont_storejson":1,"wholejson":{},"user_ctx":{"roles":["_admin"],"name":"peer"},"continuous":true,"owner":null,"_replication_state":"triggered","_replication_state_time":"2017-03-09T19:00:08+00:00","_replication_id":"c60427215125bd97559d069f6fb3ddb4"}
>>>>>> 
>>>>>> There should have been no conflicting transactions on the target host.
>>>>>> The appearance of "61-*" in through_seq of the _active_tasks entry
>>>>>> gives me a false sense of security; I only noticed the missing documents
>>>>>> by chance.
>>>>>> 
>>>>>> A fresh replication to a different target succeeded without any
>>>>>> missing documents.
>>>>>> 
>>>>>> Is there anything here that would tip me off that the target wasn't
>>>>>> in sync with the source?  Is there a good way to resolve the condition?
>>>>>> 
>>>>>> Thanks,
>>>>>> Christopher
>>>>> 
>>> 


Re: incomplete replication under 2.0.0

Posted by "Christopher D. Malon" <ma...@groupring.net>.
These problems appear to be due to the replicator crashing
with {error,{conn_failed,{error,emfile}}}, which apparently
means that I surpassed an open file limit.

The replications were successful if I executed

ulimit -Sn 4096

prior to launching CouchDB, in the same shell.

I'm a bit surprised the replication can't recover after some
files are closed; regular DB gets and puts still worked.
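
For anyone hitting the same thing, the workaround boils down to a
launch wrapper along these lines (the couchdb path is only an example
from my install):

  #!/bin/sh
  # Raise the soft open-file limit before starting CouchDB, so the
  # replicator's outbound connections don't fail with emfile.
  ulimit -Sn 4096
  exec /opt/couchdb/bin/couchdb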


On Wed, 15 Mar 2017 19:43:27 -0400
"Christopher D. Malon" <ma...@groupring.net> wrote:

> Those both return 
> 
> {"error":"not_found","reason":"missing"}
> 
> In the latest example, I have a database where the source has
> doc_count 226, the target gets doc_count 222, and the task reports
> 
>   docs_read: 230
>   docs_written: 230
>   missing_revisions_found: 230
>   revisions_checked: 231
> 
> but the missing documents don't show up as deleted.
> 
> 
> On Wed, 15 Mar 2017 23:13:57 +0000
> Robert Samuel Newson <rn...@apache.org> wrote:
> 
> > Hi,
> > 
> > the presence of;
> > 
> > >>> docs_read: 12
> > >>> docs_written: 12
> > 
> > Is what struck me here. the replicator claims to have replicated 12 docs, which is your expectation and mine, and yet you say they don't appear in the target.
> > 
> > Do you know the doc ids of these missing documents? if so, try GET /dbname/docid?deleted=true and GET /dbname/docid?open_revs=all
> > 
> > B.
> > 
> > > On 15 Mar 2017, at 18:45, Christopher D. Malon <ma...@groupring.net> wrote:
> > > 
> > > Could you explain the meaning of source_seq, checkpointed_source_seq,
> > > and through_seq in more detail?  This problem has happened several times,
> > > with slightly different statuses in _active_tasks, and slightly different
> > > numbers of documents successfully copied.  On the most recent attempt,
> > > checkpointed_source_seq and through_seq are 61-* (matching the source's
> > > update_seq), but source_seq is 0, and just 9 of the 12 documents are copied.
> > > 
> > > When a replication task is in _replicator but is not listed in _active_tasks
> > > within two minutes, a script of mine deletes the job from _replicator
> > > and re-submits it.  In CouchDB 1.6, this seemed to resolve some kinds
> > > of stalled replications.  Now I wonder if the replication is not resuming
> > > properly after the deletion and resubmission.
> > > 
> > > Christopher
> > > 
> > > 
> > > On Fri, 10 Mar 2017 06:40:49 +0000
> > > Robert Newson <rn...@apache.org> wrote:
> > > 
> > >> Were the six missing documents newer on the target? That is, did you delete them on the target and expect another replication to restore them?
> > >> 
> > >> Sent from my iPhone
> > >> 
> > >>> On 9 Mar 2017, at 22:08, Christopher D. Malon <ma...@groupring.net> wrote:
> > >>> 
> > >>> I replicated a database (continuously), but ended up with fewer
> > >>> documents in the target than in the source.  Even if I wait,
> > >>> the remaining documents don't appear.
> > >>> 
> > >>> 1. Here's the DB entry on the source machine, showing 12 documents:
> > >>> 
> > >>> {"db_name":"library","update_seq":"61-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHjE-dA0hdPFgdIz51CSB19WB1BnjU5bEASYYGIAVUOh-_mRC1CyBq9-P3D0TtAYja-1mJbATVPoCoBbqXKQsA-0Fvaw","sizes":{"file":181716,"external":11524,"active":60098},"purge_seq":0,"other":{"data_size":11524},"doc_del_count":0,"doc_count":12,"disk_size":181716,"disk_format_version":6,"data_size":60098,"compact_running":false,"instance_start_time":"0"}
> > >>> 
> > >>> 2. Here's the DB entry on the target machine, showing 6 documents:
> > >>> 
> > >>> {"db_name":"library","update_seq":"6-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHhE-dA0hdPFgdIz51CSB19QTV5bEASYYGIAVUOh-_GyFqF0DU7idG7QGI2vvEqH0AUQvyfxYA1_dvNA","sizes":{"file":82337,"external":2282,"active":5874},"purge_seq":0,"other":{"data_size":2282},"doc_del_count":0,"doc_count":6,"disk_size":82337,"disk_format_version":6,"data_size":5874,"compact_running":false,"instance_start_time":"0"}
> > >>> 
> > >>> 3. Here's _active_tasks for the task, converted to YAML for readability:
> > >>> 
> > >>> - changes_pending: 0
> > >>> checkpoint_interval: 30000
> > >>> checkpointed_source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWyl
> > >>> pvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkW
> > >>> RV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu
> > >>> 1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> > >>> continuous: !!perl/scalar:JSON::PP::Boolean 1
> > >>> database: shards/00000000-1fffffff/_replicator.1489086006
> > >>> doc_id: 172.16.100.222_library
> > >>> doc_write_failures: 0
> > >>> docs_read: 12
> > >>> docs_written: 12
> > >>> missing_revisions_found: 12
> > >>> node: couchdb@localhost
> > >>> pid: <0.5521.0>
> > >>> replication_id: c60427215125bd97559d069f6fb3ddb4+continuous+create_target
> > >>> revisions_checked: 12
> > >>> source: http://172.16.100.222:5984/library/
> > >>> source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> > >>> started_on: 1489086008
> > >>> target: http://localhost:5984/library/
> > >>> through_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> > >>> type: replication
> > >>> updated_on: 1489096815
> > >>> user: peer
> > >>> 
> > >>> 4. Here's the _replicator record for the task:
> > >>> 
> > >>> {"_id":"172.16.100.222_library","_rev":"2-8e6cf63bc167c7c7e4bd38242218572c","schema":1,"storejson":null,"source":"http://172.16.100.222:5984/library","target":"http://localhost:5984/library","create_target":true,"dont_storejson":1,"wholejson":{},"user_ctx":{"roles":["_admin"],"name":"peer"},"continuous":true,"owner":null,"_replication_state":"triggered","_replication_state_time":"2017-03-09T19:00:08+00:00","_replication_id":"c60427215125bd97559d069f6fb3ddb4"}
> > >>> 
> > >>> There should have been no conflicting transactions on the target host.
> > >>> The appearance of "61-*" in through_seq of the _active_tasks entry
> > >>> gives me a false sense of security; I only noticed the missing documents
> > >>> by chance.
> > >>> 
> > >>> A fresh replication to a different target succeeded without any
> > >>> missing documents.
> > >>> 
> > >>> Is there anything here that would tip me off that the target wasn't
> > >>> in sync with the source?  Is there a good way to resolve the condition?
> > >>> 
> > >>> Thanks,
> > >>> Christopher
> > >> 
> > 

Re: incomplete replication under 2.0.0

Posted by "Christopher D. Malon" <ma...@groupring.net>.
Those both return 

{"error":"not_found","reason":"missing"}

In the latest example, I have a database where the source has
doc_count 226, the target gets doc_count 222, and the task reports

  docs_read: 230
  docs_written: 230
  missing_revisions_found: 230
  revisions_checked: 231

but the missing documents don't show up as deleted.
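
(One way to pin down which IDs never made it across is to diff
_all_docs on the two sides; the hosts below are from the earlier
example, and DB is a placeholder for the database in question:)

  DB=library
  curl -s http://172.16.100.222:5984/$DB/_all_docs \
    | grep -o '"id":"[^"]*"' | sort > source_ids
  curl -s http://localhost:5984/$DB/_all_docs \
    | grep -o '"id":"[^"]*"' | sort > target_ids
  diff source_ids target_ids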


On Wed, 15 Mar 2017 23:13:57 +0000
Robert Samuel Newson <rn...@apache.org> wrote:

> Hi,
> 
> the presence of;
> 
> >>> docs_read: 12
> >>> docs_written: 12
> 
> Is what struck me here. the replicator claims to have replicated 12 docs, which is your expectation and mine, and yet you say they don't appear in the target.
> 
> Do you know the doc ids of these missing documents? if so, try GET /dbname/docid?deleted=true and GET /dbname/docid?open_revs=all
> 
> B.
> 
> > On 15 Mar 2017, at 18:45, Christopher D. Malon <ma...@groupring.net> wrote:
> > 
> > Could you explain the meaning of source_seq, checkpointed_source_seq,
> > and through_seq in more detail?  This problem has happened several times,
> > with slightly different statuses in _active_tasks, and slightly different
> > numbers of documents successfully copied.  On the most recent attempt,
> > checkpointed_source_seq and through_seq are 61-* (matching the source's
> > update_seq), but source_seq is 0, and just 9 of the 12 documents are copied.
> > 
> > When a replication task is in _replicator but is not listed in _active_tasks
> > within two minutes, a script of mine deletes the job from _replicator
> > and re-submits it.  In CouchDB 1.6, this seemed to resolve some kinds
> > of stalled replications.  Now I wonder if the replication is not resuming
> > properly after the deletion and resubmission.
> > 
> > Christopher
> > 
> > 
> > On Fri, 10 Mar 2017 06:40:49 +0000
> > Robert Newson <rn...@apache.org> wrote:
> > 
> >> Were the six missing documents newer on the target? That is, did you delete them on the target and expect another replication to restore them?
> >> 
> >> Sent from my iPhone
> >> 
> >>> On 9 Mar 2017, at 22:08, Christopher D. Malon <ma...@groupring.net> wrote:
> >>> 
> >>> I replicated a database (continuously), but ended up with fewer
> >>> documents in the target than in the source.  Even if I wait,
> >>> the remaining documents don't appear.
> >>> 
> >>> 1. Here's the DB entry on the source machine, showing 12 documents:
> >>> 
> >>> {"db_name":"library","update_seq":"61-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHjE-dA0hdPFgdIz51CSB19WB1BnjU5bEASYYGIAVUOh-_mRC1CyBq9-P3D0TtAYja-1mJbATVPoCoBbqXKQsA-0Fvaw","sizes":{"file":181716,"external":11524,"active":60098},"purge_seq":0,"other":{"data_size":11524},"doc_del_count":0,"doc_count":12,"disk_size":181716,"disk_format_version":6,"data_size":60098,"compact_running":false,"instance_start_time":"0"}
> >>> 
> >>> 2. Here's the DB entry on the target machine, showing 6 documents:
> >>> 
> >>> {"db_name":"library","update_seq":"6-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHhE-dA0hdPFgdIz51CSB19QTV5bEASYYGIAVUOh-_GyFqF0DU7idG7QGI2vvEqH0AUQvyfxYA1_dvNA","sizes":{"file":82337,"external":2282,"active":5874},"purge_seq":0,"other":{"data_size":2282},"doc_del_count":0,"doc_count":6,"disk_size":82337,"disk_format_version":6,"data_size":5874,"compact_running":false,"instance_start_time":"0"}
> >>> 
> >>> 3. Here's _active_tasks for the task, converted to YAML for readability:
> >>> 
> >>> - changes_pending: 0
> >>> checkpoint_interval: 30000
> >>> checkpointed_source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWyl
> >>> pvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkW
> >>> RV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu
> >>> 1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >>> continuous: !!perl/scalar:JSON::PP::Boolean 1
> >>> database: shards/00000000-1fffffff/_replicator.1489086006
> >>> doc_id: 172.16.100.222_library
> >>> doc_write_failures: 0
> >>> docs_read: 12
> >>> docs_written: 12
> >>> missing_revisions_found: 12
> >>> node: couchdb@localhost
> >>> pid: <0.5521.0>
> >>> replication_id: c60427215125bd97559d069f6fb3ddb4+continuous+create_target
> >>> revisions_checked: 12
> >>> source: http://172.16.100.222:5984/library/
> >>> source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >>> started_on: 1489086008
> >>> target: http://localhost:5984/library/
> >>> through_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >>> type: replication
> >>> updated_on: 1489096815
> >>> user: peer
> >>> 
> >>> 4. Here's the _replicator record for the task:
> >>> 
> >>> {"_id":"172.16.100.222_library","_rev":"2-8e6cf63bc167c7c7e4bd38242218572c","schema":1,"storejson":null,"source":"http://172.16.100.222:5984/library","target":"http://localhost:5984/library","create_target":true,"dont_storejson":1,"wholejson":{},"user_ctx":{"roles":["_admin"],"name":"peer"},"continuous":true,"owner":null,"_replication_state":"triggered","_replication_state_time":"2017-03-09T19:00:08+00:00","_replication_id":"c60427215125bd97559d069f6fb3ddb4"}
> >>> 
> >>> There should have been no conflicting transactions on the target host.
> >>> The appearance of "61-*" in through_seq of the _active_tasks entry
> >>> gives me a false sense of security; I only noticed the missing documents
> >>> by chance.
> >>> 
> >>> A fresh replication to a different target succeeded without any
> >>> missing documents.
> >>> 
> >>> Is there anything here that would tip me off that the target wasn't
> >>> in sync with the source?  Is there a good way to resolve the condition?
> >>> 
> >>> Thanks,
> >>> Christopher
> >> 
> 

Re: incomplete replication under 2.0.0

Posted by Robert Samuel Newson <rn...@apache.org>.
Hi,

The presence of:

>>> docs_read: 12
>>> docs_written: 12

is what struck me here. The replicator claims to have replicated 12 docs, which matches both your expectation and mine, and yet you say they don't appear in the target.

Do you know the doc IDs of these missing documents? If so, try GET /dbname/docid?deleted=true and GET /dbname/docid?open_revs=all
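
Something like this, substituting your database name, document ID, and
admin credentials (the Accept header just forces a JSON response for
open_revs):

  curl -H 'Accept: application/json' \
    'http://localhost:5984/dbname/docid?deleted=true'
  curl -H 'Accept: application/json' \
    'http://localhost:5984/dbname/docid?open_revs=all'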

B.

> On 15 Mar 2017, at 18:45, Christopher D. Malon <ma...@groupring.net> wrote:
> 
> Could you explain the meaning of source_seq, checkpointed_source_seq,
> and through_seq in more detail?  This problem has happened several times,
> with slightly different statuses in _active_tasks, and slightly different
> numbers of documents successfully copied.  On the most recent attempt,
> checkpointed_source_seq and through_seq are 61-* (matching the source's
> update_seq), but source_seq is 0, and just 9 of the 12 documents are copied.
> 
> When a replication task is in _replicator but is not listed in _active_tasks
> within two minutes, a script of mine deletes the job from _replicator
> and re-submits it.  In CouchDB 1.6, this seemed to resolve some kinds
> of stalled replications.  Now I wonder if the replication is not resuming
> properly after the deletion and resubmission.
> 
> Christopher
> 
> 
> On Fri, 10 Mar 2017 06:40:49 +0000
> Robert Newson <rn...@apache.org> wrote:
> 
>> Were the six missing documents newer on the target? That is, did you delete them on the target and expect another replication to restore them?
>> 
>> Sent from my iPhone
>> 
>>> On 9 Mar 2017, at 22:08, Christopher D. Malon <ma...@groupring.net> wrote:
>>> 
>>> I replicated a database (continuously), but ended up with fewer
>>> documents in the target than in the source.  Even if I wait,
>>> the remaining documents don't appear.
>>> 
>>> 1. Here's the DB entry on the source machine, showing 12 documents:
>>> 
>>> {"db_name":"library","update_seq":"61-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHjE-dA0hdPFgdIz51CSB19WB1BnjU5bEASYYGIAVUOh-_mRC1CyBq9-P3D0TtAYja-1mJbATVPoCoBbqXKQsA-0Fvaw","sizes":{"file":181716,"external":11524,"active":60098},"purge_seq":0,"other":{"data_size":11524},"doc_del_count":0,"doc_count":12,"disk_size":181716,"disk_format_version":6,"data_size":60098,"compact_running":false,"instance_start_time":"0"}
>>> 
>>> 2. Here's the DB entry on the target machine, showing 6 documents:
>>> 
>>> {"db_name":"library","update_seq":"6-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHhE-dA0hdPFgdIz51CSB19QTV5bEASYYGIAVUOh-_GyFqF0DU7idG7QGI2vvEqH0AUQvyfxYA1_dvNA","sizes":{"file":82337,"external":2282,"active":5874},"purge_seq":0,"other":{"data_size":2282},"doc_del_count":0,"doc_count":6,"disk_size":82337,"disk_format_version":6,"data_size":5874,"compact_running":false,"instance_start_time":"0"}
>>> 
>>> 3. Here's _active_tasks for the task, converted to YAML for readability:
>>> 
>>> - changes_pending: 0
>>> checkpoint_interval: 30000
>>> checkpointed_source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWyl
>>> pvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkW
>>> RV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu
>>> 1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>>> continuous: !!perl/scalar:JSON::PP::Boolean 1
>>> database: shards/00000000-1fffffff/_replicator.1489086006
>>> doc_id: 172.16.100.222_library
>>> doc_write_failures: 0
>>> docs_read: 12
>>> docs_written: 12
>>> missing_revisions_found: 12
>>> node: couchdb@localhost
>>> pid: <0.5521.0>
>>> replication_id: c60427215125bd97559d069f6fb3ddb4+continuous+create_target
>>> revisions_checked: 12
>>> source: http://172.16.100.222:5984/library/
>>> source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>>> started_on: 1489086008
>>> target: http://localhost:5984/library/
>>> through_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>>> type: replication
>>> updated_on: 1489096815
>>> user: peer
>>> 
>>> 4. Here's the _replicator record for the task:
>>> 
>>> {"_id":"172.16.100.222_library","_rev":"2-8e6cf63bc167c7c7e4bd38242218572c","schema":1,"storejson":null,"source":"http://172.16.100.222:5984/library","target":"http://localhost:5984/library","create_target":true,"dont_storejson":1,"wholejson":{},"user_ctx":{"roles":["_admin"],"name":"peer"},"continuous":true,"owner":null,"_replication_state":"triggered","_replication_state_time":"2017-03-09T19:00:08+00:00","_replication_id":"c60427215125bd97559d069f6fb3ddb4"}
>>> 
>>> There should have been no conflicting transactions on the target host.
>>> The appearance of "61-*" in through_seq of the _active_tasks entry
>>> gives me a false sense of security; I only noticed the missing documents
>>> by chance.
>>> 
>>> A fresh replication to a different target succeeded without any
>>> missing documents.
>>> 
>>> Is there anything here that would tip me off that the target wasn't
>>> in sync with the source?  Is there a good way to resolve the condition?
>>> 
>>> Thanks,
>>> Christopher
>> 


Re: incomplete replication under 2.0.0

Posted by "Christopher D. Malon" <ma...@groupring.net>.
Could you explain the meaning of source_seq, checkpointed_source_seq,
and through_seq in more detail?  This problem has happened several times,
with slightly different statuses in _active_tasks, and slightly different
numbers of documents successfully copied.  On the most recent attempt,
checkpointed_source_seq and through_seq are 61-* (matching the source's
update_seq), but source_seq is 0, and just 9 of the 12 documents are copied.

When a replication task is in _replicator but is not listed in _active_tasks
within two minutes, a script of mine deletes the job from _replicator
and re-submits it.  In CouchDB 1.6, this seemed to resolve some kinds
of stalled replications.  Now I wonder if the replication is not resuming
properly after the deletion and resubmission.
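
(The script amounts to roughly this, with the host and admin
credentials stripped out; replication.json holds the original
replication body (source, target, create_target, continuous) from
item 4 of my first message:)

  DOC=172.16.100.222_library
  # Run about two minutes after the replication doc was written.
  if ! curl -s http://localhost:5984/_active_tasks | grep -q "\"doc_id\":\"$DOC\""; then
    REV=$(curl -s http://localhost:5984/_replicator/$DOC \
          | grep -o '"_rev":"[^"]*"' | cut -d'"' -f4)
    curl -s -X DELETE "http://localhost:5984/_replicator/$DOC?rev=$REV"
    curl -s -X PUT -H 'Content-Type: application/json' \
      -d @replication.json http://localhost:5984/_replicator/$DOC
  fi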

Christopher


On Fri, 10 Mar 2017 06:40:49 +0000
Robert Newson <rn...@apache.org> wrote:

> Were the six missing documents newer on the target? That is, did you delete them on the target and expect another replication to restore them?
> 
> Sent from my iPhone
> 
> > On 9 Mar 2017, at 22:08, Christopher D. Malon <ma...@groupring.net> wrote:
> > 
> > I replicated a database (continuously), but ended up with fewer
> > documents in the target than in the source.  Even if I wait,
> > the remaining documents don't appear.
> > 
> > 1. Here's the DB entry on the source machine, showing 12 documents:
> > 
> > {"db_name":"library","update_seq":"61-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHjE-dA0hdPFgdIz51CSB19WB1BnjU5bEASYYGIAVUOh-_mRC1CyBq9-P3D0TtAYja-1mJbATVPoCoBbqXKQsA-0Fvaw","sizes":{"file":181716,"external":11524,"active":60098},"purge_seq":0,"other":{"data_size":11524},"doc_del_count":0,"doc_count":12,"disk_size":181716,"disk_format_version":6,"data_size":60098,"compact_running":false,"instance_start_time":"0"}
> > 
> > 2. Here's the DB entry on the target machine, showing 6 documents:
> > 
> > {"db_name":"library","update_seq":"6-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHhE-dA0hdPFgdIz51CSB19QTV5bEASYYGIAVUOh-_GyFqF0DU7idG7QGI2vvEqH0AUQvyfxYA1_dvNA","sizes":{"file":82337,"external":2282,"active":5874},"purge_seq":0,"other":{"data_size":2282},"doc_del_count":0,"doc_count":6,"disk_size":82337,"disk_format_version":6,"data_size":5874,"compact_running":false,"instance_start_time":"0"}
> > 
> > 3. Here's _active_tasks for the task, converted to YAML for readability:
> > 
> > - changes_pending: 0
> >  checkpoint_interval: 30000
> >  checkpointed_source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWyl
> > pvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkW
> > RV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu
> > 1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >  continuous: !!perl/scalar:JSON::PP::Boolean 1
> >  database: shards/00000000-1fffffff/_replicator.1489086006
> >  doc_id: 172.16.100.222_library
> >  doc_write_failures: 0
> >  docs_read: 12
> >  docs_written: 12
> >  missing_revisions_found: 12
> >  node: couchdb@localhost
> >  pid: <0.5521.0>
> >  replication_id: c60427215125bd97559d069f6fb3ddb4+continuous+create_target
> >  revisions_checked: 12
> >  source: http://172.16.100.222:5984/library/
> >  source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >  started_on: 1489086008
> >  target: http://localhost:5984/library/
> >  through_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >  type: replication
> >  updated_on: 1489096815
> >  user: peer
> > 
> > 4. Here's the _replicator record for the task:
> > 
> > {"_id":"172.16.100.222_library","_rev":"2-8e6cf63bc167c7c7e4bd38242218572c","schema":1,"storejson":null,"source":"http://172.16.100.222:5984/library","target":"http://localhost:5984/library","create_target":true,"dont_storejson":1,"wholejson":{},"user_ctx":{"roles":["_admin"],"name":"peer"},"continuous":true,"owner":null,"_replication_state":"triggered","_replication_state_time":"2017-03-09T19:00:08+00:00","_replication_id":"c60427215125bd97559d069f6fb3ddb4"}
> > 
> > There should have been no conflicting transactions on the target host.
> > The appearance of "61-*" in through_seq of the _active_tasks entry
> > gives me a false sense of security; I only noticed the missing documents
> > by chance.
> > 
> > A fresh replication to a different target succeeded without any
> > missing documents.
> > 
> > Is there anything here that would tip me off that the target wasn't
> > in sync with the source?  Is there a good way to resolve the condition?
> > 
> > Thanks,
> > Christopher
> 

Re: incomplete replication under 2.0.0

Posted by "Christopher D. Malon" <ma...@groupring.net>.
No.

The only fishy thing I can think of is that the _users and
_global_changes databases weren't yet created on the target
when the replication started.
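
(If it's relevant: creating them afterwards is just a couple of PUTs
against the target, run with admin credentials.)

  curl -X PUT http://localhost:5984/_users
  curl -X PUT http://localhost:5984/_global_changes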


On Fri, 10 Mar 2017 06:40:49 +0000
Robert Newson <rn...@apache.org> wrote:

> Were the six missing documents newer on the target? That is, did you delete them on the target and expect another replication to restore them?
> 
> Sent from my iPhone
> 
> > On 9 Mar 2017, at 22:08, Christopher D. Malon <ma...@groupring.net> wrote:
> > 
> > I replicated a database (continuously), but ended up with fewer
> > documents in the target than in the source.  Even if I wait,
> > the remaining documents don't appear.
> > 
> > 1. Here's the DB entry on the source machine, showing 12 documents:
> > 
> > {"db_name":"library","update_seq":"61-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHjE-dA0hdPFgdIz51CSB19WB1BnjU5bEASYYGIAVUOh-_mRC1CyBq9-P3D0TtAYja-1mJbATVPoCoBbqXKQsA-0Fvaw","sizes":{"file":181716,"external":11524,"active":60098},"purge_seq":0,"other":{"data_size":11524},"doc_del_count":0,"doc_count":12,"disk_size":181716,"disk_format_version":6,"data_size":60098,"compact_running":false,"instance_start_time":"0"}
> > 
> > 2. Here's the DB entry on the target machine, showing 6 documents:
> > 
> > {"db_name":"library","update_seq":"6-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHhE-dA0hdPFgdIz51CSB19QTV5bEASYYGIAVUOh-_GyFqF0DU7idG7QGI2vvEqH0AUQvyfxYA1_dvNA","sizes":{"file":82337,"external":2282,"active":5874},"purge_seq":0,"other":{"data_size":2282},"doc_del_count":0,"doc_count":6,"disk_size":82337,"disk_format_version":6,"data_size":5874,"compact_running":false,"instance_start_time":"0"}
> > 
> > 3. Here's _active_tasks for the task, converted to YAML for readability:
> > 
> > - changes_pending: 0
> >  checkpoint_interval: 30000
> >  checkpointed_source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWyl
> > pvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkW
> > RV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu
> > 1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >  continuous: !!perl/scalar:JSON::PP::Boolean 1
> >  database: shards/00000000-1fffffff/_replicator.1489086006
> >  doc_id: 172.16.100.222_library
> >  doc_write_failures: 0
> >  docs_read: 12
> >  docs_written: 12
> >  missing_revisions_found: 12
> >  node: couchdb@localhost
> >  pid: <0.5521.0>
> >  replication_id: c60427215125bd97559d069f6fb3ddb4+continuous+create_target
> >  revisions_checked: 12
> >  source: http://172.16.100.222:5984/library/
> >  source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >  started_on: 1489086008
> >  target: http://localhost:5984/library/
> >  through_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
> >  type: replication
> >  updated_on: 1489096815
> >  user: peer
> > 
> > 4. Here's the _replicator record for the task:
> > 
> > {"_id":"172.16.100.222_library","_rev":"2-8e6cf63bc167c7c7e4bd38242218572c","schema":1,"storejson":null,"source":"http://172.16.100.222:5984/library","target":"http://localhost:5984/library","create_target":true,"dont_storejson":1,"wholejson":{},"user_ctx":{"roles":["_admin"],"name":"peer"},"continuous":true,"owner":null,"_replication_state":"triggered","_replication_state_time":"2017-03-09T19:00:08+00:00","_replication_id":"c60427215125bd97559d069f6fb3ddb4"}
> > 
> > There should have been no conflicting transactions on the target host.
> > The appearance of "61-*" in through_seq of the _active_tasks entry
> > gives me a false sense of security; I only noticed the missing documents
> > by chance.
> > 
> > A fresh replication to a different target succeeded without any
> > missing documents.
> > 
> > Is there anything here that would tip me off that the target wasn't
> > in sync with the source?  Is there a good way to resolve the condition?
> > 
> > Thanks,
> > Christopher
> 

Re: incomplete replication under 2.0.0

Posted by Robert Newson <rn...@apache.org>.
Were the six missing documents newer on the target? That is, did you delete them on the target and expect another replication to restore them?

Sent from my iPhone

> On 9 Mar 2017, at 22:08, Christopher D. Malon <ma...@groupring.net> wrote:
> 
> I replicated a database (continuously), but ended up with fewer
> documents in the target than in the source.  Even if I wait,
> the remaining documents don't appear.
> 
> 1. Here's the DB entry on the source machine, showing 12 documents:
> 
> {"db_name":"library","update_seq":"61-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHjE-dA0hdPFgdIz51CSB19WB1BnjU5bEASYYGIAVUOh-_mRC1CyBq9-P3D0TtAYja-1mJbATVPoCoBbqXKQsA-0Fvaw","sizes":{"file":181716,"external":11524,"active":60098},"purge_seq":0,"other":{"data_size":11524},"doc_del_count":0,"doc_count":12,"disk_size":181716,"disk_format_version":6,"data_size":60098,"compact_running":false,"instance_start_time":"0"}
> 
> 2. Here's the DB entry on the target machine, showing 6 documents:
> 
> {"db_name":"library","update_seq":"6-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkD1bHhE-dA0hdPFgdIz51CSB19QTV5bEASYYGIAVUOh-_GyFqF0DU7idG7QGI2vvEqH0AUQvyfxYA1_dvNA","sizes":{"file":82337,"external":2282,"active":5874},"purge_seq":0,"other":{"data_size":2282},"doc_del_count":0,"doc_count":6,"disk_size":82337,"disk_format_version":6,"data_size":5874,"compact_running":false,"instance_start_time":"0"}
> 
> 3. Here's _active_tasks for the task, converted to YAML for readability:
> 
> - changes_pending: 0
>  checkpoint_interval: 30000
>  checkpointed_source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWyl
> pvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkW
> RV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu
> 1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>  continuous: !!perl/scalar:JSON::PP::Boolean 1
>  database: shards/00000000-1fffffff/_replicator.1489086006
>  doc_id: 172.16.100.222_library
>  doc_write_failures: 0
>  docs_read: 12
>  docs_written: 12
>  missing_revisions_found: 12
>  node: couchdb@localhost
>  pid: <0.5521.0>
>  replication_id: c60427215125bd97559d069f6fb3ddb4+continuous+create_target
>  revisions_checked: 12
>  source: http://172.16.100.222:5984/library/
>  source_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>  started_on: 1489086008
>  target: http://localhost:5984/library/
>  through_seq: 61-g1AAAAJTeJyd0EsOgjAQBuAqxsfSE-gRKK08VnIT7UwhSBAWylpvojfRm-hNsLQkbAgRNtOkk__L5M8IIcvEkmSNRYmJhDArUGRJcblmajUVBDZVVaWJJchZfSwAucPQkWRV5jKKT3kke-KwVRP2jWBpgdMAwcOuTJ8U1tKhkSZaYhS5x2GodKylWyPZWnJ9QW3KBkr5TE1yV4_CHu1dMeyQ-c4o7Wm0V9u4F9setaM_GzfK2yifWplrxYeAcuGOuulrNN3X1PTFgXPqd-XSHxdwuSQ
>  type: replication
>  updated_on: 1489096815
>  user: peer
> 
> 4. Here's the _replicator record for the task:
> 
> {"_id":"172.16.100.222_library","_rev":"2-8e6cf63bc167c7c7e4bd38242218572c","schema":1,"storejson":null,"source":"http://172.16.100.222:5984/library","target":"http://localhost:5984/library","create_target":true,"dont_storejson":1,"wholejson":{},"user_ctx":{"roles":["_admin"],"name":"peer"},"continuous":true,"owner":null,"_replication_state":"triggered","_replication_state_time":"2017-03-09T19:00:08+00:00","_replication_id":"c60427215125bd97559d069f6fb3ddb4"}
> 
> There should have been no conflicting transactions on the target host.
> The appearance of "61-*" in through_seq of the _active_tasks entry
> gives me a false sense of security; I only noticed the missing documents
> by chance.
> 
> A fresh replication to a different target succeeded without any
> missing documents.
> 
> Is there anything here that would tip me off that the target wasn't
> in sync with the source?  Is there a good way to resolve the condition?
> 
> Thanks,
> Christopher