Posted to user@couchdb.apache.org by Ning Tan <ni...@gmail.com> on 2009/09/28 19:21:53 UTC

partial replications

Hi,

When we replicate between a remote database and a local one (pulling
from remote into local), we are observing partial replications,
meaning that we have to issue repeated _replicate calls for the
replication to complete. For a database with 10,000 documents, for
example, it could take up to 7 calls for the entire database to
replicate into an empty one. Each time, the number of documents
replicated over seemed random.
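
For readers who want to reproduce the pattern, the repeated calls described above can be sketched roughly as follows. This is only an illustration, not the script we actually ran: `post_replicate`, `docs_written` and `replicate_until_done` are made-up helper names, the hosts and database names are placeholders, and the reply layout (a `docs_written` count at the top level or in the newest `history` entry, plus an optional `no_changes` flag) is an assumption that may differ between CouchDB versions.

```python
import json
import urllib.request


def post_replicate(server, source, target):
    """POST a replication request to `server` and return the decoded JSON reply."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    req = urllib.request.Request(
        server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def docs_written(reply):
    """Documents written by the latest session (reply layout is an assumption)."""
    if reply.get("no_changes"):
        return 0
    history = reply.get("history") or [reply]
    return history[0].get("docs_written", 0)


def replicate_until_done(post, max_calls=20):
    """Call `post()` until a call writes no documents; return the per-call counts."""
    counts = []
    for _ in range(max_calls):
        written = docs_written(post())
        counts.append(written)
        if written == 0:
            break
    return counts


# Example with placeholder hosts (pull from remote into local):
# counts = replicate_until_done(lambda: post_replicate(
#     "http://localhost:5984", "http://remote.example.com:5984/mydb", "mydb"))
```

With a source that receives no writes, a healthy run would be expected to return one large count followed by a zero; the behavior reported here would show up as several non-zero counts in a row.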

The use case is very simple--replicating a database into an empty one
with no concurrent writes, no additional load or i/o, etc.

Has anybody else seen this, and/or is this expected behavior?

The databases involved are a mixture of the 0.10.x code base natively
built on Ubuntu and Mac.  I can get more detailed information about
the environment if needed.

Thanks.

Re: partial replications

Posted by Ning Tan <ni...@gmail.com>.

On Mon, Sep 28, 2009 at 6:38 PM, Adam Kocoloski <ko...@apache.org> wrote:
>
> Hmm, I must admit I'm stumped so far.  Are you by any chance building from
> SVN repeatedly and installing into the same prefix?  Please feel free to
> file a ticket in JIRA[1] so we don't forget about this.  You might try again
> with the log level on the target set to debug, although I'm not certain it
> will tell us anything.  I'll see if I can find a way to reproduce this.

Thanks. I'll log the issue in JIRA tomorrow.

Two additional observations:

1) Futon vs. curl didn't make a difference;
2) I wasn't able to consistently reproduce it, but it happens more
often than not.

Re: partial replications

Posted by Matt Aimonetti <ma...@gmail.com>.

in the meantime, you might want to try a push replication just in case.
- Matt
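
A rough illustration of the difference, assuming the usual `_replicate` request body with `source` and `target` fields: a pull is posted to the server that holds the target database and names the remote source by URL, while a push is posted to the server that holds the source database and names the remote target by URL. The helper names and hosts below are placeholders, not anything from this thread.

```python
import json


def pull_request(remote_db_url, local_db):
    """Body to POST to the *target* server: it fetches from the remote source."""
    return {"source": remote_db_url, "target": local_db}


def push_request(local_db, remote_db_url):
    """Body to POST to the *source* server: it sends to the remote target."""
    return {"source": local_db, "target": remote_db_url}


if __name__ == "__main__":
    # Placeholder hosts and database name, for illustration only.
    print(json.dumps(pull_request("http://source-host:5984/mydb", "mydb")))
    print(json.dumps(push_request("mydb", "http://target-host:5984/mydb")))
```

Either body is sent as JSON in a POST to `/_replicate` on the server that should drive the replication.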

On Mon, Sep 28, 2009 at 3:38 PM, Adam Kocoloski <ko...@apache.org> wrote:

> Hmm, I must admit I'm stumped so far.  Are you by any chance building from
> SVN repeatedly and installing into the same prefix?  Please feel free to
> file a ticket in JIRA[1] so we don't forget about this.  You might try again
> with the log level on the target set to debug, although I'm not certain it
> will tell us anything.  I'll see if I can find a way to reproduce this.
>  Best,
>
> Adam
>
> [1]: https://issues.apache.org/jira/browse/COUCHDB
>
>

Re: partial replications

Posted by Adam Kocoloski <ko...@apache.org>.

On Sep 28, 2009, at 4:44 PM, Ning Tan wrote:

> On Mon, Sep 28, 2009 at 2:41 PM, Adam Kocoloski  
> <ko...@apache.org> wrote:
>> On Sep 28, 2009, at 1:21 PM, Ning Tan wrote:
>>
>>> Hi,
>>>
>>> When we replicate between a remote database and a local one (pulling
>>> from remote into local), we are observing partial replications,
>>> meaning that we have to issue repeated _replicate calls for the
>>> replication to complete. For a database with 10,000 documents, for
>>> example, it could take up to 7 calls for the entire database to
>>> replicate into an empty one. Each time, the number of documents
>>> replicated over seemed random.
>>>
>>> Thanks.
>>
>> Hi, it's certainly not the expected behavior.  When the POST to  
>> _replicate
>> returns and not all documents have been replicated, what does the  
>> response
>> look like?  Is there anything in the target log indicating a  
>> crash?  Can you
>> be more specific about the versions you are using?
>>
>> Best, Adam
>>
>
> Nothing indicated a crash. We have 0.10.0a818506 on a Mac, and
> something very close on an Ubuntu (I'll find the exact version later).
>
> Here's the replication response as well as the interesting logs on the
> target machine. It seems to me that every (not all) partial
> replication process is associated with a corresponding entry in the
> log that says "recording a checkpoint at source update_seq .....".
> (i.e. you can match the recorded_seq number in the replication
> response with the checkpoint update_seq numbers in the log).
>
> {"session_id":"439d41bad454ea5d5dcb16a154800a23","start_time":"Wed, 23
> Sep 2009 18:07:33 GMT","end_time":"Wed, 23 Sep 2009 18:07:53
> GMT","start_last_seq":8663,"end_last_seq":17619,"recorded_seq": 
> 17619,"missing_checked":0,"missing_found":8952,"docs_read": 
> 8952,"docs_written":8952,"doc_write_failures":0}
> {"session_id":"f85e575614479547d70277d24bff2d51","start_time":"Wed, 23
> Sep 2009 18:07:12 GMT","end_time":"Wed, 23 Sep 2009 18:07:17
> GMT","start_last_seq":7710,"end_last_seq":8663,"recorded_seq": 
> 8663,"missing_checked":0,"missing_found":953,"docs_read": 
> 953,"docs_written":953,"doc_write_failures":0}
> {"session_id":"84dc053e810b8a46f19c95ef560d42d5","start_time":"Wed, 23
> Sep 2009 18:06:32 GMT","end_time":"Wed, 23 Sep 2009 18:06:37
> GMT","start_last_seq":7021,"end_last_seq":7710,"recorded_seq": 
> 7710,"missing_checked":0,"missing_found":689,"docs_read": 
> 689,"docs_written":689,"doc_write_failures":0}
> {"session_id":"e72b655988ecc26b85b412fcaf05018a","start_time":"Wed, 23
> Sep 2009 18:05:47 GMT","end_time":"Wed, 23 Sep 2009 18:05:52
> GMT","start_last_seq":5792,"end_last_seq":7021,"recorded_seq": 
> 7021,"missing_checked":0,"missing_found":1229,"docs_read": 
> 1229,"docs_written":1229,"doc_write_failures":0}
> {"session_id":"8fd5d827721e70a28735ad4c3a291c3f","start_time":"Wed, 23
> Sep 2009 18:05:30 GMT","end_time":"Wed, 23 Sep 2009 18:05:35
> GMT","start_last_seq":4875,"end_last_seq":5792,"recorded_seq": 
> 5792,"missing_checked":0,"missing_found":917,"docs_read": 
> 917,"docs_written":917,"doc_write_failures":0}
> {"session_id":"187faed013cb2b63b714aab7845e3f56","start_time":"Wed, 23
> Sep 2009 18:05:02 GMT","end_time":"Wed, 23 Sep 2009 18:05:07
> GMT","start_last_seq":4539,"end_last_seq":4875,"recorded_seq": 
> 4875,"missing_checked":0,"missing_found":336,"docs_read": 
> 336,"docs_written":336,"doc_write_failures":0}
> {"session_id":"e30ee09b3da0dd979d655382bc3dadc8","start_time":"Wed, 23
> Sep 2009 18:04:23 GMT","end_time":"Wed, 23 Sep 2009 18:04:34
> GMT","start_last_seq":1590,"end_last_seq":4539,"recorded_seq": 
> 4539,"missing_checked":0,"missing_found":2949,"docs_read": 
> 2949,"docs_written":2949,"doc_write_failures":0}
> {"session_id":"3486a3b8d8a1e5eee05b82dcf4c66153","start_time":"Wed, 23
> Sep 2009 18:02:17 GMT","end_time":"Wed, 23 Sep 2009 18:02:22
> GMT","start_last_seq":0,"end_last_seq":1590,"recorded_seq": 
> 1590,"missing_checked":0,"missing_found":1590,"docs_read": 
> 1590,"docs_written":1590,"doc_write_failures":0}
>
> [Wed, 23 Sep 2009 18:04:28 GMT] [info] [<0.1959.0>] recording a
> checkpoint at source update_seq 3632
>
> [Wed, 23 Sep 2009 18:04:34 GMT] [info] [<0.1959.0>] recording a
> checkpoint at source update_seq 4539
>
> [Wed, 23 Sep 2009 18:04:41 GMT] [info] [<0.1941.0>] 127.0.0.1 - -
> 'POST' /_replicate 200
>
> [Wed, 23 Sep 2009 18:05:02 GMT] [info] [<0.1941.0>] starting
> replication "9577548b0faafa46430af6d8b2898a47" at <0.4981.0>
>
> [Wed, 23 Sep 2009 18:05:07 GMT] [info] [<0.4981.0>] recording a
> checkpoint at source update_seq 4875
>
> [Wed, 23 Sep 2009 18:05:17 GMT] [info] [<0.1941.0>] 127.0.0.1 - -
> 'POST' /_replicate 200
>
> [Wed, 23 Sep 2009 18:05:30 GMT] [info] [<0.1941.0>] starting
> replication "9577548b0faafa46430af6d8b2898a47" at <0.5376.0>
>
> [Wed, 23 Sep 2009 18:05:35 GMT] [info] [<0.5376.0>] recording a
> checkpoint at source update_seq 5792
>
> [Wed, 23 Sep 2009 18:05:43 GMT] [info] [<0.1941.0>] 127.0.0.1 - -
> 'POST' /_replicate 200
>
> [Wed, 23 Sep 2009 18:05:47 GMT] [info] [<0.1941.0>] starting
> replication "9577548b0faafa46430af6d8b2898a47" at <0.6322.0>
>
> [Wed, 23 Sep 2009 18:05:52 GMT] [info] [<0.6322.0>] recording a
> checkpoint at source update_seq 7021
>
> [Wed, 23 Sep 2009 18:05:59 GMT] [info] [<0.1941.0>] 127.0.0.1 - -
> 'POST' /_replicate 200
>
> [Wed, 23 Sep 2009 18:06:32 GMT] [info] [<0.1945.0>] starting
> replication "9577548b0faafa46430af6d8b2898a47" at <0.7609.0>
>
> [Wed, 23 Sep 2009 18:06:37 GMT] [info] [<0.7609.0>] recording a
> checkpoint at source update_seq 7710
>
> [Wed, 23 Sep 2009 18:06:41 GMT] [info] [<0.1945.0>] 127.0.0.1 - -
> 'POST' /_replicate 200
>
> [Wed, 23 Sep 2009 18:07:12 GMT] [info] [<0.7608.0>] starting
> replication "9577548b0faafa46430af6d8b2898a47" at <0.8369.0>
>
> [Wed, 23 Sep 2009 18:07:17 GMT] [info] [<0.8369.0>] recording a
> checkpoint at source update_seq 8663
>
> [Wed, 23 Sep 2009 18:07:20 GMT] [info] [<0.7608.0>] 127.0.0.1 - -
> 'POST' /_replicate 200
>
> [Wed, 23 Sep 2009 18:07:23 GMT] [info] [<0.7608.0>] 127.0.0.1 - -
> 'GET' /_utils/image/delete-mini.png 304
>
> [Wed, 23 Sep 2009 18:07:33 GMT] [info] [<0.7608.0>] starting
> replication "9577548b0faafa46430af6d8b2898a47" at <0.9376.0>
>
> [Wed, 23 Sep 2009 18:07:38 GMT] [info] [<0.9376.0>] recording a
> checkpoint at source update_seq 10821
>
> [Wed, 23 Sep 2009 18:07:44 GMT] [info] [<0.9376.0>] recording a
> checkpoint at source update_seq 13507
>
> [Wed, 23 Sep 2009 18:07:50 GMT] [info] [<0.9376.0>] recording a
> checkpoint at source update_seq 16222
>
> [Wed, 23 Sep 2009 18:07:53 GMT] [info] [<0.9376.0>] recording a
> checkpoint at source update_seq 17619

Hmm, I must admit I'm stumped so far.  Are you by any chance building  
from SVN repeatedly and installing into the same prefix?  Please feel  
free to file a ticket in JIRA[1] so we don't forget about this.  You  
might try again with the log level on the target set to debug,  
although I'm not certain it will tell us anything.  I'll see if I can  
find a way to reproduce this.  Best,

Adam

[1]: https://issues.apache.org/jira/browse/COUCHDB
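
One possible way to raise the log level without editing files, sketched under the assumption that the target exposes the `_config` HTTP endpoint (CouchDB releases of this era did; check your build). The equivalent file setting would be `level = debug` in the `[log]` section of local.ini. `log_level_request` is a made-up helper name and the server URL is a placeholder.

```python
import json
import urllib.request


def log_level_request(server, level="debug"):
    """Build (but do not send) a PUT to /_config/log/level on `server`."""
    return urllib.request.Request(
        server.rstrip("/") + "/_config/log/level",
        data=json.dumps(level).encode("utf-8"),  # the value is sent as a JSON string
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = log_level_request("http://localhost:5984")  # placeholder target server
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To actually apply it: urllib.request.urlopen(req)
```

Remember to set the level back to `info` afterwards, since debug logging is verbose.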

Re: partial replications

Posted by Ning Tan <ni...@gmail.com>.

On Mon, Sep 28, 2009 at 2:41 PM, Adam Kocoloski <ko...@apache.org> wrote:
> On Sep 28, 2009, at 1:21 PM, Ning Tan wrote:
>
>> Hi,
>>
>> When we replicate between a remote database and a local one (pulling
>> from remote into local), we are observing partial replications,
>> meaning that we have to issue repeated _replicate calls for the
>> replication to complete. For a database with 10,000 documents, for
>> example, it could take up to 7 calls for the entire database to
>> replicate into an empty one. Each time, the number of documents
>> replicated over seemed random.
>>
>> Thanks.
>
> Hi, it's certainly not the expected behavior.  When the POST to _replicate
> returns and not all documents have been replicated, what does the response
> look like?  Is there anything in the target log indicating a crash?  Can you
> be more specific about the versions you are using?
>
> Best, Adam
>

Nothing indicated a crash. We have 0.10.0a818506 on a Mac, and
something very close on an Ubuntu machine (I'll find the exact version later).

Here's the replication response as well as the interesting logs on the
target machine. It seems to me that every (not all) partial
replication process is associated with a corresponding entry in the
log that says "recording a checkpoint at source update_seq .....".
(i.e. you can match the recorded_seq number in the replication
response with the checkpoint update_seq numbers in the log).

{"session_id":"439d41bad454ea5d5dcb16a154800a23","start_time":"Wed, 23 Sep 2009 18:07:33 GMT","end_time":"Wed, 23 Sep 2009 18:07:53 GMT","start_last_seq":8663,"end_last_seq":17619,"recorded_seq":17619,"missing_checked":0,"missing_found":8952,"docs_read":8952,"docs_written":8952,"doc_write_failures":0}
{"session_id":"f85e575614479547d70277d24bff2d51","start_time":"Wed, 23 Sep 2009 18:07:12 GMT","end_time":"Wed, 23 Sep 2009 18:07:17 GMT","start_last_seq":7710,"end_last_seq":8663,"recorded_seq":8663,"missing_checked":0,"missing_found":953,"docs_read":953,"docs_written":953,"doc_write_failures":0}
{"session_id":"84dc053e810b8a46f19c95ef560d42d5","start_time":"Wed, 23 Sep 2009 18:06:32 GMT","end_time":"Wed, 23 Sep 2009 18:06:37 GMT","start_last_seq":7021,"end_last_seq":7710,"recorded_seq":7710,"missing_checked":0,"missing_found":689,"docs_read":689,"docs_written":689,"doc_write_failures":0}
{"session_id":"e72b655988ecc26b85b412fcaf05018a","start_time":"Wed, 23 Sep 2009 18:05:47 GMT","end_time":"Wed, 23 Sep 2009 18:05:52 GMT","start_last_seq":5792,"end_last_seq":7021,"recorded_seq":7021,"missing_checked":0,"missing_found":1229,"docs_read":1229,"docs_written":1229,"doc_write_failures":0}
{"session_id":"8fd5d827721e70a28735ad4c3a291c3f","start_time":"Wed, 23 Sep 2009 18:05:30 GMT","end_time":"Wed, 23 Sep 2009 18:05:35 GMT","start_last_seq":4875,"end_last_seq":5792,"recorded_seq":5792,"missing_checked":0,"missing_found":917,"docs_read":917,"docs_written":917,"doc_write_failures":0}
{"session_id":"187faed013cb2b63b714aab7845e3f56","start_time":"Wed, 23 Sep 2009 18:05:02 GMT","end_time":"Wed, 23 Sep 2009 18:05:07 GMT","start_last_seq":4539,"end_last_seq":4875,"recorded_seq":4875,"missing_checked":0,"missing_found":336,"docs_read":336,"docs_written":336,"doc_write_failures":0}
{"session_id":"e30ee09b3da0dd979d655382bc3dadc8","start_time":"Wed, 23 Sep 2009 18:04:23 GMT","end_time":"Wed, 23 Sep 2009 18:04:34 GMT","start_last_seq":1590,"end_last_seq":4539,"recorded_seq":4539,"missing_checked":0,"missing_found":2949,"docs_read":2949,"docs_written":2949,"doc_write_failures":0}
{"session_id":"3486a3b8d8a1e5eee05b82dcf4c66153","start_time":"Wed, 23 Sep 2009 18:02:17 GMT","end_time":"Wed, 23 Sep 2009 18:02:22 GMT","start_last_seq":0,"end_last_seq":1590,"recorded_seq":1590,"missing_checked":0,"missing_found":1590,"docs_read":1590,"docs_written":1590,"doc_write_failures":0}
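
A quick way to cross-check the responses above, as a sketch: the tuples copy `start_last_seq`, `end_last_seq` and `docs_written` from each session, reordered oldest first, and the helper names are made up for illustration. It checks that every session picks up at the sequence where the previous one stopped and totals the documents written.

```python
# (start_last_seq, end_last_seq, docs_written) for each session quoted above,
# listed oldest first.
SESSIONS = [
    (0, 1590, 1590),
    (1590, 4539, 2949),
    (4539, 4875, 336),
    (4875, 5792, 917),
    (5792, 7021, 1229),
    (7021, 7710, 689),
    (7710, 8663, 953),
    (8663, 17619, 8952),
]


def is_contiguous(sessions):
    """True if every session starts at the sequence where the previous one ended."""
    return all(prev[1] == cur[0] for prev, cur in zip(sessions, sessions[1:]))


def summarize(sessions):
    """Return (total documents written, final source sequence reached)."""
    total_docs = sum(docs for _, _, docs in sessions)
    return total_docs, sessions[-1][1]


if __name__ == "__main__":
    print(is_contiguous(SESSIONS))  # True
    print(summarize(SESSIONS))      # (17615, 17619)
```

So the sessions chain together with no gaps between them, which suggests each call stopped early rather than skipping a range.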

[Wed, 23 Sep 2009 18:04:28 GMT] [info] [<0.1959.0>] recording a
checkpoint at source update_seq 3632

[Wed, 23 Sep 2009 18:04:34 GMT] [info] [<0.1959.0>] recording a
checkpoint at source update_seq 4539

[Wed, 23 Sep 2009 18:04:41 GMT] [info] [<0.1941.0>] 127.0.0.1 - -
'POST' /_replicate 200

[Wed, 23 Sep 2009 18:05:02 GMT] [info] [<0.1941.0>] starting
replication "9577548b0faafa46430af6d8b2898a47" at <0.4981.0>

[Wed, 23 Sep 2009 18:05:07 GMT] [info] [<0.4981.0>] recording a
checkpoint at source update_seq 4875

[Wed, 23 Sep 2009 18:05:17 GMT] [info] [<0.1941.0>] 127.0.0.1 - -
'POST' /_replicate 200

[Wed, 23 Sep 2009 18:05:30 GMT] [info] [<0.1941.0>] starting
replication "9577548b0faafa46430af6d8b2898a47" at <0.5376.0>

[Wed, 23 Sep 2009 18:05:35 GMT] [info] [<0.5376.0>] recording a
checkpoint at source update_seq 5792

[Wed, 23 Sep 2009 18:05:43 GMT] [info] [<0.1941.0>] 127.0.0.1 - -
'POST' /_replicate 200

[Wed, 23 Sep 2009 18:05:47 GMT] [info] [<0.1941.0>] starting
replication "9577548b0faafa46430af6d8b2898a47" at <0.6322.0>

[Wed, 23 Sep 2009 18:05:52 GMT] [info] [<0.6322.0>] recording a
checkpoint at source update_seq 7021

[Wed, 23 Sep 2009 18:05:59 GMT] [info] [<0.1941.0>] 127.0.0.1 - -
'POST' /_replicate 200

[Wed, 23 Sep 2009 18:06:32 GMT] [info] [<0.1945.0>] starting
replication "9577548b0faafa46430af6d8b2898a47" at <0.7609.0>

[Wed, 23 Sep 2009 18:06:37 GMT] [info] [<0.7609.0>] recording a
checkpoint at source update_seq 7710

[Wed, 23 Sep 2009 18:06:41 GMT] [info] [<0.1945.0>] 127.0.0.1 - -
'POST' /_replicate 200

[Wed, 23 Sep 2009 18:07:12 GMT] [info] [<0.7608.0>] starting
replication "9577548b0faafa46430af6d8b2898a47" at <0.8369.0>

[Wed, 23 Sep 2009 18:07:17 GMT] [info] [<0.8369.0>] recording a
checkpoint at source update_seq 8663

[Wed, 23 Sep 2009 18:07:20 GMT] [info] [<0.7608.0>] 127.0.0.1 - -
'POST' /_replicate 200

[Wed, 23 Sep 2009 18:07:23 GMT] [info] [<0.7608.0>] 127.0.0.1 - -
'GET' /_utils/image/delete-mini.png 304

[Wed, 23 Sep 2009 18:07:33 GMT] [info] [<0.7608.0>] starting
replication "9577548b0faafa46430af6d8b2898a47" at <0.9376.0>

[Wed, 23 Sep 2009 18:07:38 GMT] [info] [<0.9376.0>] recording a
checkpoint at source update_seq 10821

[Wed, 23 Sep 2009 18:07:44 GMT] [info] [<0.9376.0>] recording a
checkpoint at source update_seq 13507

[Wed, 23 Sep 2009 18:07:50 GMT] [info] [<0.9376.0>] recording a
checkpoint at source update_seq 16222

[Wed, 23 Sep 2009 18:07:53 GMT] [info] [<0.9376.0>] recording a
checkpoint at source update_seq 17619
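
The matching of `recorded_seq` values against checkpoint lines can be done mechanically. A minimal sketch, assuming the log wording shown above (wrapped lines included); `checkpoint_seqs` and `unmatched` are made-up helper names, and `SAMPLE` is just the first two checkpoint entries from this log.

```python
import re

# Matches the checkpoint message even when the mail client wrapped the line.
CHECKPOINT = re.compile(r"recording a\s+checkpoint at source update_seq (\d+)")

SAMPLE = """[Wed, 23 Sep 2009 18:04:28 GMT] [info] [<0.1959.0>] recording a
checkpoint at source update_seq 3632

[Wed, 23 Sep 2009 18:04:34 GMT] [info] [<0.1959.0>] recording a
checkpoint at source update_seq 4539
"""


def checkpoint_seqs(log_text):
    """Return every checkpointed update_seq in the order it appears in the log."""
    return [int(seq) for seq in CHECKPOINT.findall(log_text)]


def unmatched(recorded_seqs, log_text):
    """Return the recorded_seq values that have no matching checkpoint line."""
    seen = set(checkpoint_seqs(log_text))
    return [seq for seq in recorded_seqs if seq not in seen]


if __name__ == "__main__":
    print(checkpoint_seqs(SAMPLE))          # [3632, 4539]
    print(unmatched([1590, 4539], SAMPLE))  # [1590]
```

Feeding it the full target log and the `recorded_seq` values from the responses would list any session whose final checkpoint is missing from the log.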

Re: partial replications

Posted by Adam Kocoloski <ko...@apache.org>.

On Sep 28, 2009, at 1:21 PM, Ning Tan wrote:

> Hi,
>
> When we replicate between a remote database and a local one (pulling
> from remote into local), we are observing partial replications,
> meaning that we have to issue repeated _replicate calls for the
> replication to complete. For a database with 10,000 documents, for
> example, it could take up to 7 calls for the entire database to
> replicate into an empty one. Each time, the number of documents
> replicated over seemed random.
>
> The use case is very simple--replicating a database into an empty one
> with no concurrent writes, no additional load or i/o, etc.
>
> Has anybody else seen this, and/or is this expected behavior?
>
> The databases involved are a mixture of the 0.10.x code base natively
> built on Ubuntu and Mac.  I can get more detailed information about
> the environment if needed.
>
> Thanks.

Hi, it's certainly not the expected behavior.  When the POST to  
_replicate returns and not all documents have been replicated, what  
does the response look like?  Is there anything in the target log  
indicating a crash?  Can you be more specific about the versions you  
are using?

Best, Adam