Posted to user@couchdb.apache.org by Stephen Bartell <sn...@gmail.com> on 2013/03/14 10:02:40 UTC

replicating docs with tons of conflicts

Hi all, 

tl;dr: I've got a database with just a couple of docs.  Conflict management went unchecked and these docs have thousands of conflicts each.  Replication fails.  Couch consumes all the server's cpu.

First the story, then the questions.  Please bear with me!

I wanted to replicate this database to another, new database.  So I started the replication.  beam.smp took 100% of my cpu and the replicator status held steady at a constant percent for quite a while.  It eventually finished.

I thought maybe I should handle the conflicts and then replicate.  Hopefully it'll go faster next time.  So I cleared all the conflicts.  I replicated again, but this time I could not get anything to replicate.  Again, cpu held steady, topped out.  I eventually restarted couch.

I dug through the logs and saw that the POSTs were failing.  I figure the replicator was timing out when trying to post to couch.

I have a replicator that I've been working on that's written in node.js.  So I started that one up to do the same thing.  I drew inspiration from PouchDB's replicator and from Jens Alfke's amazing replication algorithm documentation, so my replicator follows more or less the same story.  1) consume _changes with style=all_docs.  2) revs_diff on the target database.  3) get each revision from the source with revs=true.  4) bulk post with new_edits=false.
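
For reference, the skeleton of those four steps looks roughly like this.  It's only a sketch: SOURCE and TARGET are placeholder URLs, it assumes Node 18+ for the built-in fetch, and there's no batching or error handling.

    // 1) read every leaf revision from the source's _changes feed
    // 2) ask the target which revisions it is missing (_revs_diff)
    // 3) fetch each missing revision with its ancestry (revs=true)
    // 4) write them to the target verbatim (_bulk_docs, new_edits=false)
    const SOURCE = 'http://localhost:5984/source_db';   // placeholder
    const TARGET = 'http://localhost:5984/target_db';   // placeholder

    async function replicateOnce() {
      const changes = await (await fetch(`${SOURCE}/_changes?style=all_docs`)).json();

      // { docid: [rev, rev, ...] } for _revs_diff
      const revMap = {};
      for (const row of changes.results) {
        revMap[row.id] = row.changes.map(c => c.rev);
      }

      const diff = await (await fetch(`${TARGET}/_revs_diff`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(revMap)
      })).json();

      const docs = [];
      for (const [id, { missing }] of Object.entries(diff)) {
        for (const rev of missing) {
          const doc = await (await fetch(
            `${SOURCE}/${encodeURIComponent(id)}?rev=${rev}&revs=true`
          )).json();
          docs.push(doc);
        }
      }

      await fetch(`${TARGET}/_bulk_docs`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ docs, new_edits: false })
      });
    }

    replicateOnce();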

Same thing.  Except now I can kind of make sense of what's going on.  Sucking the data out of the source is no problem.  Diffing the revs against the target is no problem.  Posting the docs is THE problem.  Since the target database is clean, thousands of revisions are being thrown at couch at once to build up the revision trees.  Couch is just taking forever to finish the job.  It doesn't matter if I bulk post the docs or post them individually, couch sucks 100% of my cpu every time and takes forever to finish. (I actually never let it finish.)

So that is the story.  Here are my questions.

1) Has anyone else stepped on this mine?  If so, could I get pointed towards some workarounds?  I don't think it is right to assume that users of couchdb will never have databases with huge conflict sausages like this, so simply saying "manage your conflicts" won't help.

2) Let's say I did manage my conflicts.  I still have the _deleted_conflicts sausage.  I know that _deleted docs and _deleted_conflicts must be replicated to maintain consistency across the cluster.  If the replicator throws up when these huge sausages come through, how is the data ever going to replicate?  Is there a trade secret I don't know about?

3) Is there any limit on the resources that CouchDB is allowed to consume?  I get that we run into these cases where there's tons of data to move and it's just going to take a hell of a long time.  But I don't get why it's permissible for CouchDB to eat all my cpu.  The whole server should never grind to a halt because it's moving lots of data.  I feel like it should be like the little train that could.  Just chug along slow and steady until it crests the hill.

I would really like to rely on the erlang replicator, but I can't.  At least with the replicator I wrote I have a chance of throttling the posts so CouchDB doesn't render my server useless.
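
By throttling I don't mean anything clever.  Something along these lines is what I have in mind (a sketch only; the batch size and pause are arbitrary knobs of my own, not CouchDB settings):

    // Post revisions to the target in small batches, pausing between batches
    // so CouchDB isn't handed everything at once. Node 18+ for fetch.
    const BATCH_SIZE = 50;
    const PAUSE_MS = 250;
    const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

    async function throttledBulkPost(target, docs) {
      for (let i = 0; i < docs.length; i += BATCH_SIZE) {
        const batch = docs.slice(i, i + BATCH_SIZE);
        await fetch(`${target}/_bulk_docs`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ docs: batch, new_edits: false })
        });
        await sleep(PAUSE_MS);  // give the server room to breathe between batches
      }
    }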

Sorry for wrapping more questions into those questions.  I'm pretty tired, stumped, and have machines in production crumbling.

Best, 
Stephen

Re: replicating docs with tons of conflicts

Posted by Riyad Kalla <rk...@gmail.com>.
Robert,
Assuming you are perfectly happy with the state of all docs and don't want
them to retain any of their conflict information, wouldn't that require some
program to read all the docs (current rev) out of Couch1 and then write
them into Couch2 (an empty server with no replication set up)?  Then once all
docs have been written to Couch2, wipe Couch1, set up replication, and allow
2 to replicate to 1 with no conflicts on any of the docs?
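
In code terms I imagine something like the following (the Couch1/Couch2 URLs are placeholders, Node 18+ fetch is assumed, and there's no paging or error handling):

    // Read only the winning revision of every doc out of Couch1 and write
    // plain copies (no revision history) into an empty Couch2 database.
    // Deleted docs are skipped automatically: _all_docs does not list them.
    const COUCH1 = 'http://localhost:5984/olddb';   // placeholder
    const COUCH2 = 'http://localhost:5984/newdb';   // placeholder

    async function copyCurrentRevs() {
      const { rows } = await (await fetch(`${COUCH1}/_all_docs?include_docs=true`)).json();
      const docs = rows.map(({ doc }) => {
        const { _rev, ...rest } = doc;   // drop the old revision id
        return rest;
      });
      await fetch(`${COUCH2}/_bulk_docs`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ docs })   // new_edits defaults to true: fresh rev trees
      });
    }

    copyCurrentRevs();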


On Thu, Mar 14, 2013 at 11:44 AM, Robert Newson <rn...@apache.org> wrote:

> Conflicts are *not* removed during compaction, CouchDB has no way of
> knowing which ones it would be ok to delete.
>
> CouchDB does struggle to process documents with lots of conflicts,
> we've encountered this at Cloudant a fair bunch. We resolve the
> conflicts via http if possible or, if that consistently fails, with a
> direct erlang manipulation. It's certainly something we need to
> improve.
>
> B.
>

Re: replicating docs with tons of conflicts

Posted by Adam Kocoloski <ko...@apache.org>.
On Mar 15, 2013, at 1:40 PM, Stephen Bartell <sn...@gmail.com> wrote:

> 
> On Mar 14, 2013, at 3:36 PM, Robert Newson <rn...@apache.org> wrote:
> 
>> Runaway processes are the very devil but the problem is not specific
>> to CouchDB, there is no CouchDB mechanism for this just as there's no
>> bash/python/ruby/perl method to limit a while(true){} loop.
> 
> Totally makes sense.

The folks at Silverline pitch an idea about "application containers" to try to manage this sort of situation.  I've never used it myself, but the tech always sounded neat:

https://silverline.librato.com/promo/application_management

>> 
>> Highly conflicted documents are painful to update and read. I can't do
>> anything about that today.
> 
> Thanks for your feedback!

We've talked about this a bit internally at Cloudant.  Perhaps a more appropriate discussion for dev@, but I think there are possible enhancements one could make to CouchDB's handling of deleted edit branches that would allow the server to prune them automatically once it knows that all of its prior replication peers have received the tombstone at the end of the branch.  In a multi-master scenario you do run the risk of re-vivifying part of the branch when the other side pushes edits back to you, but I think it's a risk that most folks who have been subjected to the pain of heavily-conflicted documents would be willing to take.

Adam

Re: replicating docs with tons of conflicts

Posted by Stephen Bartell <sn...@gmail.com>.
On Mar 14, 2013, at 3:36 PM, Robert Newson <rn...@apache.org> wrote:

> Runaway processes are the very devil but the problem is not specific
> to CouchDB, there is no CouchDB mechanism for this just as there's no
> bash/python/ruby/perl method to limit a while(true){} loop.

Totally makes sense.

> 
> Highly conflicted documents are painful to update and read. I can't do
> anything about that today.

Thanks for your feedback!


Re: replicating docs with tons of conflicts

Posted by svilen <az...@svilendobrev.com>.
Not really my domain, but one might play with something like unix "nice"
- that is, process priority - but you cannot guess when a process is to
be suppressed and when not, unless you put lots of effort into it:
measuring heartbeats, responsiveness, etc.

svil

On Thu, 14 Mar 2013 17:36:56 -0500
Robert Newson <rn...@apache.org> wrote:

> Runaway processes are the very devil but the problem is not specific
> to CouchDB, there is no CouchDB mechanism for this just as there's no
> bash/python/ruby/perl method to limit a while(true){} loop.
> 
> Highly conflicted documents are painful to update and read. I can't do
> anything about that today.
> 
> B.

Re: replicating docs with tons of conflicts

Posted by Robert Newson <rn...@apache.org>.
Runaway processes are the very devil, but the problem is not specific
to CouchDB; there is no CouchDB mechanism for this, just as there's no
bash/python/ruby/perl method to limit a while(true){} loop.

Highly conflicted documents are painful to update and read. I can't do
anything about that today.

B.


Re: replicating docs with tons of conflicts

Posted by Stephen Bartell <sn...@gmail.com>.
Robert, this only works if I don't need to keep those docs around anymore.  In my case, I want to keep the docs.  I don't want to keep the conflicts of the docs.  Most importantly though, even if I delete all the conflicts on all my docs, I still have the problem of _deleted_conflicts.  What I've seen is that only a few docs with a few thousand _deleted_conflicts each will plug up Couch and render it unusable.  You can't get rid of them through natural means.

This is what Riyad was bringing up and what I've implemented.  I have a program which replicates from the troubled database's _changes with the query param style=main_only.  This allows me to still have the revision tree of the troubled database, but without the _deleted_conflicts.  I can then wipe out the troubled db, recreate it, and replicate the shiny clean data back into it.
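
The rough shape of that program, for anyone interested.  This is a sketch rather than the real thing: the URLs are placeholders, Node 18+ fetch is assumed, and there's no error handling.

    // Pull only the winning branch of each doc from the troubled db and write
    // it into a fresh db with new_edits=false so that single branch of the
    // rev tree survives, minus all the conflict/_deleted_conflicts baggage.
    const TROUBLED = 'http://localhost:5984/troubled_db';   // placeholder
    const CLEAN = 'http://localhost:5984/clean_db';         // placeholder

    async function copyWinningBranches() {
      // style=main_only => one change entry per doc, winning rev only
      const changes = await (await fetch(`${TROUBLED}/_changes?style=main_only`)).json();
      const docs = [];
      for (const row of changes.results) {
        const rev = row.changes[0].rev;
        // revs=true keeps the ancestry of the winning branch, nothing else
        const doc = await (await fetch(
          `${TROUBLED}/${encodeURIComponent(row.id)}?rev=${rev}&revs=true`
        )).json();
        docs.push(doc);
      }
      await fetch(`${CLEAN}/_bulk_docs`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ docs, new_edits: false })
      });
    }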

This is unnatural and requires custom code to make happen.  I can live with it until a better solution comes around.

What I'm really concerned about is how couchdb eats all my cpu.

Is there any way to ration the resources that couchdb uses?  Like tell it not to use more than 50% or something.  I think that couch eating all the resources on a machine just because it's reading loads of data is a bug.  Is this a reasonable conclusion?


Re: replicating docs with tons of conflicts

Posted by Robert Newson <rn...@apache.org>.
One trick: you can delete the doc and replicate with a filter like
'return !doc._deleted;' that blocks all deletes.  The target db will
then not receive any trace of these highly conflicted docs.
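
Roughly, for illustration (the db names and the design doc name are placeholders, Node 18+ fetch is assumed):

    // Install a filter that rejects deleted docs on the source db, then kick
    // off a one-shot replication through it. Deleted (and deleted-conflicted)
    // docs never reach the target.
    const COUCH = 'http://localhost:5984';   // placeholder

    async function replicateWithoutDeletes() {
      await fetch(`${COUCH}/source_db/_design/repl`, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          filters: {
            no_deletes: 'function(doc, req) { return !doc._deleted; }'
          }
        })
      });

      await fetch(`${COUCH}/_replicate`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          source: 'source_db',
          target: 'target_db',
          create_target: true,
          filter: 'repl/no_deletes'
        })
      });
    }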


Re: replicating docs with tons of conflicts

Posted by Stephen Bartell <sn...@gmail.com>.
On Mar 14, 2013, at 11:44 AM, Robert Newson <rn...@apache.org> wrote:

> Conflicts are *not* removed during compaction, CouchDB has no way of
> knowing which ones it would be ok to delete.

Yep, they need to be deleted in the context of the person/process manipulating the docs.

> 
> CouchDB does struggle to process documents with lots of conflicts,
> we've encountered this at Cloudant a fair bunch. We resolve the
> conflicts via http if possible or, if that consistently fails, with a
> direct erlang manipulation. It's certainly something we need to
> improve.
> 

But even deleting them yields the same problem.  When replicating, the _deleted_conflicts is carried over.
Users could be diligent in deleting conflicts, but still end up unable to replicate their docs because of the volume of _deleted_conflicts.

Robert, thanks for chiming in.  I feel better knowing I'm in good company with this problem. When this mine eventually goes off, couchdb is rendered useless because beam.smp takes all the cpu.  Is there any way to ration the resources couchdb consumes?


Re: replicating docs with tons of conflicts

Posted by Robert Newson <rn...@apache.org>.
Conflicts are *not* removed during compaction, CouchDB has no way of
knowing which ones it would be ok to delete.

CouchDB does struggle to process documents with lots of conflicts,
we've encountered this at Cloudant a fair bunch. We resolve the
conflicts via http if possible or, if that consistently fails, with a
direct erlang manipulation. It's certainly something we need to
improve.
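
The http route is nothing exotic; roughly the following (the db and doc id are placeholders, Node 18+ fetch is assumed, and choosing which revision should win is left out):

    // Resolve a conflicted doc over HTTP: read the losing leaf revisions via
    // ?conflicts=true, then tombstone them all in one _bulk_docs call. The
    // current winning revision is left untouched.
    const DB = 'http://localhost:5984/mydb';   // placeholder
    const DOC_ID = 'some_doc';                 // placeholder

    async function resolveConflicts() {
      const doc = await (await fetch(
        `${DB}/${encodeURIComponent(DOC_ID)}?conflicts=true`
      )).json();

      const losers = doc._conflicts || [];
      if (losers.length === 0) return;

      const deletions = losers.map(rev => ({ _id: DOC_ID, _rev: rev, _deleted: true }));
      await fetch(`${DB}/_bulk_docs`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ docs: deletions })
      });
    }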

B.


Re: replicating docs with tons of conflicts

Posted by Riyad Kalla <rk...@gmail.com>.
Stephen,
I am probably wrong here (someone hop in and correct me), but I thought
Compaction would remove the old revisions (and conflicts) of docs.

Alternatively, a question for the Couch devs: if Stephen set _revs_limit to
something artificially low, say 1, and restarted couch and did a compaction,
would that force the DB to smash down the datastore to 1 rev per doc and
remove the long tail off these docs?

REF: http://wiki.apache.org/couchdb/Compaction
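
The knobs I mean are just these two endpoints; whether they would actually prune the conflicts is exactly my question (the db URL is a placeholder, Node 18+ fetch is assumed):

    // Lower the number of revisions tracked per branch, then trigger
    // compaction (which runs in the background).
    const DB = 'http://localhost:5984/mydb';   // placeholder

    async function shrinkRevHistory() {
      await fetch(`${DB}/_revs_limit`, { method: 'PUT', body: '1' });

      await fetch(`${DB}/_compact`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' }
      });
    }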
