Posted to user@couchdb.apache.org by Nicolas Peeters <ni...@gmail.com> on 2013/03/07 08:19:22 UTC

CouchDB compaction not catching up.

Hi CouchDB Users,

*Disclaimer: I'm very aware that the use case is definitely not the best
for CouchDB, but for now, we have to deal with it.*

*Scenario:*

We have a fairly large (~750 GB) CouchDB (1.2.0) database that is being used
for transactional logs (very write-heavy; bad idea/design, I know, but
that's beside the point of this question - we're looking at alternative
designs). Once in a while we delete some of the records in large batches,
and we have scheduled auto compaction, checking every 2 hours.

This is the compaction config:

[image: Inline image 1]
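
In case the inline image doesn't come through on the list: it's the stock
compaction daemon settings in local.ini, along these lines (the 2-hour
check interval is our real setting; the fragmentation thresholds below are
illustrative, not necessarily our exact values):

    [compaction_daemon]
    ; how often (in seconds) the daemon checks fragmentation
    check_interval = 7200
    ; databases smaller than this (in bytes) are skipped
    min_file_size = 131072

    [compactions]
    ; compact once a database or its views pass ~70% fragmentation
    _default = [{db_fragmentation, "70%"}, {view_fragmentation, "70%"}]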

From what I can see, the DB is being hammered significantly every 12 hours,
and compaction sometimes takes 24 hours (with ~100 GB of log data) and
sometimes much longer (with up to 500 GB).

We run on EC2, on large instances with EBS: no striping (yet), no
provisioned IOPS. We tried fatter machines, but the improvement was really
minimal.


*The problem:*

The problem is that compaction takes a very long time (e.g. 12h+) and
reduces the performance of the entire stack. The main issue seems to be
that it's hard for the compaction process to "keep up" with the insertions,
which is why it takes so long. The view compaction also takes a long time
(the view is sometimes 100 GB), and during re-compaction of the view,
clients don't get a response, which blocks their processes.

[image: Inline image 2]

The view compaction takes approx. 8 hours, so view indexing is slower, and
in the time it takes the view to index, another 300k insertions have come
in (it never catches up). The only way to solve the problem was to throttle
the number of inserts from the app itself; eventually the view compaction
then completed. Had we continued to insert at the same rate, it would not
have finished (and ultimately, we would have run out of disk space).
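
For what it's worth, the throttling we did by hand would look roughly like
this if scripted against CouchDB's stock /_active_tasks endpoint - just a
sketch; the host, database name and sleep interval are made up:

    import json
    import time
    import requests  # third-party HTTP library

    COUCH = "http://localhost:5984"

    def compaction_running():
        # /_active_tasks reports running compactions with a "type" field
        tasks = requests.get(COUCH + "/_active_tasks").json()
        return any(t.get("type") in ("database_compaction", "view_compaction")
                   for t in tasks)

    def insert_throttled(db, doc):
        # hold writers back while a compaction is in flight, so that it
        # has a chance to catch up with the insert stream
        while compaction_running():
            time.sleep(5)
        requests.post("%s/%s" % (COUCH, db), data=json.dumps(doc),
                      headers={"Content-Type": "application/json"})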

Any recommendations for setting this up on EC2 are welcome. Suggested
configuration settings for the compaction would also be helpful.

Thanks.

Nicolas

PS: We are happily using CouchDB for other (more traditional) use cases
where it works very well.

Re: CouchDB compaction not catching up.

Posted by Riyad Kalla <rk...@gmail.com>.
Will be very curious how you end up solving this, please keep us posted!

Re: CouchDB compaction not catching up.

Posted by Nicolas Peeters <ni...@gmail.com>.
See my answers inline. I know there are all kinds of possible workarounds,
and it seems that this is actually not such a big problem for other users.
Maybe this "extreme" case does warrant more practical workarounds.

On Thu, Mar 7, 2013 at 4:12 PM, Riyad Kalla <rk...@gmail.com> wrote:

> To Simon's point, exactly where I was headed. Your issue is that
> compaction cannot catch up due to write velocity, so you need to avoid
> compaction (and, by extension, replication, since the issue is that your
> background writes cannot catch up). The only way to do that is some
> working model where you simply discard the data file when done and
> start anew.
>
>
Indeed. Unless the file actually gets so big that you can't possibly do
anything with it. But then again, that is maybe a design issue in the
amount of stuff being logged.


> You mentioned clearing a few hundred records at a time after a tx
> completes, so it sounds like over the period of a week you should be
> turning over your entire data set completely, right?
>

Typically, yes.

>
> I wonder if there could be a solution here, like fronting a few CouchDB
> instances with nginx and using a cron job: on day 5 or 7, flip
> inbound traffic to a hot (empty) standby, process the remaining data
> off the old master and then clear it out, while writes are directed to
> the new master for the next week?
>

Wow. That's an impressive workaround, but it would indeed work. I'd prefer
using standard features (ones that can also easily be driven by a web app
or similar, which is our case).

> Again, this only makes sense depending on data usage and if the
> pending data off the slave would need to stay accessible to a front
> end like search. Ultimately what I am suggesting here is a solution
> where you always have a CouchDB instance to write logs to, but you are
> never trying to compact, which would require some clever juggling
> between instances.
>
> Alternatively... Your problem is write performance; I would be curious
> if provisioned-IOPS volumes would cure this for you right out of the
> box with no engineering work.
>
> Longer term? Probably check out AWS Redshift.
>

At the moment we're looking at an alternative, which is to use Logstash and
write either to files and/or stream to ElasticSearch. Deletion would be
achieved by deleting a whole "index" in bulk (a bit like the solution
mentioned above). We'll keep CouchDB for the "important" logs, and the
transaction logs are possibly going to be dealt with in a different way.
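
The nice property there is that dropping a whole day of logs becomes a
single index delete - a sketch, with the host and Logstash's default daily
index naming shown for illustration:

    import requests

    # one call drops the whole day; no compaction needed afterwards
    requests.delete("http://localhost:9200/logstash-2013.03.01")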



Re: CouchDB compaction not catching up.

Posted by Riyad Kalla <rk...@gmail.com>.
To Simon's point, exactly where I was headed. Your issue is that
compaction cannot catch up due to write velocity, so you need to avoid
compaction (and, by extension, replication, since the issue is that your
background writes cannot catch up). The only way to do that is some
working model where you simply discard the data file when done and
start anew.

You mentioned clearing a few hundred records at a time after a tx
completes, so it sounds like over the period of a week you should be
turning over your entire data set completely, right?

I wonder if there could be a solution here, like fronting a few CouchDB
instances with nginx and using a cron job: on day 5 or 7, flip inbound
traffic to a hot (empty) standby, process the remaining data off the old
master and then clear it out, while writes are directed to the new master
for the next week?
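
The shape of it, as a rough sketch (all names are made up; the two
placeholder functions stand in for the nginx flip and the leftover
processing):

    import requests

    COUCH = "http://admin:secret@localhost:5984"

    def point_writers_at(db):
        # placeholder: flip the nginx upstream / app config to the new db
        pass

    def drain_remaining_work(db):
        # placeholder: process whatever is still pending in the old db
        pass

    def rotate(old_db, new_db):
        requests.put("%s/%s" % (COUCH, new_db))     # create the empty standby
        point_writers_at(new_db)                    # flip inbound traffic
        drain_remaining_work(old_db)                # finish off the old data
        requests.delete("%s/%s" % (COUCH, old_db))  # drop the whole file at once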

Again, this only makes sense depending on data usage and if the
pending data off the slave would need to stay accessible to a front
end like search. Ultimately what I am suggesting here is a solution
where you always have a CouchDB instance to write logs to, but you are
never trying to compact, which would require some clever juggling
between instances.

Alternatively... Your problem is write performance; I would be curious
if provisioned-IOPS volumes would cure this for you right out of the box
with no engineering work.

Longer term? Probably check out AWS Redshift.


Re: CouchDB compaction not catching up.

Posted by Nicolas Peeters <ni...@gmail.com>.
Simon,

That's actually a very good suggestion, and we actually implemented it (we
had one DB per "process"). The problem was that the size of the DB
sometimes outgrew our disks (1 TB!) (and sometimes we needed to keep the
data around for longer periods), so we discarded that approach in the end.

This is, however, a workaround, and the main question was about the
compaction not catching up (which may be a problem in other cases as well).


On Thu, Mar 7, 2013 at 9:58 AM, Simon Metson <si...@cloudant.com> wrote:

> What about making a database per day/week and dropping the whole lot in
> one go?

Re: CouchDB compaction not catching up.

Posted by Simon Metson <si...@cloudant.com>.
What about making a database per day/week and dropping the whole lot in one go? 
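
Roughly like this, as a sketch (names made up) - deleting a database
reclaims all of its space in one shot, with no compaction involved:

    import datetime
    import requests

    COUCH = "http://admin:secret@localhost:5984"

    def weekly_db(day):
        # e.g. "txlogs_2013_w10" for the ISO week containing `day`
        year, week, _ = day.isocalendar()
        return "txlogs_%d_w%02d" % (year, week)

    # writers target weekly_db(datetime.date.today()); a weekly cron then:
    last_week = datetime.date.today() - datetime.timedelta(weeks=1)
    requests.delete("%s/%s" % (COUCH, weekly_db(last_week)))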


Re: CouchDB compaction not catching up.

Posted by Nicolas Peeters <ni...@gmail.com>.
So the use case is a kind of transactional log associated with a
long-running process (~1 day). For each process, a few hundred thousand
lines of "logging" are inserted. When the process has completed (user
approval), we would like to delete all the associated "logs". Marking items
as deleted is not really the issue; recovering the space is.

The data should ideally be available for up to a week or so.
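
To be concrete about the "marking as deleted" part, the cleanup is
essentially a _bulk_docs call like the sketch below (it assumes a
hypothetical "by_process" view that emits each log doc's _rev keyed by
process id). Each doc just gains a _deleted tombstone revision; the disk
space only comes back when compaction runs.

    import json
    import requests

    COUCH = "http://localhost:5984"
    DB = "txlogs"

    def delete_logs(process_id):
        # id/rev pairs for one completed process, via the hypothetical view
        rows = requests.get(
            "%s/%s/_design/logs/_view/by_process" % (COUCH, DB),
            params={"key": json.dumps(process_id)}).json()["rows"]
        docs = [{"_id": r["id"], "_rev": r["value"], "_deleted": True}
                for r in rows]
        # writes tombstones only; space is reclaimed by compaction later
        requests.post("%s/%s/_bulk_docs" % (COUCH, DB),
                      data=json.dumps({"docs": docs}),
                      headers={"Content-Type": "application/json"})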



Re: CouchDB compaction not catching up.

Posted by Riyad Kalla <rk...@gmail.com>.
Nicolas,
Can you provide some insight into how you decide which large batches of
records to delete, and roughly how big (MB/GB-wise) those batches are? What
is the required longevity of this tx information in this Couch store? Is
this just temporary storage, or is this the system of record and what you
are deleting in large batches is just temporary intermediary data?

Understanding how you are using and turning over the data could help in
assessing some alternative strategies.

Best,
Riyad
