Posted to user@couchdb.apache.org by Adam Kocoloski <ad...@gmail.com> on 2008/12/23 01:32:50 UTC

runaway compaction

Hi, I ran into an odd failure mode last week and I thought I'd ask  
around here to see if anyone has seen something similar.  I have a  
CouchDB server (recent trunk) on a large EC2 instance with a DB that  
sees a constant update rate of ~50 Hz.  I triggered a compaction when  
the DB had reached ~27M update sequences (80 GB in total).  The first  
pass finished after 7h40m, but of course another 1.4M updates had been  
written to the original DB.  So far, so good.

Unfortunately, the subsequent iterations of copy_compact() ran much  
slower than that original pass.  After a few passes, the compactor  
rate was equal to the new write rate, so it effectively entered a  
runaway mode.  The stats looked like

Pass 1:  7h40m    27870955 docs   1010 Hz
Pass 2:  3h44m     1473387 docs    110 Hz
Pass 3:  2h58m      617008 docs     58 Hz
Pass 4:  2h44m      450607 docs     46 Hz
.....
Pass 23: 4h08m      719541 docs     48 Hz
Pass 24: 1h04m      436105 docs    113 Hz
Pass 25: 21 seconds -- done.
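The arithmetic behind the table is easy to check, and it makes the "runaway" condition concrete: once the compactor rate falls toward the 50 Hz write rate, each pass accumulates roughly as many new updates as it copies, so the backlog stops shrinking. A quick sketch (figures taken from the table above):

```python
# Sanity-check the per-pass rates above: rate = docs / elapsed seconds.
passes = {
    1:  ("7h40m", 27870955),
    2:  ("3h44m", 1473387),
    3:  ("2h58m", 617008),
    4:  ("2h44m", 450607),
    23: ("4h08m", 719541),
    24: ("1h04m", 436105),
}

def seconds(hm: str) -> int:
    """Parse a duration like '7h40m' into seconds."""
    h, m = hm.rstrip("m").split("h")
    return int(h) * 3600 + int(m) * 60

WRITE_RATE_HZ = 50  # steady client update rate reported above

for n, (elapsed, docs) in passes.items():
    secs = seconds(elapsed)
    rate = docs / secs
    # Updates written while this pass ran: the next pass must copy
    # roughly this many docs, so the backlog shrinks only while the
    # compactor rate stays well above the write rate.
    backlog = WRITE_RATE_HZ * secs
    print(f"Pass {n:2d}: {rate:7.1f} Hz, ~{backlog} new updates during pass")
```

For pass 4, for example, ~492,000 updates arrived while only 450,607 docs were copied, so the backlog was actually growing.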

We stopped the new write load sometime after the end of Pass 23, and  
the compaction finished soon after that.

We turned the write load back on and have been compacting the DB  
once/day ever since.  We haven't seen this runaway mode again.  I've  
reviewed the compaction code a couple of times, but I can't figure out  
what would cause such a dramatic slowdown.  Our system monitoring  
wasn't able to turn up any red flags, either -- in particular, all the  
latency/throughput/IOPS stats for the disk hosting the database were  
pretty much constant throughout the lifetime of the compaction.

Best, Adam

Re: runaway compaction

Posted by Damien Katz <da...@apache.org>.
That can't work: the storage file contains structures that point (by  
file offset) to previously written data. Copying the raw bytes  
verbatim would leave those structures pointing at random places in  
the new file.

-Damien
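Damien's objection is about the append-only layout: interior structures reference earlier data by absolute file offset, and those offsets are only meaningful in the file they were written to. A toy illustration (this is not CouchDB's actual on-disk format; the record layout here is made up):

```python
# Toy append-only file: each record stores the absolute offset of the
# record it points to. NOT CouchDB's real format -- just the idea.
def append(file_bytes: bytearray, payload: bytes, child_offset: int) -> int:
    """Append a record (8-byte child offset + payload); return its offset."""
    offset = len(file_bytes)
    file_bytes += child_offset.to_bytes(8, "big") + payload
    return offset

old = bytearray(b"\0" * 4096)       # pretend 4 KB of earlier file contents
leaf = append(old, b"doc", 0)       # leaf lands at offset 4096
root = append(old, b"root", leaf)   # root stores the offset 4096

new = bytearray()                   # freshly compacted file starts empty
new += old[root:]                   # copy the root's raw bytes verbatim
stored = int.from_bytes(new[0:8], "big")
# stored is still 4096, but the new file is only a few bytes long:
# the embedded pointer now refers to a location that doesn't exist.
```

So verbatim tail-copying would require rewriting every embedded offset, which is essentially what the compactor's copy pass already does.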

On Dec 22, 2008, at 8:45 PM, Chris Anderson wrote:

> On Mon, Dec 22, 2008 at 5:29 PM, Damien Katz <da...@apache.org>  
> wrote:
>> It's a known issue that compaction may be unable to complete under  
>> heavy write load. At some point maybe we should implement a  
>> mechanism to throttle writes if the compaction isn't making enough  
>> progress during updates.
>>
>
> Would it be possible to artificially "complete" compaction by
> appending the last few sections of the original file verbatim to the
> new file? Maybe after the second pass it could just copy over
> uncompacted updates... maybe the whole idea is too dirty.
>
> -- 
> Chris Anderson
> http://jchris.mfdz.com


Re: runaway compaction

Posted by Chris Anderson <jc...@gmail.com>.
On Mon, Dec 22, 2008 at 5:29 PM, Damien Katz <da...@apache.org> wrote:
> It's a known issue that compaction may be unable to complete under heavy write
> load. At some point maybe we should implement a mechanism to throttle writes
> if the compaction isn't making enough progress during updates.
>

Would it be possible to artificially "complete" compaction by
appending the last few sections of the original file verbatim to the
new file? Maybe after the second pass it could just copy over
uncompacted updates... maybe the whole idea is too dirty.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: runaway compaction

Posted by Damien Katz <da...@apache.org>.
It's a known issue that compaction may be unable to complete under  
heavy write load. At some point maybe we should implement a mechanism  
to throttle writes if the compaction isn't making enough progress  
during updates.

-Damien
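No such throttle exists yet; one hedged sketch of the idea (all names and thresholds here are hypothetical, not CouchDB code) is to make each write pay a small delay whenever the compactor's last copied update sequence lags too far behind the database's current update sequence:

```python
import time

# Hypothetical write-throttling sketch, NOT CouchDB code. If the
# compactor's progress (last copied update_seq) falls too far behind
# the database's current update_seq, each writer sleeps briefly so
# compaction can catch up.
MAX_LAG = 100_000      # tunable: allowed gap in update sequences
WRITE_DELAY_S = 0.01   # tunable: per-write penalty while lagging

def maybe_throttle(current_seq: int, compacted_seq: int,
                   sleep=time.sleep) -> bool:
    """Sleep and return True if this write should be throttled."""
    if current_seq - compacted_seq > MAX_LAG:
        sleep(WRITE_DELAY_S)
        return True
    return False
```

With a scheme like this the backlog a retry pass must copy is bounded by MAX_LAG, instead of growing with the duration of the previous pass.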


On Dec 22, 2008, at 7:32 PM, Adam Kocoloski wrote:

> Hi, I ran into an odd failure mode last week and I thought I'd ask  
> around here to see if anyone has seen something similar. [...]


Re: runaway compaction

Posted by Adam Kocoloski <ad...@gmail.com>.
On Dec 22, 2008, at 8:57 PM, Damien Katz wrote:

> There is an expected slowdown during the retry, because it needs to  
> update previous values, not just copy docs, which means 2 extra  
> btree operations. However, I must say I'm surprised at the magnitude  
> of the slowdown. Maybe there is a bug or a simple optimization that  
> can be performed.
>
> -Damien

Hi Damien et al., in my experience the compaction retries on this  
particular DB can still pull docs at a steady state of ~400 Hz.  It  
was just this one time where they were an order of magnitude slower.

Adam


Re: runaway compaction

Posted by Damien Katz <da...@apache.org>.
On Dec 22, 2008, at 7:32 PM, Adam Kocoloski wrote:

> Hi, I ran into an odd failure mode last week [...]
>
> Unfortunately, the subsequent iterations of copy_compact() ran much  
> slower than that original pass. [...] The stats looked like
>
> Pass 1:  7h40m    27870955 docs   1010 Hz
> Pass 2:  3h44m     1473387 docs    110 Hz
> Pass 3:  2h58m      617008 docs     58 Hz
> Pass 4:  2h44m      450607 docs     46 Hz
> .....
> Pass 23: 4h08m      719541 docs     48 Hz
> Pass 24: 1h04m      436105 docs    113 Hz
> Pass 25: 21 seconds -- done.


There is an expected slowdown during the retry, because it needs to  
update previous values, not just copy docs, which means 2 extra btree  
operations. However, I must say I'm surprised at the magnitude of the  
slowdown. Maybe there is a bug or a simple optimization that can be  
performed.

-Damien
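The extra cost Damien describes can be sketched schematically: on the first pass every doc is a fresh insert into the new file's by-id and by-seq indexes, while on a retry a doc that was already copied also needs a lookup of its previous entry and removal of the stale seq record. Plain dicts stand in for CouchDB's btrees here; this is an illustration of the bookkeeping, not the real code:

```python
# Schematic of first-pass copy vs. retry copy. Dicts stand in for
# CouchDB's by-id and by-seq btrees -- NOT the actual implementation.
def copy_doc(by_id, by_seq, doc_id, new_seq, body):
    old_seq = by_id.get(doc_id)        # extra lookup on retry
    if old_seq is not None:
        del by_seq[old_seq]            # extra delete: stale seq entry
    by_id[doc_id] = new_seq            # write by-id entry
    by_seq[new_seq] = (doc_id, body)   # write by-seq entry

by_id, by_seq = {}, {}
copy_doc(by_id, by_seq, "doc-a", 1, "v1")  # first pass: two plain inserts
copy_doc(by_id, by_seq, "doc-a", 2, "v2")  # retry: lookup + delete as well
```

Two extra tree operations per doc alone would roughly double the work, which explains some slowdown but not the order-of-magnitude drop Adam observed.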