Posted to dev@couchdb.apache.org by Stefan Kögl <ko...@gmail.com> on 2012/03/01 12:17:17 UTC

Crash of CouchDB 1.2.x

Hello,

My experiments to replicate some live data / traffic to a CouchDB
1.2.x (running the current 1.2.x branch + the patch from [1]), the
ones that sparked the indexing speed discussions, also turned up
another (potential) problem. First, sorry for not yet reporting back
any performance measurements; I haven't found the time to run the
tests on my machines.

Anyway, I found the following stack traces in my log (after noticing
that some requests failed and compaction of a view stopped)

http://skoegl.net/~stefan/tmp/couchdb-1.2.x-crash.txt

The file starts at the first failed request. Every request before
that returned a positive (i.e. 2xx) status code.
some "natural" reason (such as timeouts, lack of RAM, etc), but I'm
not sure how to interpret Erlang stack traces. Can somebody point me
in the right direction for diagnosing the problem?


Thanks,

-- Stefan


[1] http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w

Re: Crash of CouchDB 1.2.x

Posted by Jason Smith <jh...@iriscouch.com>.
I seem to remember that, say, ext2 had more or less constant-time unlinking.

On Mon, Mar 12, 2012 at 10:32 AM, Robert Newson <rn...@apache.org> wrote:
> I can confirm that XFS is aggressive when deleting large files (other
> i/o requests are slow or blocked while it does it). It has been
> necessary to iteratively truncate a file instead of a simple 'rm' in
> production to avoid that problem. Increasing the size of extent
> preallocation ought to help considerably but I've not yet deployed
> that change. I *can* confirm that you can't 'ionice' the rm call,
> though.
>
> B.
>
> On 12 March 2012 05:00, Randall Leeds <ra...@gmail.com> wrote:
>> On Mar 11, 2012 7:40 PM, "Jason Smith" <jh...@iriscouch.com> wrote:
>>>
>>> On Mon, Mar 12, 2012 at 8:44 AM, Randall Leeds <ra...@gmail.com>
>> wrote:
>>> > I'm not sure what else you could provide after the fact. If your couch
>>> > came back online automatically, and did so quickly, I would expect to
>>> > see very long response times while the disk was busy freeing the old,
>>> > un-compacted file. We have had some fixes in the last couple releases
>>> > to address similar issues, but maybe there's something lurking still.
>>> > I've got no other ideas/leads at this time.
>>>
>>> Another long shot, but you could try a filesystem that doesn't
>>> synchronously reclaim the space, like (IIRC) XFS, btrfs, or I think
>>> ext2.
>>
>> I think you're referring to extents, which, IIRC, allow large, contiguous
>> sections of a file to be allocated and freed with less bookkeeping and,
>> therefore, fewer writes. This behavior is not any more or less synchronous.
>>
>> In my production experience, XFS does not show much benefit from this,
>> because any machine with more than one growing database still ends up
>> with file fragmentation that limits the gains from extents.
>>
>> I suspect, but have not tried to verify, that very large RAID stripe sizes
>> that force preallocation of larger blocks might deliver some gains.
>>
>> I have an open ticket for a manual delete option, which was designed to
>> allow deletion of trashed files to occur during low-volume hours or using
>> tools like ionice.  Unfortunately, I never got a chance to experiment with
>> that setup in production, though I have seen ionice help significantly to
>> keep request latency down when doing large deletes (just not in this
>> particular use case).



-- 
Iris Couch

Re: Crash of CouchDB 1.2.x

Posted by Robert Newson <rn...@apache.org>.
I can confirm that XFS is aggressive when deleting large files (other
i/o requests are slow or blocked while it does it). It has been
necessary to iteratively truncate a file instead of a simple 'rm' in
production to avoid that problem. Increasing the size of extent
preallocation ought to help considerably but I've not yet deployed
that change. I *can* confirm that you can't 'ionice' the rm call,
though.
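A minimal sketch of that iterative truncation, assuming GNU coreutils
(truncate, stat -c); the file name and sizes are placeholders, scaled
down for illustration:

```shell
# Shrink the file in fixed steps before unlinking it, so the filesystem
# frees extents gradually instead of in one long, blocking burst.
# Demo values: a 16 MiB scratch file freed 4 MiB at a time; a real
# deployment would use steps of e.g. 1 GiB and sleep between steps
# to let other i/o through.
FILE=demo.couch
STEP=$((4 * 1024 * 1024))

dd if=/dev/zero of="$FILE" bs=1M count=16 2>/dev/null

SIZE=$(stat -c %s "$FILE")
while [ "$SIZE" -gt "$STEP" ]; do
    SIZE=$((SIZE - STEP))
    truncate -s "$SIZE" "$FILE"   # drops only the tail extents
done
rm -- "$FILE"
```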

B.

On 12 March 2012 05:00, Randall Leeds <ra...@gmail.com> wrote:
> On Mar 11, 2012 7:40 PM, "Jason Smith" <jh...@iriscouch.com> wrote:
>>
>> On Mon, Mar 12, 2012 at 8:44 AM, Randall Leeds <ra...@gmail.com>
> wrote:
>> > I'm not sure what else you could provide after the fact. If your couch
>> > came back online automatically, and did so quickly, I would expect to
>> > see very long response times while the disk was busy freeing the old,
>> > un-compacted file. We have had some fixes in the last couple releases
>> > to address similar issues, but maybe there's something lurking still.
>> > I've got no other ideas/leads at this time.
>>
>> Another long shot, but you could try a filesystem that doesn't
>> synchronously reclaim the space, like (IIRC) XFS, btrfs, or I think
>> ext2.
>
> I think you're referring to extents, which, IIRC, allow large, contiguous
> sections of a file to be allocated and freed with less bookkeeping and,
> therefore, fewer writes. This behavior is not any more or less synchronous.
>
> In my production experience, XFS does not show much benefit from this,
> because any machine with more than one growing database still ends up
> with file fragmentation that limits the gains from extents.
>
> I suspect, but have not tried to verify, that very large RAID stripe sizes
> that force preallocation of larger blocks might deliver some gains.
>
> I have an open ticket for a manual delete option, which was designed to
> allow deletion of trashed files to occur during low-volume hours or using
> tools like ionice.  Unfortunately, I never got a chance to experiment with
> that setup in production, though I have seen ionice help significantly to
> keep request latency down when doing large deletes (just not in this
> particular use case).

Re: Crash of CouchDB 1.2.x

Posted by Randall Leeds <ra...@gmail.com>.
On Mar 11, 2012 7:40 PM, "Jason Smith" <jh...@iriscouch.com> wrote:
>
> On Mon, Mar 12, 2012 at 8:44 AM, Randall Leeds <ra...@gmail.com>
wrote:
> > I'm not sure what else you could provide after the fact. If your couch
> > came back online automatically, and did so quickly, I would expect to
> > see very long response times while the disk was busy freeing the old,
> > un-compacted file. We have had some fixes in the last couple releases
> > to address similar issues, but maybe there's something lurking still.
> > I've got no other ideas/leads at this time.
>
> Another long shot, but you could try a filesystem that doesn't
> synchronously reclaim the space, like (IIRC) XFS, btrfs, or I think
> ext2.

I think you're referring to extents, which, IIRC, allow large, contiguous
sections of a file to be allocated and freed with less bookkeeping and,
therefore, fewer writes. This behavior is not any more or less synchronous.

In my production experience, XFS does not show much benefit from this,
because any machine with more than one growing database still ends up
with file fragmentation that limits the gains from extents.

I suspect, but have not tried to verify, that very large RAID stripe sizes
that force preallocation of larger blocks might deliver some gains.

I have an open ticket for a manual delete option, which was designed to
allow deletion of trashed files to occur during low-volume hours or using
tools like ionice.  Unfortunately, I never got a chance to experiment with
that setup in production, though I have seen ionice help significantly to
keep request latency down when doing large deletes (just not in this
particular use case).
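That ionice-assisted delete might look like the following sketch; the
file name is a placeholder, and (as noted elsewhere in this thread) the
idle class does not help where the filesystem frees extents in-kernel,
as XFS does:

```shell
# Remove a trashed compaction file under the idle i/o scheduling class
# (-c 3), so the unlink only consumes otherwise-spare disk time.
FILE=trashed.view.compact   # placeholder name
touch "$FILE"
if command -v ionice >/dev/null 2>&1; then
    ionice -c 3 rm -- "$FILE"
else
    rm -- "$FILE"           # ionice unavailable; plain unlink
fi
```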

Re: Crash of CouchDB 1.2.x

Posted by Jason Smith <jh...@iriscouch.com>.
On Mon, Mar 12, 2012 at 8:44 AM, Randall Leeds <ra...@gmail.com> wrote:
> I'm not sure what else you could provide after the fact. If your couch
> came back online automatically, and did so quickly, I would expect to
> see very long response times while the disk was busy freeing the old,
> un-compacted file. We have had some fixes in the last couple releases
> to address similar issues, but maybe there's something lurking still.
> I've got no other ideas/leads at this time.

Another long shot, but you could try a filesystem that doesn't
synchronously reclaim the space, like (IIRC) XFS, btrfs, or I think
ext2.

-- 
Iris Couch

Re: Crash of CouchDB 1.2.x

Posted by Randall Leeds <ra...@gmail.com>.
On Sun, Mar 11, 2012 at 07:56, Stefan Kögl <ko...@gmail.com> wrote:
> On 03/11/2012 02:32 PM, Jason Smith wrote:
>>
>> Longshot, but is it possible that couch had a file handle to an
>> unlinked file, so once the (OS) process crashed, the space was
>> freed?
>
>
> Hmm.. that might be possible. I ran a database compaction before that. When
> I noticed the crash I saw that the db compaction finished, but it might be
> possible that it still had a handle to the old db file.

{badmatch,{error,enospc}} is exactly the out of space error coming
straight up out of the kernel.
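On Linux the same errno can be reproduced against /dev/full, which
rejects every write with ENOSPC; it surfaces in the CouchDB logs as
{badmatch,{error,enospc}} because the write result no longer matches
the expected success tuple:

```shell
# /dev/full fails all writes with ENOSPC, the same kernel error that
# shows up in the Erlang logs as {badmatch,{error,enospc}}.
dd if=/dev/zero of=/dev/full bs=512 count=1 2> err.txt || true
grep -i "no space left on device" err.txt
```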

>
> How should we proceed from here? Is it possible for me to provide further
> information about that in retrospect?

I'm not sure what else you could provide after the fact. If your couch
came back online automatically, and did so quickly, I would expect to
see very long response times while the disk was busy freeing the old,
un-compacted file. We have had some fixes in the last couple releases
to address similar issues, but maybe there's something lurking still.
I've got no other ideas/leads at this time.

>
>
> -- Stefan

Re: Crash of CouchDB 1.2.x

Posted by Stefan Kögl <ko...@gmail.com>.
On 03/11/2012 02:32 PM, Jason Smith wrote:
> Longshot, but is it possible that couch had a file handle to an
> unlinked file, so once the (OS) process crashed, the space was
> freed?

Hmm.. that might be possible. I ran a database compaction before that. 
When I noticed the crash I saw that the db compaction finished, but it 
might be possible that it still had a handle to the old db file.

How should we proceed from here? Is it possible for me to provide 
further information about that in retrospect?


-- Stefan

Re: Crash of CouchDB 1.2.x

Posted by Jason Smith <jh...@iriscouch.com>.
Longshot, but is it possible that couch had a file handle to an
unlinked file, so once the (OS) process crashed, the space was freed?
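That situation (space pinned by an open descriptor to an unlinked file)
is easy to demonstrate in shell; the scratch file below is a
placeholder:

```shell
# An unlinked file's blocks are reclaimed only once the last open
# descriptor is closed. A background subshell holds fd 3 across the
# rm: the name disappears immediately, but the space is freed only
# when the subshell exits (e.g. when an OS process crashes).
FILE=unlinked_demo.dat
dd if=/dev/zero of="$FILE" bs=1M count=8 2>/dev/null
( exec 3< "$FILE"; rm -- "$FILE"; sleep 2 ) &
sleep 1
[ -e "$FILE" ] || echo "name gone; blocks still pinned by the open fd"
wait   # subshell exits, descriptor closes, kernel frees the blocks
```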

On Sun, Mar 11, 2012 at 7:50 PM, Stefan Kögl <ko...@gmail.com> wrote:
> On 03/11/2012 01:33 PM, Bob Dionne wrote:
>>
>> At a glance, I would suspect you've run out of disk space and the
>> error thrown is not caught, resulting in the badmatch.
>
>
> At the time of the crash there was about 70G of free space left, which
> should have been enough for the compaction to finish.
>
>
> -- Stefan



-- 
Iris Couch

Re: Crash of CouchDB 1.2.x

Posted by Stefan Kögl <ko...@gmail.com>.
On 03/11/2012 01:33 PM, Bob Dionne wrote:
> At a glance, I would suspect you've run out of disk space and the
> error thrown is not caught, resulting in the badmatch.

At the time of the crash there was about 70G of free space left, which
should have been enough for the compaction to finish.


-- Stefan

Re: Crash of CouchDB 1.2.x

Posted by Bob Dionne <di...@dionne-associates.com>.
Stefan,

At a glance, I would suspect you've run out of disk space and the error thrown is not caught, resulting in the badmatch.

Bob

On Mar 11, 2012, at 6:41 AM, Stefan Kögl wrote:

> Hi,
> 
> I had my CouchDB 1.2.x (fb72251bc7114b07f0667867226ec9e200732dac)
> crash again twice today.
> 
> The first one was while the instance was pull replicating (which
> failed due to the source being unreachable), and compacting a rather
> large view (from ~216G disk size to about 57G data size, if that's
> relevant).
> 
> Here's the log that shows the crash
> 
> http://friendpaste.com/41Idie3gGdQRxJPEyVHpTR
> 
> After the crash the view compaction stopped, and I tried to restart it
> 
> $ curl -H "Content-Type: application/json" -X POST
> http://stefan:********@localhost:5984/mygpo/_compact/users-tmp
> {"error":"timeout","reason":"{gen_server,call,\n
> [<0.19783.69>,\n
> {start_compact,#Fun<couch_view_compactor.0.15011741>}]}"}
> 
> http://friendpaste.com/2A086gHN8dNEJHPpMkDrPO
> 
> I assume this is because deleting the .compact.view file took too
> long. The compaction started anyway, though. Besides the replication,
> there were no other activities on the server.
> 
> Please let me know if I can assist with debugging somehow.
> 
> 
> -- Stefan
> 
> 
> 
> On Fri, Mar 2, 2012 at 11:51 AM, Jan Lehnardt <ja...@apache.org> wrote:
>> 
>> On Mar 2, 2012, at 11:29 , Stefan Kögl wrote:
>> 
>>> On Thu, Mar 1, 2012 at 9:39 PM, Jan Lehnardt <ja...@apache.org> wrote:
>>>> Where in there did you do the git pull? And was a make clean or git clean
>>>> involved?
>>> 
>>> IIRC I did not pull in between, only apply the patch I mentioned
>>> earlier ( http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w ). And I
>>> probably did a make clean && make && make install.
>>> 
>>> 
>>>> I think you should be in the clear with that procedure, but just to be
>>>> sure, I think it'd be worth rm'ing all .beam files you find manually
>>>> after the uninstall.
>>> 
>>> Done, I'll report back if the problem appears again.
>> 
>> Thanks Stefan, your help here is really appreciated :)
>> 
>> Cheers
>> Jan
>> --
>> 


Re: Crash of CouchDB 1.2.x

Posted by Stefan Kögl <ko...@gmail.com>.
Hi,

I had my CouchDB 1.2.x (fb72251bc7114b07f0667867226ec9e200732dac)
crash again twice today.

The first one was while the instance was pull replicating (which
failed due to the source being unreachable), and compacting a rather
large view (from ~216G disk size to about 57G data size, if that's
relevant).

Here's the log that shows the crash

http://friendpaste.com/41Idie3gGdQRxJPEyVHpTR

After the crash the view compaction stopped, and I tried to restart it

$ curl -H "Content-Type: application/json" -X POST
http://stefan:********@localhost:5984/mygpo/_compact/users-tmp
{"error":"timeout","reason":"{gen_server,call,\n
[<0.19783.69>,\n
{start_compact,#Fun<couch_view_compactor.0.15011741>}]}"}

http://friendpaste.com/2A086gHN8dNEJHPpMkDrPO

I assume this is because deleting the .compact.view file took too
long. The compaction started anyway, though. Besides the replication,
there were no other activities on the server.

Please let me know if I can assist with debugging somehow.


-- Stefan



On Fri, Mar 2, 2012 at 11:51 AM, Jan Lehnardt <ja...@apache.org> wrote:
>
> On Mar 2, 2012, at 11:29 , Stefan Kögl wrote:
>
>> On Thu, Mar 1, 2012 at 9:39 PM, Jan Lehnardt <ja...@apache.org> wrote:
>>> Where in there did you do the git pull? And was a make clean or git clean
>>> involved?
>>
>> IIRC I did not pull in between, only apply the patch I mentioned
>> earlier ( http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w ). And I
>> probably did a make clean && make && make install.
>>
>>
>>> I think you should be in the clear with that procedure, but just to be
>>> sure, I think it'd be worth rm'ing all .beam files you find manually
>>> after the uninstall.
>>
>> Done, I'll report back if the problem appears again.
>
> Thanks Stefan, your help here is really appreciated :)
>
> Cheers
> Jan
> --
>

Re: Crash of CouchDB 1.2.x

Posted by Jan Lehnardt <ja...@apache.org>.
On Mar 2, 2012, at 11:29 , Stefan Kögl wrote:

> On Thu, Mar 1, 2012 at 9:39 PM, Jan Lehnardt <ja...@apache.org> wrote:
>> Where in there did you do the git pull? And was a make clean or git clean
>> involved?
> 
> IIRC I did not pull in between, only apply the patch I mentioned
> earlier ( http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w ). And I
> probably did a make clean && make && make install.
> 
> 
>> I think you should be in the clear with that procedure, but just to be
>> sure, I think it'd be worth rm'ing all .beam files you find manually
>> after the uninstall.
> 
> Done, I'll report back if the problem appears again.

Thanks Stefan, your help here is really appreciated :)

Cheers
Jan
-- 


Re: Crash of CouchDB 1.2.x

Posted by Stefan Kögl <ko...@gmail.com>.
On Thu, Mar 1, 2012 at 9:39 PM, Jan Lehnardt <ja...@apache.org> wrote:
> Where in there did you do the git pull? And was a make clean or git clean
> involved?

IIRC I did not pull in between, only apply the patch I mentioned
earlier ( http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w ). And I
probably did a make clean && make && make install.


> I think you should be in the clear with that procedure, but just to be
> sure, I think it'd be worth rm'ing all .beam files you find manually
> after the uninstall.

Done, I'll report back if the problem appears again.


-- Stefan

Re: Crash of CouchDB 1.2.x

Posted by Jan Lehnardt <ja...@apache.org>.
On Mar 1, 2012, at 19:57 , Stefan Kögl wrote:

> On 03/01/2012 07:38 PM, Jan Lehnardt wrote:
>> On Mar 1, 2012, at 19:18 , Stefan Kögl wrote:
>>> If this is a problem, I could remove CouchDB first and do a fresh
>>> install instead. What would be the preferred way to do a clean
>>> uninstall?
>> 
>> I don't want to claim that this is definitely the cause for your
>> problem, but it'd be great if you could do a clean, fresh, empty
>> install to make sure we can rule that out as a cause :)
> 
> I just did
> 
> /etc/init.d/couchdb stop
> make uninstall
> make install
> # edit local.ini -- why does that get removed anyway?
> /etc/init.d/couchdb start
> 
> Is that enough to count as a fresh install, or should I do anything
> else? I'll continue monitoring the instance. Previously the error
> happened after a few days, so I can't say yet if the re-install changed
> anything.

Where in there did you do the git pull? And was a make clean or git clean
involved?

I think you should be in the clear with that procedure, but just to be
sure, I think it'd be worth rm'ing all .beam files you find manually
after the uninstall.
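That sweep could look like the following sketch; the prefix is a
scratch directory here, since the real location depends on your
configure prefix:

```shell
# Delete leftover compiled .beam files under an install prefix after
# `make uninstall`, so no stale modules can be loaded by mistake.
PREFIX=./fake_prefix   # placeholder; use your real install prefix
mkdir -p "$PREFIX/lib/couchdb/erlang"
touch "$PREFIX/lib/couchdb/erlang/couch_db.beam"   # simulate a leftover
find "$PREFIX" -name '*.beam' -delete
find "$PREFIX" -name '*.beam'   # prints nothing: sweep is complete
```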

local.ini is usually preserved across installations, but make uninstall
is obviously an intent to get rid of all traces of a package, so it should
get removed :)

Cheers
Jan
-- 


Re: Crash of CouchDB 1.2.x

Posted by Stefan Kögl <ko...@gmail.com>.
On 03/01/2012 07:38 PM, Jan Lehnardt wrote:
> On Mar 1, 2012, at 19:18 , Stefan Kögl wrote:
>> If this is a problem, I could remove CouchDB first and do a fresh
>> install instead. What would be the preferred way to do a clean
>> uninstall?
> 
> I don't want to claim that this is definitely the cause for your
> problem, but it'd be great if you could do a clean, fresh, empty
> install to make sure we can rule that out as a cause :)

I just did

/etc/init.d/couchdb stop
make uninstall
make install
# edit local.ini -- why does that get removed anyway?
/etc/init.d/couchdb start

Is that enough to count as a fresh install, or should I do anything
else? I'll continue monitoring the instance. Previously the error
happened after a few days, so I can't say yet if the re-install changed
anything.


-- Stefan

Re: Crash of CouchDB 1.2.x

Posted by Jan Lehnardt <ja...@apache.org>.
On Mar 1, 2012, at 19:18 , Stefan Kögl wrote:

> On Thu, Mar 1, 2012 at 4:52 PM, Jan Lehnardt <ja...@apache.org> wrote:
>> Can you tell us how you installed 1.2.x? Is it a fresh installation,
>> or did you do an in-place update from an earlier installation (earlier
>> 1.2.x, 1.1.x, or 1.0.x)?
> 
> I first did a fresh install of 1.2.x using R15B. I then removed R15B,
> installed R14B04 (both from source), compiled 1.2.x with the patch I
> mentioned earlier, and did an in-place update.
> 
> If this is a problem, I could remove CouchDB first and do a fresh
> install instead. What would be the preferred way to do a clean
> uninstall?

I don't want to claim that this is definitely the cause for your
problem, but it'd be great if you could do a clean, fresh, empty
install to make sure we can rule that out as a cause :)

Cheers
Jan
-- 

> 
> 
> -- Stefan
> 
> 
>> On Mar 1, 2012, at 12:17 , Stefan Kögl wrote:
>> 
>>> Hello,
>>> 
>>> My experiments to replicate some live data / traffic to a CouchDB
>>> 1.2.x (running the current 1.2.x branch + the patch from [1]), the
>>> ones that sparked the indexing speed discussions, also turned up
>>> another (potential) problem. First, sorry for not yet reporting back
>>> any performance measurements; I haven't found the time to run the
>>> tests on my machines.
>>> 
>>> Anyway, I found the following stack traces in my log (after noticing
>>> that some requests failed and compaction of a view stopped)
>>> 
>>> http://skoegl.net/~stefan/tmp/couchdb-1.2.x-crash.txt
>>> 
>>> The file starts at the first failed request. Every request before
>>> that returned a positive (i.e. 2xx) status code.
>>> some "natural" reason (such as timeouts, lack of RAM, etc), but I'm
>>> not sure how to interpret Erlang stack traces. Can somebody point me
>>> in the right direction for diagnosing the problem?
>>> 
>>> 
>>> Thanks,
>>> 
>>> -- Stefan
>>> 
>>> 
>>> [1] http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w
>> 


Re: Crash of CouchDB 1.2.x

Posted by Stefan Kögl <ko...@gmail.com>.
On Thu, Mar 1, 2012 at 4:52 PM, Jan Lehnardt <ja...@apache.org> wrote:
> Can you tell us how you installed 1.2.x? Is it a fresh installation,
> or did you do an in-place update from an earlier installation (earlier
> 1.2.x, 1.1.x, or 1.0.x)?

I first did a fresh install of 1.2.x using R15B. I then removed R15B,
installed R14B04 (both from source), compiled 1.2.x with the patch I
mentioned earlier, and did an in-place update.

If this is a problem, I could remove CouchDB first and do a fresh
install instead. What would be the preferred way to do a clean
uninstall?


-- Stefan


> On Mar 1, 2012, at 12:17 , Stefan Kögl wrote:
>
>> Hello,
>>
>> My experiments to replicate some live data / traffic to a CouchDB
>> 1.2.x (running the current 1.2.x branch + the patch from [1]), the
>> ones that sparked the indexing speed discussions, also turned up
>> another (potential) problem. First, sorry for not yet reporting back
>> any performance measurements; I haven't found the time to run the
>> tests on my machines.
>>
>> Anyway, I found the following stack traces in my log (after noticing
>> that some requests failed and compaction of a view stopped)
>>
>> http://skoegl.net/~stefan/tmp/couchdb-1.2.x-crash.txt
>>
>> The file starts at the first failed request. Every request before
>> that returned a positive (i.e. 2xx) status code.
>> some "natural" reason (such as timeouts, lack of RAM, etc), but I'm
>> not sure how to interpret Erlang stack traces. Can somebody point me
>> in the right direction for diagnosing the problem?
>>
>>
>> Thanks,
>>
>> -- Stefan
>>
>>
>> [1] http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w
>

Re: Crash of CouchDB 1.2.x

Posted by Jan Lehnardt <ja...@apache.org>.
Hi Stefan,

thanks for the report, this is very helpful!

Can you tell us how you installed 1.2.x? Is it a fresh installation,
or did you do an in-place update from an earlier installation (earlier
1.2.x, 1.1.x, or 1.0.x)?

Without having dug in too deeply yet, and more as a pointer for the
other devs here: I remember seeing io_lib_* errors in cases where we
catch exceptions in an attempt to produce prettier messages, but the
catch-all clause then tries to print whatever is actually unexpected
and fails there. The result is an io_lib_* stacktrace from the attempt
to print the original stacktrace, which subsequently gets lost.

Cheers
Jan
-- 



On Mar 1, 2012, at 12:17 , Stefan Kögl wrote:

> Hello,
> 
> My experiments to replicate some live data / traffic to a CouchDB
> 1.2.x (running the current 1.2.x branch + the patch from [1]), the
> ones that sparked the indexing speed discussions, also turned up
> another (potential) problem. First, sorry for not yet reporting back
> any performance measurements; I haven't found the time to run the
> tests on my machines.
> 
> Anyway, I found the following stack traces in my log (after noticing
> that some requests failed and compaction of a view stopped)
> 
> http://skoegl.net/~stefan/tmp/couchdb-1.2.x-crash.txt
> 
> The file starts at the first failed request. Every request before
> that returned a positive (i.e. 2xx) status code.
> some "natural" reason (such as timeouts, lack of RAM, etc), but I'm
> not sure how to interpret Erlang stack traces. Can somebody point me
> in the right direction for diagnosing the problem?
> 
> 
> Thanks,
> 
> -- Stefan
> 
> 
> [1] http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w


Re: Crash of CouchDB 1.2.x

Posted by Stefan Kögl <ko...@gmail.com>.
Hi,

On Fri, Mar 2, 2012 at 4:33 AM, Nathan Vander Wilt
<na...@calftrail.com> wrote:
> Was your server under heavy load? Did you end up with a bunch of zombie couchjs processes?

The crash occurred under load, but there are no zombies - at least not anymore.

If the crash happens again I'll try to inspect it more closely.
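For next time, one quick way to check for zombie couchjs processes
(assuming a procps-style ps on Linux):

```shell
# List couchjs processes whose state starts with Z (zombie).
# Prints nothing when there are none.
ps -eo pid,stat,comm | awk '$3 == "couchjs" && $2 ~ /^Z/ { print }'
```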


-- Stefan

Re: Crash of CouchDB 1.2.x

Posted by Nathan Vander Wilt <na...@calftrail.com>.
Was your server under heavy load? Did you end up with a bunch of zombie couchjs processes?

I'm a little worried I'm hopping in on something that might be a separate issue, but I was consistently getting nasty crashes the other day when doing a Blitz.io rush on a _list function. The stack traces were similar to this, but in my case the database became completely unresponsive.

Basically:
1. build-couchdb (1.1.1) on an EC2 t1.micro running Ubuntu
2. Add the '42' file at the path Blitz.io is looking for (a simple way to do this is what's held up filing a JIRA ticket)
3. Run their default rush on a _list function (mine happened to do a fair amount of work, doing a Markdown conversion and Mustache templating)
4. At around the 40 concurrent user mark, CouchDB dies a terrible, horrible death with a bunch of zombie couchjs processes, all those numbers in the logs, and some traces like:

                        {pid,<0.116.0>},
                        {registered_name,[]},
                        {error_info,
                         {exit,
                          {noproc,
                           {gen_server,call,
                            [couch_httpd_vhost,
                             {match_vhost,
                              {mochiweb_request,#Port<0.1461>,'GET',"/",
                               {1,0},
                               {7,

[Tue, 28 Feb 2012 23:47:57 GMT] [error] [<0.712.0>] Uncaught error in HTTP request: {exit,
                                                     {timeout,
                                                      {gen_server,call,
                                                       [couch_query_servers,
                                                        {get_proc,
                                                         {doc,
                                                          <<"_design/glob">>,
                                                          {65,


Does this sound related, or is it a separate issue? (Either way, I'm hoping to get a cleaner set of logs to submit.)

thanks,
-natevw


On Mar 1, 2012, at 3:17 AM, Stefan Kögl wrote:
> Hello,
> 
> My experiments to replicate some live data / traffic to a CouchDB
> 1.2.x (running the current 1.2.x branch + the patch from [1]), the
> ones that sparked the indexing speed discussions, also turned up
> another (potential) problem. First, sorry for not yet reporting back
> any performance measurements; I haven't found the time to run the
> tests on my machines.
> 
> Anyway, I found the following stack traces in my log (after noticing
> that some requests failed and compaction of a view stopped)
> 
> http://skoegl.net/~stefan/tmp/couchdb-1.2.x-crash.txt
> 
> The file starts at the first failed request. Every request before
> that returned a positive (i.e. 2xx) status code.
> some "natural" reason (such as timeouts, lack of RAM, etc), but I'm
> not sure how to interpret Erlang stack traces. Can somebody point me
> in the right direction for diagnosing the problem?
> 
> 
> Thanks,
> 
> -- Stefan
> 
> 
> [1] http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w