Posted to user@couchdb.apache.org by Boaz Citrin <bc...@gmail.com> on 2013/10/17 22:57:00 UTC

Finding the right value for compaction configuration parameters

Hello,

Testing database compaction with various doc_buffer_size values, I get
completely different results.
The documentation is fairly vague, so I wonder how to choose the right value.
Which factors affect this: HD buffer size, average doc size,
database size, fragmentation, etc.?

Same goes for view compaction and keyvalue_buffer_size.

(For me, compaction with the default values was many times slower
than with the values that gave the fastest compaction.)

Thanks!
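
For readers who want to try different values: in CouchDB 1.x these settings are documented under the [database_compaction] and [view_compaction] configuration sections, and the server exposes a /_config HTTP resource for changing them at runtime. The sketch below only builds the PUT requests and does not send anything. The server URL and the values are assumptions picked for illustration, not recommendations.

```python
import json
import urllib.request

# (section, key) -> value in bytes. The values are illustrative only.
SETTINGS = {
    ("database_compaction", "doc_buffer_size"): 67108864,    # 64 MiB
    ("database_compaction", "checkpoint_after"): 671088640,  # 640 MiB
    ("view_compaction", "keyvalue_buffer_size"): 4194304,    # 4 MiB
}


def build_config_requests(base_url, settings):
    """Return one PUT request per setting; nothing is sent here."""
    requests = []
    for (section, key), value in settings.items():
        url = f"{base_url}/_config/{section}/{key}"
        # The config resource takes the value as a JSON string, e.g. "67108864".
        body = json.dumps(str(value)).encode("utf-8")
        req = urllib.request.Request(url, data=body, method="PUT")
        req.add_header("Content-Type", "application/json")
        requests.append(req)
    return requests


if __name__ == "__main__":
    # Assumed local server address; adjust for your setup.
    for req in build_config_requests("http://127.0.0.1:5984", SETTINGS):
        print(req.get_method(), req.full_url, req.data.decode())
        # To apply for real: urllib.request.urlopen(req)  (needs admin rights)
```

Keeping the request construction separate from sending makes it easy to review what would change before touching a production server.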

Re: Finding the right value for compaction configuration parameters

Posted by Boaz Citrin <bc...@gmail.com>.
View compaction averages for a database of 1.5M docs (the same one used for the
previous tests) on a Dell Latitude laptop with Windows 7 64-bit, 8 GB RAM,
Intel Core i7-3720QM CPU (2 * 2.6 GHz):

keyvalue_buffer_size | average time (min:sec)
=============================================
2097152              | 0:56
4194304              | 0:52
8388608              | 0:55
16777216             | 0:59
33554432             | 1:00
67108864             | 1:07
134217728            | 1:22
268435456            | 1:45

Same database, this time on a desktop, Windows Server 2008 R2, 2 GB RAM,
Intel Xeon 3050 (2 * 2.13 GHz):

keyvalue_buffer_size | average time (min:sec)
=============================================
2097152              | 2:30
4194304              | 2:11
8388608              | 2:18
16777216             | 2:33
33554432             | 2:55
67108864             | 4:15
134217728            | 5:43
268435456            | 6:42

The database has 3 views in 3 different design documents.

I would like to see some numbers from others, to get an idea of good default
values that can be assumed for various production environments.
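
For anyone who wants to contribute comparable numbers, here is a minimal timing sketch. It is not how the figures above were collected. The start and is_running hooks are hypothetical: against a real server, start would typically POST to the database's _compact resource (with the design document name appended for view compaction) and is_running would poll the compact_running flag, but neither call is wired in here.

```python
import time


def time_compaction(start, is_running, clock=time.monotonic,
                    sleep=time.sleep, poll_interval=1.0):
    """Trigger one compaction and return the elapsed time in seconds.

    start      -- hypothetical callable that triggers the compaction
    is_running -- hypothetical callable, True while compaction is running
    """
    began = clock()
    start()
    while is_running():
        sleep(poll_interval)
    return clock() - began


def average(samples):
    """Arithmetic mean of a non-empty list of timings in seconds."""
    return sum(samples) / len(samples)


def format_mmss(seconds):
    """Format seconds as min:sec, the notation used in the tables."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    # Fake hooks so the sketch runs without a server.
    remaining = [True, True, False]
    elapsed = time_compaction(lambda: None, lambda: remaining.pop(0),
                              poll_interval=0.01)
    print("one run took", format_mmss(elapsed))
```

Averaging several runs per setting matters, since identical settings can differ noticeably between runs.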



On Sun, Nov 3, 2013 at 10:02 AM, Boaz Citrin <bc...@gmail.com> wrote:

> More numbers...

Re: Finding the right value for compaction configuration parameters

Posted by Boaz Citrin <bc...@gmail.com>.
More numbers...

Same database, this time on a desktop, Windows Server 2008 R2, 2 GB RAM,
Intel Xeon 3050 (2 * 2.13 GHz).

checkpoint_after | doc_buffer_size | avg. compaction time (min:sec)
===================================================================
41943040         | 4194304         | 17:26
83886080         | 8388608         | 12:01
167772160        | 16777216        | 9:34
335544320        | 33554432        | 8:56
671088640        | 67108864        | 9:26 (*)
1342177280       | 134217728       | 10:20
2684354560       | 268435456       | 12:28

671088640        | 4194304         | 17:25
671088640        | 8388608         | 12:17
671088640        | 16777216        | 10:15
671088640        | 33554432        | 8:56
671088640        | 67108864        | 9:20 (*)
671088640        | 134217728       | 10:16
671088640        | 268435456       | 12:19
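
To make the desktop figures easier to compare, the short script below ranks the second series (checkpoint_after fixed at 671088640) by time and shows each buffer size in MiB with its slowdown relative to the fastest run. The helper names are made up for this sketch; the data is copied from the table.

```python
# (doc_buffer_size in bytes, "min:sec") with checkpoint_after = 671088640.
DESKTOP_FIXED_CHECKPOINT = [
    (4194304, "17:25"),
    (8388608, "12:17"),
    (16777216, "10:15"),
    (33554432, "8:56"),
    (67108864, "9:20"),
    (134217728, "10:16"),
    (268435456, "12:19"),
]


def to_seconds(mmss):
    """Convert a "min:sec" string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def rank(rows):
    """Return (size_mib, seconds, slowdown_vs_best), fastest first."""
    timed = [(size // (1024 * 1024), to_seconds(t)) for size, t in rows]
    best = min(seconds for _, seconds in timed)
    return sorted(((mib, s, s / best) for mib, s in timed),
                  key=lambda row: row[1])


if __name__ == "__main__":
    for mib, seconds, slowdown in rank(DESKTOP_FIXED_CHECKPOINT):
        print(f"{mib:4d} MiB  {seconds:5d} s  x{slowdown:.2f}")
```

On this machine the 32 MiB buffer comes out fastest, and the 4 MiB buffer takes roughly twice as long.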





On Fri, Nov 1, 2013 at 5:44 PM, Boaz Citrin <bc...@gmail.com> wrote:

> OK some numbers when running compaction of a database of 1.5M docs on a
> Dell Latitude laptop with Windows 7 64b OS, 8G RAM, Intel Core i7-3720QM
> CPU (2 * 2.6 GHz).

Re: Finding the right value for compaction configuration parameters

Posted by Boaz Citrin <bc...@gmail.com>.
OK, some numbers from running compaction of a database of 1.5M docs on a
Dell Latitude laptop with Windows 7 64-bit, 8 GB RAM, Intel Core i7-3720QM
CPU (2 * 2.6 GHz).

checkpoint_after | doc_buffer_size | avg. compaction time (min:sec)
===================================================================
41943040         | 4194304         | 6:11
83886080         | 8388608         | 5:10
167772160        | 16777216        | 4:49
335544320        | 33554432        | 4:06
671088640        | 67108864        | 3:34 (*)
1342177280       | 134217728       | 4:16
2684354560       | 268435456       | 5:19

671088640        | 4194304         | 5:39
671088640        | 8388608         | 4:06
671088640        | 16777216        | 3:44
671088640        | 33554432        | 3:20
671088640        | 67108864        | 3:16 (*)
671088640        | 134217728       | 3:44
671088640        | 268435456       | 4:28

As you can see in the starred lines, the same parameters can give different
results, but I think the direction is clear:
in my situation a buffer of 64 MB gives the fastest compaction, and we also see
that, as expected, the checkpoint size affects the compaction time as well.

Note that this is NOT the time that an average daily compaction takes, as I
ran these consecutively, i.e. the database was initially not fragmented.
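
Two things in the laptop table can be checked with a few lines of Python. The first series scales checkpoint_after together with doc_buffer_size, and the two starred rows use identical settings, so their difference gives a rough idea of run-to-run noise. The function names are made up for this sketch; the numbers are copied from the table.

```python
# (checkpoint_after, doc_buffer_size) pairs from the first series.
FIRST_SERIES = [
    (41943040, 4194304),
    (83886080, 8388608),
    (167772160, 16777216),
    (335544320, 33554432),
    (671088640, 67108864),
    (1342177280, 134217728),
    (2684354560, 268435456),
]


def to_seconds(mmss):
    """Convert a "min:sec" string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def checkpoint_ratios(rows):
    """Distinct checkpoint_after / doc_buffer_size ratios in the series."""
    return sorted({checkpoint / buf for checkpoint, buf in rows})


def relative_spread(a, b):
    """Difference between two timings as a fraction of the faster one."""
    lo, hi = sorted((to_seconds(a), to_seconds(b)))
    return (hi - lo) / lo


if __name__ == "__main__":
    print("ratios:", checkpoint_ratios(FIRST_SERIES))
    print(f"starred spread: {relative_spread('3:34', '3:16'):.1%}")
```

In the first series checkpoint_after is always ten times doc_buffer_size, and the starred rows differ by about 9%, which is worth keeping in mind when comparing neighbouring rows whose times are only a few seconds apart.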



On Sat, Oct 26, 2013 at 5:25 PM, Alexander Shorin <kx...@gmail.com> wrote:

> Feel free to post here! I think others would be also interested in
> your experience and could share (I hope so) their own too.
> --
> ,,,^..^,,,

Re: Finding the right value for compaction configuration parameters

Posted by Alexander Shorin <kx...@gmail.com>.
Feel free to post here! I think others would also be interested in
your experience and could share (I hope) their own too.
--
,,,^..^,,,


On Sat, Oct 26, 2013 at 7:05 PM, Boaz Citrin <bc...@gmail.com> wrote:
> Hi Alexander,
>
> Yes, I have some numbers, do you want me to share here or somewhere else?
>
> Best,
>
> Boaz

Re: Finding the right value for compaction configuration parameters

Posted by Boaz Citrin <bc...@gmail.com>.
Hi Alexander,

Yes, I have some numbers. Do you want me to share them here or somewhere else?

Best,

Boaz


On Mon, Oct 21, 2013 at 12:32 AM, Alexander Shorin <kx...@gmail.com> wrote:

> Hi Boaz!

Re: Finding the right value for compaction configuration parameters

Posted by Alexander Shorin <kx...@gmail.com>.
Hi Boaz!

On Fri, Oct 18, 2013 at 12:57 AM, Boaz Citrin <bc...@gmail.com> wrote:
> Testing database compaction with various doc_buffer_size values I get
> completely different results.
> Documentation is fairly vague, so I wonder how to choose the right value;
> What parameters affect this - HD buffer size, average doc size,
> database size, fragmentation, etc... ?!
>
> Same goes for view compaction and keyvalue_buffer_size.

These parameters affect exactly what their names say: they define the
buffer size for copying data from the db/view file to the .compact one.
What information do you think is missing?

> (For me the compaction with default values was many times slower
> than with the values that gave the faster compaction).

All numbers have their cost: large buffers require more memory while
they reduce I/O operations, and vice versa. Most likely the default
values won't give you high performance since they are meant to fit
everyone, but that's why you may tweak them for your needs (:

I believe it's possible to revise them, but first we need to
collect information about which values are effective in which environments and
which are not. Would you like to help us with that?


--
,,,^..^,,,
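
The memory versus I/O trade-off described above can be illustrated with a toy buffered copy. This is only a sketch of the general idea, not CouchDB's compactor: chunks stand in for documents, and the function counts how many write calls a given buffer size needs and how much data it holds in memory at its peak.

```python
import io


def copy_with_buffer(chunks, out, buffer_size):
    """Copy byte chunks to `out`, flushing once buffer_size is reached.

    Returns (write_calls, peak_buffered_bytes).
    """
    buffered = []
    buffered_bytes = 0
    writes = 0
    peak = 0
    for chunk in chunks:
        buffered.append(chunk)
        buffered_bytes += len(chunk)
        peak = max(peak, buffered_bytes)
        if buffered_bytes >= buffer_size:
            out.write(b"".join(buffered))
            writes += 1
            buffered, buffered_bytes = [], 0
    if buffered:
        # Flush whatever is left at the end of the input.
        out.write(b"".join(buffered))
        writes += 1
    return writes, peak


if __name__ == "__main__":
    docs = [b"x" * 1000] * 100  # 100 fake documents of 1000 bytes each
    for size in (4000, 50000):
        writes, peak = copy_with_buffer(docs, io.BytesIO(), size)
        print(f"buffer {size:6d}: {writes:3d} writes, peak {peak} bytes")
```

A larger buffer means fewer writes but more memory held at once. The sketch does not model why the very largest buffers were slower in the measurements posted in this thread; on a machine with little RAM, memory pressure is one plausible factor, but that is a guess.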

Re: Finding the right value for compaction configuration parameters

Posted by Boaz Citrin <bc...@gmail.com>.
Hi Stanley,

This page doesn't cover the configuration parameters I mentioned.

The documentation at
http://docs.couchdb.org/en/latest/config/compaction.html doesn't give too
much info...

Best,

Boaz



On Sun, Oct 20, 2013 at 9:58 PM, Stanley Iriele <si...@gmail.com> wrote:

> hey Boaz,
>
> have you seen this documentation on what the various parameters mean?
>
> http://wiki.apache.org/couchdb/Compaction
>

Re: Finding the right value for compaction configuration parameters

Posted by Stanley Iriele <si...@gmail.com>.
hey Boaz,

have you seen this documentation on what the various parameters mean?

http://wiki.apache.org/couchdb/Compaction


On Sun, Oct 20, 2013 at 11:48 AM, Boaz Citrin <bc...@gmail.com> wrote:

> Hello,
>
> I wonder if anyone had the chance to read this and has any idea...
>
> Also, we are looking for some consulting, so if anyone can offer his
> services please email me directly.
>
> Thanks,
>
> Boaz
>

Re: Finding the right value for compaction configuration parameters

Posted by Boaz Citrin <bc...@gmail.com>.
Hello,

I wonder if anyone had the chance to read this and has any idea...

Also, we are looking for some consulting, so if anyone can offer his
services please email me directly.

Thanks,

Boaz


On Thu, Oct 17, 2013 at 11:57 PM, Boaz Citrin <bc...@gmail.com> wrote:

> Hello,
>
> Testing database compaction with various doc_buffer_size values I get
> completely different results.
> Documentation is fairly vague, so I wonder how to choose the right value;
> What parameters affect this - HD buffer size, average doc size,
> database size, fragmentation, etc... ?!
>
> Same goes for view compaction and keyvalue_buffer_size.
>
> (For me the compaction with default values was many times slower
> than with the values that gave the faster compaction).
>
> Thanks!
>