Posted to user@couchdb.apache.org by Paul Milner <pa...@gmail.com> on 2021/08/18 06:58:19 UTC

Setting up smoosh for database compaction

Hello

I'm looking at the maintenance of my databases and how I could implement
tools to do that. Smoosh seems to be the main option, but I'm struggling to
set it up as the documentation seems a bit limited.

I have only really found this:

5.1. Compaction — Apache CouchDB® 3.1 Documentation
<https://docs.couchdb.com/en/3.1.1/maintenance/compaction.html#database-compaction>

I could do it manually, but I wanted to explore this first and was wondering
whether there are any smoosh examples around that could help me on my way.

If anyone could point me in the right direction please, I would appreciate
it.

Thanks a lot
Best regards
Paul

Re: Setting up smoosh for database compaction

Posted by Kyle Snavely <kj...@gmail.com>.
4.0 is still in development today.

If you tweak compaction settings and have very large shards, do take care
to leave enough free disk space for compaction to run: it writes a new copy
of the file before the old one is removed. Basically, don't run your disks
at 90% in production unless you have experience with that. ;)
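
As a rough illustration of the headroom point, here is a minimal Python sketch. It assumes you have already fetched the `sizes` object that `GET /{db}` returns in CouchDB 3.x (`file` and `active`, in bytes) and know the free space on the data volume. The helper name `compaction_estimate` and the 1.2 safety factor are made up for this example, and the figures are totals across shards, while compaction actually runs per shard file, so treat the result as a hint only.

```python
def compaction_estimate(sizes, free_bytes, safety_factor=1.2):
    """Estimate fragmentation and whether a compaction is likely to fit on disk.

    sizes: dict with "file" (bytes on disk) and "active" (live data in bytes).
    free_bytes: free space on the volume holding the database files.
    safety_factor: hypothetical margin on top of the expected compacted size.
    """
    file_size = sizes["file"]
    active = sizes["active"]
    return {
        # Roughly the two measures smoosh channels can prioritise on.
        "ratio": file_size / active if active else float("inf"),
        "slack": file_size - active,
        # Assumption: the compacted copy is about as large as the active data.
        "enough_headroom": free_bytes >= active * safety_factor,
    }


example = compaction_estimate(
    {"file": 8_000_000_000, "active": 3_000_000_000},
    free_bytes=5_000_000_000,
)
print(example)
```

A check like this could run before a manual compaction is triggered, or just feed a monitoring alert.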

Kyle

On Thu, Aug 19, 2021, 9:41 AM Paul Milner <pa...@gmail.com> wrote:

> Hi Bob,
>
> Ok thanks, interesting. Can you tell me when 4.0 is planned to be released
> please?
>
> Thanks
> Paul

Re: Setting up smoosh for database compaction

Posted by Paul Milner <pa...@gmail.com>.
Hi Bob,

Ok thanks, interesting. Can you tell me when 4.0 is planned to be released
please?

Thanks
Paul

On Thu, 19 Aug 2021 at 15:19, Robert Newson <rn...@apache.org> wrote:

> Hi Paul,
>
>
> I think that’s reasonable, though do note that compaction also matters for
> performance even if you never update or delete a document, as CouchDB
> defers rebalancing the B+tree disk structures until then (i.e., CouchDB
> isn’t adhering to the B+tree algorithm from the literature).
>
> Left uncompacted, lookup/insert performance will drop from roughly
> O(log n) to O(n) over time (though only as a consequence of writing
> documents).
>
> None of what I’ve said will apply in CouchDB 4.0 (compaction is no longer
> required there).
>
> B (short for Bob)

Re: Setting up smoosh for database compaction

Posted by Robert Newson <rn...@apache.org>.
Hi Paul,


I think that’s reasonable, though do note that compaction also matters for performance even if you never update or delete a document, as CouchDB defers rebalancing the B+tree disk structures until then (i.e., CouchDB isn’t adhering to the B+tree algorithm from the literature).

Left uncompacted, lookup/insert performance will drop from roughly O(log n) to O(n) over time (though only as a consequence of writing documents).

None of what I’ve said will apply in CouchDB 4.0 (compaction is no longer required there).

B (short for Bob)

> On 19 Aug 2021, at 11:32, Paul Milner <pa...@gmail.com> wrote:
> 
> Hi B (?? ;-) )
> 
> I have a log database that could encounter high-frequency updates and
> deletes. It doesn't need to be read by multiple users, but it will be
> updated by all users. So rather than compacting it, which at certain
> update frequencies could lead to race conditions (thinking of
> extremes), I was going to do the following steps:
> 
> 1) Switch the active log to a new database
> 2) Copy the old database without orphans/history to the new database
> 3) Delete the old database
> 
> I would toggle databases as needed.
> 
> Best regards
> Paul

Re: Setting up smoosh for database compaction

Posted by Paul Milner <pa...@gmail.com>.
Hi B (?? ;-) )

I have a log database that could encounter high-frequency updates and
deletes. It doesn't need to be read by multiple users, but it will be
updated by all users. So rather than compacting it, which at certain
update frequencies could lead to race conditions (thinking of
extremes), I was going to do the following steps:

1) Switch the active log to a new database
2) Copy the old database without orphans/history to the new database
3) Delete the old database

I would toggle databases as needed.
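
The three steps above could be sketched as plain HTTP calls, roughly as follows. This only builds the requests and sends nothing. The base URL, the database names `log_a` and `log_b`, and the helper `rotation_plan` are hypothetical, and authentication and error handling are left out. It assumes the copy is a one-shot replication through `POST /_replicate`, which transfers current leaf revisions (including conflicts) rather than old revision bodies; deletion tombstones would still be copied unless a replication filter is added.

```python
import json


def rotation_plan(base_url, old_db, new_db):
    """Return the HTTP calls for the three steps as (method, url, body) tuples."""
    old_url = f"{base_url}/{old_db}"
    new_url = f"{base_url}/{new_db}"
    # One-shot replication from the old log database into the new one.
    replicate_body = json.dumps({"source": old_url, "target": new_url})
    return [
        # 1) Create the new database; the application then writes to it.
        ("PUT", new_url, None),
        # 2) Copy what is still live in the old database.
        ("POST", f"{base_url}/_replicate", replicate_body),
        # 3) Drop the old database, which frees its files in one go.
        ("DELETE", old_url, None),
    ]


for method, url, body in rotation_plan("http://127.0.0.1:5984", "log_a", "log_b"):
    print(method, url, body or "")
```

Step 3 should only run once the replication has reported success, otherwise documents written late to the old database could be lost.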

Best regards
Paul

On Thu, 19 Aug 2021 at 10:24, Robert Newson <rn...@apache.org> wrote:

> Hi Paul,
>
> We welcome feedback on why the automatic compaction system (in its default
> configuration or custom) is not appropriate for you.
>
> B.

Re: Setting up smoosh for database compaction

Posted by Robert Newson <rn...@apache.org>.
Hi Paul,

We welcome feedback on why the automatic compaction system (in its default configuration or custom) is not appropriate for you.

B.

> On 19 Aug 2021, at 05:29, Paul Milner <pa...@gmail.com> wrote:
> 
> Hi Adam
> 
> Thanks for the feedback. I was actually struggling with which options to set per channel and what to set them to. Anyway after more thought, I’ve decided on a manual approach as I need it to be more custom than automatic. 
> 
> But thanks again
> I appreciate it. 
> 
> Best regards
> Paul 
> 
> Sent from my iPad

Re: Setting up smoosh for database compaction

Posted by Paul Milner <pa...@gmail.com>.
Hi Adam

Thanks for the feedback. I was actually struggling with which options to set per channel and what to set them to. Anyway, after more thought, I’ve decided on a manual approach, as I need something more custom than the automatic setup.

But thanks again
I appreciate it. 

Best regards
Paul 

Sent from my iPad

> On 18 Aug 2021, at 20:01, Adam Kocoloski <ko...@apache.org> wrote:
> 
> Hi Paul, sorry to hear you’re finding it a challenge to configure. The default configuration described in the documentation does give you an example of how things are set up:
> 
> https://docs.couchdb.org/en/3.1.1/maintenance/compaction.html#channel-configuration
> 
> Cross-referenced from that section you can find the full configuration reference that describes all the supported configuration keys at the channel level:
> 
> https://docs.couchdb.org/en/3.1.1/config/compaction.html#config-compactions
> 
> The general idea is that you create [smoosh.<channelname>] configuration blocks with whatever settings you deem appropriate to match a certain set of files and prioritize them, and then use the [smoosh] block to activate those channels.
> 
> Can you say a little more about what you’re finding lacking in the docs? Cheers,
> 
> Adam

Re: Setting up smoosh for database compaction

Posted by Adam Kocoloski <ko...@apache.org>.
Hi Paul, sorry to hear you’re finding it a challenge to configure. The default configuration described in the documentation does give you an example of how things are set up:

https://docs.couchdb.org/en/3.1.1/maintenance/compaction.html#channel-configuration

Cross-referenced from that section you can find the full configuration reference that describes all the supported configuration keys at the channel level:

https://docs.couchdb.org/en/3.1.1/config/compaction.html#config-compactions

The general idea is that you create [smoosh.<channelname>] configuration blocks with whatever settings you deem appropriate to match a certain set of files and prioritize them, and then use the [smoosh] block to activate those channels.
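
To make that concrete, here is a hedged sketch in Python that embeds an example `local.ini` fragment and parses it to show which channels the `[smoosh]` block activates. The `ratio_dbs` and `slack_dbs` blocks mirror what I understand the documented defaults to be; `big_dbs` is a hypothetical extra channel, its thresholds are purely illustrative, and the helper `active_db_channels` is made up for this example.

```python
import configparser

# Example fragment only: "big_dbs" and its values are assumptions, not advice.
LOCAL_INI = """
[smoosh]
db_channels = upgrade_dbs,ratio_dbs,slack_dbs,big_dbs
view_channels = upgrade_views,ratio_views,slack_views

[smoosh.ratio_dbs]
priority = ratio
min_priority = 2.0

[smoosh.slack_dbs]
priority = slack
min_priority = 16777216

[smoosh.big_dbs]
priority = ratio
min_priority = 1.5
min_size = 1073741824
concurrency = 1
"""


def active_db_channels(ini_text):
    """Map each channel named in [smoosh] db_channels to its own settings."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    names = [name.strip() for name in parser["smoosh"]["db_channels"].split(",")]
    channels = {}
    for name in names:
        section = "smoosh." + name
        # A channel without its own block simply has no overrides here.
        channels[name] = dict(parser[section]) if parser.has_section(section) else {}
    return channels


for channel, settings in active_db_channels(LOCAL_INI).items():
    print(channel, settings)
```

The point of the layout is that a channel only takes effect once its name appears in `db_channels` (or `view_channels`); the `[smoosh.<channelname>]` block on its own just describes how that channel selects and ranks files.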

Can you say a little more about what you’re finding lacking in the docs? Cheers,

Adam

> On Aug 18, 2021, at 2:58 AM, Paul Milner <pa...@gmail.com> wrote:
> 
> Hello
> 
> I'm looking at the maintenance of my databases and how I could implement
> tools to do that. Smoosh seems to be the main option, but I'm struggling to
> set it up as the documentation seems a bit limited.
> 
> I have only really found this:
> 
> 5.1. Compaction — Apache CouchDB® 3.1 Documentation
> <https://docs.couchdb.com/en/3.1.1/maintenance/compaction.html#database-compaction>
> 
> I could do it manually but wanted to explore this first and was wondering
> if there are any smoosh examples about, that could help me on my way?
> 
> If anyone could point me in the right direction please, I would appreciate
> it.
> 
> Thanks a lot
> Best regards
> Paul