Posted to user@kudu.apache.org by Jason Heo <ja...@gmail.com> on 2017/03/22 02:44:15 UTC

What's the effect of increasing budgeted_compaction_target_rowset_size

Hi, congrats on the Apache Kudu 1.3 release.

I'm using Kudu 1.2 on CDH 5.10.

Compaction Policy
<https://github.com/apache/kudu/blob/master/docs/design-docs/compaction-policy.md>
says that:

> Compactions are necessary in order to reduce the number of DiskRowSets
which must be consulted for various operations, thus improving the overall
performance of the tablet.

But in my case, the number of DiskRowSets did not decrease; it even increased :(


   - Before Compaction: 22
   - After Compaction: 30


Here are the captured images.

[Two images, not preserved in this plain-text version.]

So, I'm considering increasing `budgeted_compaction_target_rowset_size`
from 32MB to 64MB.

Before changing it, I'd like to know what the downside of a bigger
DiskRowSet size is.

Thanks in advance.

Re: What's the effect of increasing budgeted_compaction_target_rowset_size

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, Mar 21, 2017 at 7:44 PM, Jason Heo <ja...@gmail.com> wrote:

> Hi, congrats on the Apache Kudu 1.3 release.
>
> I'm using Kudu 1.2 on CDH 5.10.
>
> Compaction Policy
> <https://github.com/apache/kudu/blob/master/docs/design-docs/compaction-policy.md>
> says that:
>
> > Compactions are necessary in order to reduce the number of DiskRowSets
> which must be consulted for various operations, thus improving the overall
> performance of the tablet.
>
> But in my case, the number of DiskRowSets did not decrease; it even increased :(
>
>
>    - Before Compaction: 22
>    - After Compaction: 30
>
>
> Here are the captured images.
>
> [Two images, not preserved in this plain-text version.]
>
>
One thing which may be confusing your measurements is that we currently
don't show any rowsets that are part of an in-progress compaction. The fact
that there are no orange-colored rowsets in the bottom diagram indicates
that a compaction is going on, and therefore some data is not being shown.
That's a bug filed as KUDU-844.

Keep in mind that the goal is not to reduce the total number of DiskRowSets,
but rather to reduce the amount of _overlap_ between DRSes. In your top
diagram, an insert into any portion of the keyspace would require lookups in
two DRSes. In the bottom one, any insert would only require one lookup.
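
To make the overlap point concrete, here is a minimal Python sketch. It is
not Kudu code: the integer key ranges and both layouts are made up for
illustration and are not taken from the diagrams in this thread. It only
shows that a layout can have more rowsets and still need fewer lookups per
key.

```python
from typing import List, Tuple

# Hypothetical rowset: inclusive (min_key, max_key) over an integer keyspace.
Interval = Tuple[int, int]


def rowsets_to_consult(rowsets: List[Interval], key: int) -> int:
    """Count the rowsets whose key range contains `key`."""
    return sum(1 for lo, hi in rowsets if lo <= key <= hi)


def max_overlap(rowsets: List[Interval]) -> int:
    """Largest number of rowsets covering any single key (sweep line)."""
    events = []
    for lo, hi in rowsets:
        events.append((lo, 1))       # range opens at lo
        events.append((hi + 1, -1))  # range closes just after hi
    depth = best = 0
    # At equal positions, closes (-1) sort before opens (+1).
    for _, delta in sorted(events):
        depth += delta
        best = max(best, depth)
    return best


# Made-up layouts: fewer rowsets that overlap vs. more rowsets that do not.
before = [(0, 99), (0, 49), (50, 99)]
after = [(0, 24), (25, 49), (50, 74), (75, 99)]

for name, layout in (("before", before), ("after", after)):
    print(name, len(layout), "rowsets, max overlap", max_overlap(layout),
          "lookups for key 42:", rowsets_to_consult(layout, 42))
```

In this toy example the second layout has more rowsets, but any key falls
into exactly one of them instead of two.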


>
> So, I'm considering increasing `budgeted_compaction_target_rowset_size`
> from 32MB to 64MB.
>
> Before changing it, I'd like to know what the downside of a bigger
> DiskRowSet size is.
>

The downside is that compaction work will be "chunkier" and less adaptive.
The upside is potentially larger IOs, fewer blocks to account for (each
tracked block uses some memory), etc. But I don't think it would make a
substantial difference in what you're reporting above.
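
As a rough back-of-the-envelope illustration of "chunkier", the sketch
below estimates how many rowsets a tablet would end up with at each target
size. It assumes, purely for illustration, a hypothetical 1 GiB tablet whose
compacted rowsets come out close to the target size; real rowset sizes vary,
and the helper name is made up.

```python
MiB = 1024 * 1024


def approx_rowset_count(tablet_bytes: int, target_rowset_bytes: int) -> int:
    """Ceiling of tablet size over target rowset size (rough estimate only)."""
    return -(-tablet_bytes // target_rowset_bytes)


tablet_bytes = 1024 * MiB  # hypothetical tablet size, not from this thread

for target_mib in (32, 64):
    count = approx_rowset_count(tablet_bytes, target_mib * MiB)
    print(f"target {target_mib} MiB: roughly {count} rowsets")
```

Doubling the target roughly halves the number of rowsets, so each one is a
larger unit of compaction work.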

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera