Posted to user@kudu.apache.org by Jason Heo <ja...@gmail.com> on 2017/03/24 08:39:54 UTC

How to calculate the optimal value of `maintenance_manager_num_threads`

Hi,

I'm using Apache Kudu 1.2 on CDH 5.10.

Recently, after reading "Bulk write performance improvements for Kudu 1.4
<https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit>",
I noticed that `maintenance_manager_num_threads` was set to 4 for 5 spinning
disks.

In my cluster, each node has 10 SATA disks in RAID 1+0 (the WAL and data
directories are located on the same partition). As Todd suggested, bulk
loading is done in PK-sorted order. CPU usage and system load on my cluster
are not high at the moment, so I think the thread count could be increased
a little.

Would someone please suggest a value for my environment?

Thanks in advance.

Re: How to calculate the optimal value of `maintenance_manager_num_threads`

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Jason,

On Fri, Mar 24, 2017 at 1:39 AM, Jason Heo <ja...@gmail.com> wrote:

> Hi,
>
> I'm using Apache Kudu 1.2 on CDH 5.10.
>
> Recently, after reading "Bulk write performance improvements for Kudu 1.4
> <https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit>"
> I've noticed that `maintenance_manager_num_threads` is 4 for the 5
> spinning disks.
>
>
Yes, but I wouldn't take that as necessarily optimal. I'm now doing some
tests with 8 threads as a comparison point.


> In my cluster, each node has 10 SATA disks in RAID 1+0 (the WAL and data
> directories are located on the same partition). As Todd suggested, bulk
> loading is done in PK-sorted order. CPU usage and system load on my cluster
> are not high at the moment, so I think the thread count could be increased
> a little.
>
> Would someone please suggest a value for my environment?
>

Increasing the number of maintenance threads may help if you are falling
behind on compaction and flushes. For compaction, you can tell if you are
falling behind by looking at the "bloom_lookups_per_op" metric. For
flushes, you may be falling behind if you see a lot of "memory pressure
rejections". One area for improvement in our tooling is adding more scripts
and tools to make this kind of diagnosis easier.
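
As a rough illustration of the two checks described above, here is a minimal
Python sketch that scans a tablet server's metrics dump for those signals. It
is not an official tool. It assumes the metrics have already been fetched and
parsed as JSON from the tablet server's web UI, and that the dump is a list of
entities, each with a "metrics" list whose entries carry a "name" plus either
a "value" (counters) or a "mean" (histograms). The function name
`summarize_backlog` is made up for this example, and matching any metric whose
name ends in "memory_pressure_rejections" is an assumption about naming, so
check it against what your Kudu version actually reports.

```python
def summarize_backlog(entities):
    """Summarize compaction/flush backlog signals from a parsed metrics dump.

    `entities` is assumed to be a list of dicts, each with a "metrics" list.
    Histogram metrics are assumed to expose a "mean" field and counters a
    "value" field; both field names are assumptions about the JSON layout.
    """
    max_mean_bloom_lookups = 0.0
    total_rejections = 0
    for entity in entities:
        for metric in entity.get("metrics", []):
            name = metric.get("name", "")
            if name == "bloom_lookups_per_op":
                # A higher mean suggests more overlapping rowsets per lookup,
                # i.e. compaction may be falling behind.
                mean = float(metric.get("mean", 0.0))
                max_mean_bloom_lookups = max(max_mean_bloom_lookups, mean)
            elif name.endswith("memory_pressure_rejections"):
                # Writes rejected under memory pressure hint that flushes
                # may be falling behind.
                total_rejections += int(metric.get("value", 0))
    return {
        "max_mean_bloom_lookups_per_op": max_mean_bloom_lookups,
        "memory_pressure_rejections": total_rejections,
    }
```

Comparing the returned numbers before and after changing
`maintenance_manager_num_threads`, under the same load, is more informative
than any single reading.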

In general, it's a tradeoff: more maintenance manager (MM) threads mean more
resource consumption, but possibly better performance. The tradeoff may be
non-linear, though (i.e., doubling MM threads won't double performance!).
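
One way to put a number on that non-linearity when comparing two test runs is
to divide the measured speedup by the ideal linear speedup. The sketch below
is only an illustration: `scaling_efficiency` is a hypothetical helper, and
the throughput figures in the usage line are made-up placeholders, not
measurements from any cluster.

```python
def scaling_efficiency(base_threads, base_rate, new_threads, new_rate):
    """Return the fraction of ideal linear speedup actually achieved.

    1.0 means throughput scaled linearly with the thread count; values
    below 1.0 mean diminishing returns. Rates can be in any unit (for
    example rows/sec), as long as both runs use the same one.
    """
    if base_threads <= 0 or new_threads <= 0 or base_rate <= 0:
        raise ValueError("thread counts and base rate must be positive")
    measured_speedup = new_rate / base_rate
    ideal_speedup = new_threads / base_threads
    return measured_speedup / ideal_speedup


# Placeholder numbers: going from 4 to 8 threads raises throughput
# from 100,000 to 150,000 rows/sec.
print(scaling_efficiency(4, 100_000, 8, 150_000))  # 0.75
```

An efficiency well below 1.0 means the extra threads are costing more
resources than they return in throughput.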

As Kudu is still a young project, we're still gathering operational
experience from users around topics like this. It would be great if you can
share back any results you find with the community.

Thanks

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera