Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2018/03/07 00:52:08 UTC

[kudu-CR] KUDU-1913: cap number of threads on server-wide pools

Hello David Ribeiro Alves, Todd Lipcon,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/9522

to review the following change.


Change subject: KUDU-1913: cap number of threads on server-wide pools
......................................................................

KUDU-1913: cap number of threads on server-wide pools

The last remaining piece of work is to do away with the unbounded number of
threads that may be started in the Raft and Prepare server-wide threadpools.
These caps make it easier for admins to reason about appropriate values for
the configuration of the Kudu processes' RLIMIT_NPROC resource.

KUDU-1913 proposed a cap of "number of cores + number of disks", but a
lively Slack discussion yielded a better solution: set the cap at some
percentage of the process' RLIMIT_NPROC value. Given that the rest of Kudu
generally uses a constant number of threads, this should prevent spikes from
ever exceeding the RLIMIT_NPROC and crashing the server due to an election
storm. This patch implements a cap of 10% per pool and also provides a new
gflag as an "escape hatch" (in case we were horribly wrong).

Note: it's still possible for a massive number of "hot" replicas to exceed
RLIMIT_NPROC by virtue of each replica's log append thread, but the server
is more likely to run out of memory for MemRowSets before that happens.

Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
---
M src/kudu/kserver/kserver.cc
1 file changed, 55 insertions(+), 11 deletions(-)
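
For readers without the diff at hand, here is a minimal sketch of the capping rule the commit message describes: each pool gets a fixed percentage of the process's RLIMIT_NPROC soft limit, unless an override is supplied. This is not the code from kserver.cc. The function names, the convention that a non-positive override means "derive the cap automatically", and the choice to treat an unlimited RLIMIT_NPROC as "no cap" are all assumptions made for illustration. It also assumes a platform that defines RLIMIT_NPROC, such as Linux.

```cpp
#include <sys/resource.h>

#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: reads the soft RLIMIT_NPROC of the current process.
// A failed getrlimit() call is treated as "no limit" (an assumption).
rlim_t ReadNprocSoftLimit() {
  struct rlimit lim;
  if (getrlimit(RLIMIT_NPROC, &lim) != 0) {
    return RLIM_INFINITY;
  }
  return lim.rlim_cur;
}

// Hypothetical helper: computes the maximum thread count for one pool.
//
// flag_override stands in for the "escape hatch" gflag: a positive value is
// used verbatim; any other value means "derive the cap from the limit".
// percent is the share of the limit granted to a single pool (10 in the
// commit message).
uint64_t ComputePoolThreadCap(int64_t flag_override,
                              rlim_t nproc_soft_limit,
                              int percent = 10) {
  assert(percent > 0 && percent <= 100);
  if (flag_override > 0) {
    return static_cast<uint64_t>(flag_override);
  }
  if (nproc_soft_limit == RLIM_INFINITY) {
    // No process-wide limit to protect, so leave the pool unbounded.
    return std::numeric_limits<uint64_t>::max();
  }
  // floor(limit * percent / 100), split into quotient and remainder so the
  // multiplication cannot overflow for very large limits.
  const uint64_t limit = static_cast<uint64_t>(nproc_soft_limit);
  const uint64_t p = static_cast<uint64_t>(percent);
  const uint64_t cap = (limit / 100) * p + ((limit % 100) * p) / 100;
  // Always allow at least one thread so the pool can make progress.
  return std::max<uint64_t>(1, cap);
}
```

Under these assumptions, a soft limit of 1024 yields a cap of 102 threads per pool, so the Raft and Prepare pools together could use at most 204 threads, about 20% of the limit. Calling ComputePoolThreadCap(-1, ReadNprocSoftLimit()) would derive the cap for the running process.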



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/9522/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9522
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
Gerrit-Change-Number: 9522
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1913: cap number of threads on server-wide pools

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9522 )

Change subject: KUDU-1913: cap number of threads on server-wide pools
......................................................................

KUDU-1913: cap number of threads on server-wide pools

The last piece of work is to establish an upper bound on the number of
threads that may be started in the Raft and Prepare server-wide threadpools.
Such caps will make it easier for admins to reason about appropriate values
for the configuration of the Kudu processes' RLIMIT_NPROC resource.

KUDU-1913 proposed a cap of "number of cores + number of disks", but a
lively Slack discussion yielded a better solution: set the cap at some
percentage of the process' RLIMIT_NPROC value. Given that the rest of Kudu
generally uses a constant number of threads, this should prevent spikes from
ever exceeding the RLIMIT_NPROC and crashing the server due to an election
storm. This patch implements a cap of 10% per pool and also provides a new
gflag as an "escape hatch" (in case we were horribly wrong).

Note: it's still possible for a massive number of "hot" replicas to exceed
RLIMIT_NPROC by virtue of each replica's log append thread, but the server
is more likely to run out of memory for MemRowSets before that happens.

Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
Reviewed-on: http://gerrit.cloudera.org:8080/9522
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/kserver/kserver.cc
1 file changed, 57 insertions(+), 11 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  David Ribeiro Alves: Looks good to me, approved
  Todd Lipcon: Looks good to me, but someone else must approve

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
Gerrit-Change-Number: 9522
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1913: cap number of threads on server-wide pools

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9522 )

Change subject: KUDU-1913: cap number of threads on server-wide pools
......................................................................


Patch Set 2: Code-Review+1


Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
Gerrit-Change-Number: 9522
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 08 Mar 2018 02:36:42 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-1913: cap number of threads on server-wide pools

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9522

to look at the new patch set (#2).

Change subject: KUDU-1913: cap number of threads on server-wide pools
......................................................................

KUDU-1913: cap number of threads on server-wide pools

The last piece of work is to establish an upper bound on the number of
threads that may be started in the Raft and Prepare server-wide threadpools.
Such caps will make it easier for admins to reason about appropriate values
for the configuration of the Kudu processes' RLIMIT_NPROC resource.

KUDU-1913 proposed a cap of "number of cores + number of disks", but a
lively Slack discussion yielded a better solution: set the cap at some
percentage of the process' RLIMIT_NPROC value. Given that the rest of Kudu
generally uses a constant number of threads, this should prevent spikes from
ever exceeding the RLIMIT_NPROC and crashing the server due to an election
storm. This patch implements a cap of 10% per pool and also provides a new
gflag as an "escape hatch" (in case we were horribly wrong).

Note: it's still possible for a massive number of "hot" replicas to exceed
RLIMIT_NPROC by virtue of each replica's log append thread, but the server
is more likely to run out of memory for MemRowSets before that happens.

Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
---
M src/kudu/kserver/kserver.cc
1 file changed, 57 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/9522/2

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
Gerrit-Change-Number: 9522
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1913: cap number of threads on server-wide pools

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9522 )

Change subject: KUDU-1913: cap number of threads on server-wide pools
......................................................................


Patch Set 2: Code-Review+2


Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
Gerrit-Change-Number: 9522
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 07 Mar 2018 21:41:13 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-1913: cap number of threads on server-wide pools

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9522 )

Change subject: KUDU-1913: cap number of threads on server-wide pools
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9522/1/src/kudu/kserver/kserver.cc
File src/kudu/kserver/kserver.cc:

http://gerrit.cloudera.org:8080/#/c/9522/1/src/kudu/kserver/kserver.cc@39
PS1, Line 39: server_thread_pool_thread_limit
LGTM. My only minor nit is the name of the flag, server_thread_pool_thread_limit. How about server_thread_pool_max_count or server_thread_pool_max_thread_count?
I don't feel strongly about it, so it's fine by me if folks like the current name.



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
Gerrit-Change-Number: 9522
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 07 Mar 2018 01:15:46 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1913: cap number of threads on server-wide pools

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9522 )

Change subject: KUDU-1913: cap number of threads on server-wide pools
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9522/1/src/kudu/kserver/kserver.cc
File src/kudu/kserver/kserver.cc:

http://gerrit.cloudera.org:8080/#/c/9522/1/src/kudu/kserver/kserver.cc@39
PS1, Line 39: server_thread_pool_thread_limit
> lgtm, only minor nit is the name of the flag server_thread_pool_thread_limi
Sure, I'll change it to server_thread_pool_max_thread_count.



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618
Gerrit-Change-Number: 9522
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 07 Mar 2018 21:04:57 +0000
Gerrit-HasComments: Yes