Posted to reviews@kudu.apache.org by "Mike Percy (Code Review)" <ge...@cloudera.org> on 2018/10/18 20:25:17 UTC

[kudu-CR] thread: show ulimit nproc when thread creation fails

Hello Andrew Wong,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/11726

to review the following change.


Change subject: thread: show ulimit nproc when thread creation fails
......................................................................

thread: show ulimit nproc when thread creation fails

It seems useful to return the ulimit nproc when thread creation fails to
help an administrator or dev diagnose the cause of the fork failure.
This patch adds that info to the error Status returned from
Thread::Create().
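
For readers unfamiliar with how the nproc limit can be read from inside a
process, here is a minimal sketch using getrlimit(2) with RLIMIT_NPROC. It is
not the code from this patch: the helper name DescribeNprocLimit() is
hypothetical, and the wording of the returned string only mirrors the example
output in this message.

```cpp
#include <sys/resource.h>

#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not taken from the patch): describes the soft
// per-user process/thread limit so it can be appended to an error message.
std::string DescribeNprocLimit() {
  struct rlimit lim;
  if (getrlimit(RLIMIT_NPROC, &lim) != 0) {
    // getrlimit() itself failed; report why instead of a limit value.
    return std::string("unable to read RLIMIT_NPROC: ") + std::strerror(errno);
  }
  if (lim.rlim_cur == RLIM_INFINITY) {
    return "max number of processes for current user is unlimited";
  }
  // rlim_cur is the soft limit, the value that "ulimit -u" changes.
  return "max number of processes for current user is " +
         std::to_string(static_cast<unsigned long long>(lim.rlim_cur));
}
```

On Linux, RLIMIT_NPROC counts threads as well as processes for the real user
ID, which is why thread creation can fail with EAGAIN once the limit is hit.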

Tested manually on Linux. Example:

mpercy@mpercy-T460p:~/src/kudu/build/dynclang$ ps -efwwwL | grep mpercy | wc -l
2382
mpercy@mpercy-T460p:~/src/kudu/build/dynclang$ ulimit -u 2390
mpercy@mpercy-T460p:~/src/kudu/build/dynclang$ ./bin/kudu-tserver  --fs-wal-dir $(pwd)/wal --logtostderr
I1018 13:23:53.856669 23095 tablet_server_main.cc:78] Tablet server non-default flags:
--fs_wal_dir=/home/mpercy/src/kudu/build/dynclang/wal
--heap_profile_path=/tmp/kudu-tserver.23095
--logtostderr=true
Tablet server version:
kudu 1.9.0-SNAPSHOT
revision 53e02879885ef2e1598549b5655f610f12810011-dirty
build type DEBUG
built by mpercy at 18 Oct 2018 13:11:31 PST on mpercy-T460p
I1018 13:23:53.857304 23095 minidump.cc:237] Setting minidump size limit to 20M
F1018 13:23:53.857630 23095 kernel_stack_watchdog.cc:75] Check failed: _s.ok() Bad status: Runtime error: Could not create thread (max number of processes for current user is 2390): Resource temporarily unavailable (error 11)
*** Check failure stack trace: ***
Wrote minidump to /tmp/minidumps/kudu-tserver/5024088d-65f7-4159-e5fd33a8-1d89aa95.dmp
Wrote minidump to /tmp/minidumps/kudu-tserver/5024088d-65f7-4159-e5fd33a8-1d89aa95.dmp
*** Aborted at 1539894233 (unix time) try "date -d @1539894233" if you are using GNU date ***
PC: @     0x7f864d7f7e97 gsignal
*** SIGABRT (@0x3e800005a37) received by PID 23095 (TID 0x7f864cd52900) from PID 23095; stack trace: ***
    @     0x7f8652273890 (unknown)
    @     0x7f864d7f7e97 gsignal
    @     0x7f864d7f9801 abort
    @     0x7f865006a2d9 kudu::AbortFailureFunction()
    @     0x7f864f463d0d google::LogMessage::Fail()
    @     0x7f864f465ce4 google::LogMessage::SendToLog()
    @     0x7f864f46382d google::LogMessage::Flush()
    @     0x7f864f4666b9 google::LogMessageFatal::~LogMessageFatal()
    @     0x7f865002b304 kudu::KernelStackWatchdog::KernelStackWatchdog()
    @     0x7f865002e70a Singleton<>::CreateInstance()
    @     0x7f865002e6c9 Singleton<>::Init()
    @     0x7f864f6ef92c GoogleOnceInternalInit()
    @     0x7f86557206ed GoogleOnceInit()
    @     0x7f865002e6a7 Singleton<>::get()
    @     0x7f865002d1d9 kudu::KernelStackWatchdog::GetInstance()
    @     0x7f865002c813 kudu::KernelStackWatchdog::CreateAndRegisterTLS()
    @     0x7f8654ab35d6 kudu::KernelStackWatchdog::GetTLS()
    @     0x7f8654aaf4dc kudu::ScopedWatchKernelStack::ScopedWatchKernelStack()
    @     0x7f86500bc624 kudu::Thread::StartThread()
    @     0x7f865006bafd kudu::Thread::Create<>()
    @     0x7f865006a75b kudu::MinidumpExceptionHandler::StartUserSignalHandlerThread()
    @     0x7f865006a5ad kudu::MinidumpExceptionHandler::RegisterMinidumpExceptionHandler()
    @     0x7f865006b0b2 kudu::MinidumpExceptionHandler::MinidumpExceptionHandler()
    @     0x7f86554d88bb kudu::server::ServerBase::ServerBase()
    @     0x7f8655a20a3c kudu::kserver::KuduServer::KuduServer()
    @     0x7f865571ccf3 kudu::tserver::TabletServer::TabletServer()
    @           0x4064a2 kudu::tserver::TabletServerMain()
    @           0x4060a2 main
    @     0x7f864d7dab97 __libc_start_main
    @           0x405fba _start
Aborted (core dumped)

Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
---
M src/kudu/util/thread.cc
1 file changed, 8 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/26/11726/1
-- 
To view, visit http://gerrit.cloudera.org:8080/11726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Gerrit-Change-Number: 11726
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>

[kudu-CR] thread: show thread limit info when thread creation fails

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11726 )

Change subject: thread: show thread limit info when thread creation fails
......................................................................


Patch Set 3: Verified+1

overriding jenkins master bork


-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Gerrit-Change-Number: 11726
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 19 Oct 2018 23:16:33 +0000
Gerrit-HasComments: No

[kudu-CR] thread: show thread limit info when thread creation fails

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11726 )

Change subject: thread: show thread limit info when thread creation fails
......................................................................


Patch Set 3: Code-Review+2


-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Gerrit-Change-Number: 11726
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 18 Oct 2018 22:26:06 +0000
Gerrit-HasComments: No

[kudu-CR] thread: show thread limit info when thread creation fails

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11726 )

Change subject: thread: show thread limit info when thread creation fails
......................................................................


Patch Set 3: Code-Review+2

I'm still skeptical of the usefulness of this, but it's also harmless.


-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Gerrit-Change-Number: 11726
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 18 Oct 2018 22:55:36 +0000
Gerrit-HasComments: No

[kudu-CR] thread: show thread limit info when thread creation fails

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11726

to look at the new patch set (#2).

Change subject: thread: show thread limit info when thread creation fails
......................................................................

thread: show thread limit info when thread creation fails

It seems useful to return the number of outstanding threads and the
ulimit nproc when thread creation fails, to help an administrator
diagnose the cause of the fork()/clone() failure. This patch adds that
info to the error Status returned from Thread::Create().

That necessitated making the ReadThreadsRunning() method of
ThreadManager public.
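
As an illustration of the message shape described above, the following sketch
combines a thread count, a process limit, and an errno value into one string.
It is an assumption-laden sketch rather than the patch itself: the function
name ThreadCreateErrorMessage() and its parameters are hypothetical, and the
real change builds a kudu::Status inside Thread::Create() instead of a bare
std::string.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical formatter (not taken from the patch).
//   threads_running: number of threads tracked by the process
//   nproc_limit:     soft RLIMIT_NPROC value for the current user
//   err:             errno value reported by the failed thread creation
std::string ThreadCreateErrorMessage(uint64_t threads_running,
                                     uint64_t nproc_limit,
                                     int err) {
  return "Could not create thread (" + std::to_string(threads_running) +
         " Kudu-managed threads running in this process, " +
         std::to_string(nproc_limit) +
         " max processes allowed for current user): " +
         std::strerror(err) + " (error " + std::to_string(err) + ")";
}
```

Calling ThreadCreateErrorMessage(63, 2370, EAGAIN) yields a string with the
same layout as the failure line in the example below; the exact errno text
depends on the platform's strerror().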

Tested manually on Linux. Example:

$ ps -efwwwL | grep mpercy | wc -l
2357
$ ulimit -u 2370
$ ./bin/kudu-tserver  --fs-wal-dir $(pwd)/wal --logtostderr
...
F1018 14:16:00.312577 21557 service_pool.cc:93] Check failed: _s.ok() Bad status: Runtime error: Could not create thread (63 Kudu-managed threads running in this process, 2370 max processes allowed for current user): Resource temporarily unavailable (error 11)
*** Check failure stack trace: ***
*** Aborted at 1539897360 (unix time) try "date -d @1539897360" if you are using GNU date ***
PC: @     0x7f503d9d9e97 gsignal
*** SIGABRT (@0x3e800005435) received by PID 21557 (TID 0x7f503cf34900) from PID 21557; stack trace: ***
    @     0x7f5042455890 (unknown)
    @     0x7f503d9d9e97 gsignal
    @     0x7f503d9db801 abort
    @     0x7f504024c309 kudu::AbortFailureFunction()
    @     0x7f503f645d0d google::LogMessage::Fail()
    @     0x7f503f647ce4 google::LogMessage::SendToLog()
    @     0x7f503f64582d google::LogMessage::Flush()
    @     0x7f503f6486b9 google::LogMessageFatal::~LogMessageFatal()
    @     0x7f504172688e kudu::rpc::ServicePool::Init()
    @     0x7f50456b3eed kudu::RpcServer::RegisterService()
    @     0x7f50456bfc59 kudu::server::ServerBase::RegisterService()
    @     0x7f50459004ac kudu::tserver::TabletServer::Start()
    @           0x40683e kudu::tserver::TabletServerMain()
    @           0x4060a2 main
    @     0x7f503d9bcb97 __libc_start_main
    @           0x405fba _start
Aborted (core dumped)

Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
---
M src/kudu/util/thread.cc
1 file changed, 15 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/26/11726/2
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Gerrit-Change-Number: 11726
Gerrit-PatchSet: 2
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] thread: show thread limit info when thread creation fails

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has removed a vote on this change.

Change subject: thread: show thread limit info when thread creation fails
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Gerrit-Change-Number: 11726
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] thread: show thread limit info when thread creation fails

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11726 )

Change subject: thread: show thread limit info when thread creation fails
......................................................................

thread: show thread limit info when thread creation fails

It seems useful to return the number of outstanding threads and the
ulimit nproc when thread creation fails, to help an administrator
diagnose the cause of the fork()/clone() failure. This patch adds that
info to the error Status returned from Thread::Create().

That necessitated making the ReadThreadsRunning() method of
ThreadManager public.

Tested manually on Linux. Example:

$ ps -efwwwL | grep mpercy | wc -l
2357
$ ulimit -u 2370
$ ./bin/kudu-tserver  --fs-wal-dir $(pwd)/wal --logtostderr
...
F1018 14:16:00.312577 21557 service_pool.cc:93] Check failed: _s.ok() Bad status: Runtime error: Could not create thread (63 Kudu-managed threads running in this process, 2370 max processes allowed for current user): Resource temporarily unavailable (error 11)
*** Check failure stack trace: ***
*** Aborted at 1539897360 (unix time) try "date -d @1539897360" if you are using GNU date ***
PC: @     0x7f503d9d9e97 gsignal
*** SIGABRT (@0x3e800005435) received by PID 21557 (TID 0x7f503cf34900) from PID 21557; stack trace: ***
    @     0x7f5042455890 (unknown)
    @     0x7f503d9d9e97 gsignal
    @     0x7f503d9db801 abort
    @     0x7f504024c309 kudu::AbortFailureFunction()
    @     0x7f503f645d0d google::LogMessage::Fail()
    @     0x7f503f647ce4 google::LogMessage::SendToLog()
    @     0x7f503f64582d google::LogMessage::Flush()
    @     0x7f503f6486b9 google::LogMessageFatal::~LogMessageFatal()
    @     0x7f504172688e kudu::rpc::ServicePool::Init()
    @     0x7f50456b3eed kudu::RpcServer::RegisterService()
    @     0x7f50456bfc59 kudu::server::ServerBase::RegisterService()
    @     0x7f50459004ac kudu::tserver::TabletServer::Start()
    @           0x40683e kudu::tserver::TabletServerMain()
    @           0x4060a2 main
    @     0x7f503d9bcb97 __libc_start_main
    @           0x405fba _start
Aborted (core dumped)

Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Reviewed-on: http://gerrit.cloudera.org:8080/11726
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Mike Percy <mp...@apache.org>
---
M src/kudu/util/thread.cc
1 file changed, 19 insertions(+), 7 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Adar Dembo: Looks good to me, approved
  Mike Percy: Verified

-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Gerrit-Change-Number: 11726
Gerrit-PatchSet: 4
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] thread: show thread limit info when thread creation fails

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11726

to look at the new patch set (#3).

Change subject: thread: show thread limit info when thread creation fails
......................................................................

thread: show thread limit info when thread creation fails

It seems useful to return the number of outstanding threads and the
ulimit nproc when thread creation fails, to help an administrator
diagnose the cause of the fork()/clone() failure. This patch adds that
info to the error Status returned from Thread::Create().

That necessitated making the ReadThreadsRunning() method of
ThreadManager public.

Tested manually on Linux. Example:

$ ps -efwwwL | grep mpercy | wc -l
2357
$ ulimit -u 2370
$ ./bin/kudu-tserver  --fs-wal-dir $(pwd)/wal --logtostderr
...
F1018 14:16:00.312577 21557 service_pool.cc:93] Check failed: _s.ok() Bad status: Runtime error: Could not create thread (63 Kudu-managed threads running in this process, 2370 max processes allowed for current user): Resource temporarily unavailable (error 11)
*** Check failure stack trace: ***
*** Aborted at 1539897360 (unix time) try "date -d @1539897360" if you are using GNU date ***
PC: @     0x7f503d9d9e97 gsignal
*** SIGABRT (@0x3e800005435) received by PID 21557 (TID 0x7f503cf34900) from PID 21557; stack trace: ***
    @     0x7f5042455890 (unknown)
    @     0x7f503d9d9e97 gsignal
    @     0x7f503d9db801 abort
    @     0x7f504024c309 kudu::AbortFailureFunction()
    @     0x7f503f645d0d google::LogMessage::Fail()
    @     0x7f503f647ce4 google::LogMessage::SendToLog()
    @     0x7f503f64582d google::LogMessage::Flush()
    @     0x7f503f6486b9 google::LogMessageFatal::~LogMessageFatal()
    @     0x7f504172688e kudu::rpc::ServicePool::Init()
    @     0x7f50456b3eed kudu::RpcServer::RegisterService()
    @     0x7f50456bfc59 kudu::server::ServerBase::RegisterService()
    @     0x7f50459004ac kudu::tserver::TabletServer::Start()
    @           0x40683e kudu::tserver::TabletServerMain()
    @           0x4060a2 main
    @     0x7f503d9bcb97 __libc_start_main
    @           0x405fba _start
Aborted (core dumped)

Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
---
M src/kudu/util/thread.cc
1 file changed, 19 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/26/11726/3
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8e0bd0d0776142e8feff18bffe15e61ca1ba5816
Gerrit-Change-Number: 11726
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)