Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2017/08/10 02:05:01 UTC

[kudu-CR] rpc: hook up a callback for libev fatal errors

Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7633

Change subject: rpc: hook up a callback for libev fatal errors
......................................................................

rpc: hook up a callback for libev fatal errors

In troubleshooting a recent cluster issue, I found that the daemon had
run out of file descriptors. This caused libev to abort(), but the error
message wasn't anywhere obvious since the default implementation just
writes to stderr.

Piping this through to a GLog FATAL is more likely to result in an
obvious log message.

It's difficult to write an automated test for this, but I tested by
setting my ulimit to 10 and running rpc-test. This resulted in:

F0809 19:03:39.882194  3358 reactor.cc:108] LibEV fatal error: (libev)
error creating signal/async pipe: Too many open files [24]
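
For readers unfamiliar with the hook, here is a minimal sketch of the kind of
callback described above. It is not the code from reactor.cc: the names
FormatLibevFatal and LibevSysErr are hypothetical, and fprintf() plus abort()
stand in for a glog FATAL so the sketch builds without libev or glog. The
registration call in libev is ev_set_syserr_cb(); the message layout imitates
the log line above (message, strerror text, errno in brackets).

```cpp
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

// Builds "LibEV fatal error: <msg>: <strerror(err)> [<err>]", imitating the
// layout of the log line quoted in the commit message.
std::string FormatLibevFatal(const char* msg, int err) {
  return std::string("LibEV fatal error: ") + msg + ": " +
         std::strerror(err) + " [" + std::to_string(err) + "]";
}

// Callback with the shape libev expects for ev_set_syserr_cb():
// void (*)(const char* msg). Declared noexcept because recent libev headers
// declare the callback type that way.
void LibevSysErr(const char* msg) noexcept {
  const int err = errno;  // capture before any other call can clobber it
  // Stand-in for a glog FATAL: print the message, then abort the process.
  std::fprintf(stderr, "%s\n", FormatLibevFatal(msg, err).c_str());
  std::abort();
}

// With libev available, registration would be a single call made once,
// before any event loop is created:
//   ev_set_syserr_cb(LibevSysErr);
```

The formatting is split out from the callback only so that it can be checked
without killing the process.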

Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
---
M src/kudu/rpc/reactor.cc
1 file changed, 18 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/7633/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>

[kudu-CR] rpc: hook up a callback for libev fatal errors

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: rpc: hook up a callback for libev fatal errors
......................................................................


Patch Set 1: Code-Review+2

Gerrit-MessageType: comment
Gerrit-Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-HasComments: No

[kudu-CR] rpc: hook up a callback for libev fatal errors

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: rpc: hook up a callback for libev fatal errors
......................................................................


Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
Reviewed-on: http://gerrit.cloudera.org:8080/7633
Tested-by: Kudu Jenkins
Reviewed-by: Matthew Jacobs <mj...@cloudera.com>
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
---
M src/kudu/rpc/reactor.cc
1 file changed, 18 insertions(+), 0 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Matthew Jacobs: Looks good to me, but someone else must approve
  Kudu Jenkins: Verified



Gerrit-MessageType: merged
Gerrit-Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] rpc: hook up a callback for libev fatal errors

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: rpc: hook up a callback for libev fatal errors
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7633/2//COMMIT_MSG
Commit Message:

PS2, Line 17: It's difficult to write an automated test for this, but I tested by
            : setting my ulimit to 10 and running rpc-test.
> What about using setrlimit(RLIMIT_NOFILE, ...) to drop the max number of op
It would still need to be a death test, though, and those are somewhat painful. I didn't think this was at much risk of regressing.
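
To make the "death test" point concrete, here is a rough sketch of what such a test has to do, written with plain POSIX calls rather than gtest's EXPECT_DEATH. The names DiesWithAbort and AbortWhenOutOfFds are hypothetical, and AbortWhenOutOfFds is only a stand-in for the real code path (it calls abort() itself when pipe() fails instead of going through libev).

```cpp
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>
#include <csignal>
#include <cstdlib>

// Runs fn in a forked child and reports whether the child was killed by
// SIGABRT. The fork keeps the crash (and any rlimit change) out of the
// parent process.
bool DiesWithAbort(void (*fn)()) {
  const pid_t pid = fork();
  if (pid < 0) return false;  // fork failed
  if (pid == 0) {
    fn();
    _exit(0);  // fn returned normally: exit without aborting
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return false;
  return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}

// Hypothetical stand-in for the code under test: drop the soft fd limit to
// zero so that pipe() cannot succeed, then abort when it fails.
void AbortWhenOutOfFds() {
  struct rlimit rl;
  if (getrlimit(RLIMIT_NOFILE, &rl) != 0) _exit(1);
  rl.rlim_cur = 0;
  if (setrlimit(RLIMIT_NOFILE, &rl) != 0) _exit(1);
  int fds[2];
  if (pipe(fds) != 0) std::abort();
}
```

A caller would check DiesWithAbort(AbortWhenOutOfFds) and, as a sanity check, that a function which simply returns does not count as a death.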


Gerrit-MessageType: comment
Gerrit-Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] rpc: hook up a callback for libev fatal errors

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: rpc: hook up a callback for libev fatal errors
......................................................................


Patch Set 1: Code-Review+1

Gerrit-MessageType: comment
Gerrit-Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-HasComments: No

[kudu-CR] rpc: hook up a callback for libev fatal errors

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: rpc: hook up a callback for libev fatal errors
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7633/2//COMMIT_MSG
Commit Message:

PS2, Line 17: It's difficult to write an automated test for this, but I tested by
            : setting my ulimit to 10 and running rpc-test.
What about using setrlimit(RLIMIT_NOFILE, ...) to drop the max number of open fds? Here's some test code I wrote to play around with that: https://gist.github.com/adembo/fba94e3956a8db4ad45d3fce91106c6b.

Alternatively, what about creating an unbounded number of Messengers in a loop? That would work if each one makes libev create a pipe.
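
As an illustration of the setrlimit() idea (this is not the code from the gist above), a minimal sketch might look like the following. The helper names LowerFdLimit and ExhaustFdsWithPipes are hypothetical, and raw pipe() calls stand in for whatever would create fds in a real test, such as Messengers.

```cpp
#include <sys/resource.h>
#include <unistd.h>
#include <cerrno>
#include <vector>

// Lowers the soft RLIMIT_NOFILE to `limit` and stores the previous limits in
// *old so the caller can restore them. Only the soft limit is changed, so
// restoring does not need extra privileges.
bool LowerFdLimit(rlim_t limit, struct rlimit* old) {
  if (getrlimit(RLIMIT_NOFILE, old) != 0) return false;
  struct rlimit lowered = *old;
  lowered.rlim_cur = limit;
  return setrlimit(RLIMIT_NOFILE, &lowered) == 0;
}

// Creates pipes until pipe() fails or max_pipes is reached. Returns the errno
// of the failing call (0 if the cap was hit first) and closes every fd it
// opened before returning.
int ExhaustFdsWithPipes(int max_pipes) {
  std::vector<int> opened;
  int err = 0;
  for (int i = 0; i < max_pipes; ++i) {
    int fds[2];
    if (pipe(fds) != 0) {
      err = errno;
      break;
    }
    opened.push_back(fds[0]);
    opened.push_back(fds[1]);
  }
  for (int fd : opened) close(fd);
  return err;
}
```

With the soft limit lowered to a small value such as 16, ExhaustFdsWithPipes() should come back with EMFILE, the same errno 24 seen in the log line from the commit message. Restoring the saved limits afterwards with setrlimit(RLIMIT_NOFILE, &old) keeps the rest of the test process unaffected.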


Gerrit-MessageType: comment
Gerrit-Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes