You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org> on 2018/01/12 02:53:09 UTC

[kudu-CR] KUDU-2208 Refine unit test and move RETRY ON EINTR() to os-util.h

Jeffrey F. Lukman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9015


Change subject: KUDU-2208 Refine unit test and move RETRY_ON_EINTR() to os-util.h
......................................................................

KUDU-2208 Refine unit test and move RETRY_ON_EINTR() to os-util.h

This patch refines some nits and also change unit test scenario, such that
the send kill signals are sent continously without any delay
until the kill signals succeeded thousands of times. In the buggy code,
the subprocess_thread will throws Interrupted system call error.

Furthermore, this bug also move the RETRY_ON_EINTR() function
that occurs in many places across the system into the os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 25 insertions(+), 45 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 1
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#8).

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................

KUDU-2208 Add unit test to detect Subprocess Interruption
Handling failure and patch to fix the bug

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Since the pthread_kill() signal is sent
asynchronously, sometimes the pthread_kill() raises error signal 3.
In that condition, the unit test logs the incident as an INFO.

Prior to this bug fix patch, the Subprocess throws an exception
because it cannot handle the kill signal in Wait() state. To fix the bug,
we add RETRY_ON_EINTR() inside Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 69 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/8
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 8
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Refine unit test and move RETRY ON EINTR() to os-util.h

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9015 )

Change subject: KUDU-2208 Refine unit test and move RETRY_ON_EINTR() to os-util.h
......................................................................


Patch Set 1:

Can you merge this into the previous commit? eg use rebase -i and squash the two commits? That way there is just one review. (gerrit doesn't really handle patch series)


-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 1
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 12 Jan 2018 05:25:15 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2208 Add RETRY ON EINTR() to Subprocess

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9015 )

Change subject: KUDU-2208 Add RETRY_ON_EINTR() to Subprocess
......................................................................


Patch Set 12: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 12
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 17 Jan 2018 07:30:34 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#5).

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................

KUDU-2208 Add unit test to detect Subprocess Interruption
Handling failure and patch to fix the bug

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Prior to the bug fix in this patch,
the Subprocess throws an exception because it cannot handle
the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR()
at Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 60 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/5
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 5
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add RETRY ON EINTR() to Subprocess

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9015 )

Change subject: KUDU-2208 Add RETRY_ON_EINTR() to Subprocess
......................................................................

KUDU-2208 Add RETRY_ON_EINTR() to Subprocess

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Since the pthread_kill() signal is sent
asynchronously, sometimes the pthread_kill() raises error signal 3.
In that condition, the unit test logs the incident as an INFO.

Prior to this bug fix patch, the Subprocess throws an exception
because it cannot handle the kill signal in Wait() state. To fix the bug,
we add RETRY_ON_EINTR() inside Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h and it adds
alarm reset in the end of TestReadFromStdoutAndStderr in
Subprocess-test.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Reviewed-on: http://gerrit.cloudera.org:8080/9015
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 77 insertions(+), 22 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Todd Lipcon: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 13
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9015 )

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/os-util.h
File src/kudu/util/os-util.h:

http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/os-util.h@33
PS5, Line 33:   static_assert(std::is_signed<decltype(err)>::value, \
need to #include <type_traits> in this header so that we get access to std::is_signed. Also whatever header has EINTR in it (not sure which one)


http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@28
PS5, Line 28: #include <pthread.h>
per Alexey's comment please separate out the C includes from the C++ ones https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes


http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@312
PS5, Line 312:     t_started = true;
still think we should add a short sleep here so that we ensure that the signal-sending starts before we call p.Start


http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@313
PS5, Line 313:     ASSERT_OK(p.Start());
nit: we have to use CHECK_OK here because gtest isn't thread-safe to run assertions from non-main threads


http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@332
PS5, Line 332:       // Add delay to avoid user defined signal 2 error
but the "user defined signal 2 error" is the bug we're trying to provoke here. So if we are still seeing a crash due to it, then that means we haven't fully fixed the bug, right?



-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 5
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 12 Jan 2018 20:56:24 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Jeffrey F. Lukman has posted comments on this change. ( http://gerrit.cloudera.org:8080/9015 )

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/os-util.h
File src/kudu/util/os-util.h:

http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/os-util.h@33
PS5, Line 33: #define RETRY_ON_EINTR(err, expr) do { \
> need to #include <type_traits> in this header so that we get access to std:
Done


http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@28
PS5, Line 28: #include <memory>
> per Alexey's comment please separate out the C includes from the C++ ones h
Done


http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@312
PS5, Line 312:     t_started = true;
> still think we should add a short sleep here so that we ensure that the sig
Done


http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@313
PS5, Line 313:     SleepFor(MonoDelta::FromMilliseconds(50));
> nit: we have to use CHECK_OK here because gtest isn't thread-safe to run as
Done


http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@332
PS5, Line 332:       ASSERT_TRUE(pthread_kill(t, SIGUSR2) == 0);
> but the "user defined signal 2 error" is the bug we're trying to provoke he
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 6
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 12 Jan 2018 23:51:13 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2208 Add RETRY ON EINTR() to Subprocess

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#9).

Change subject: KUDU-2208 Add RETRY_ON_EINTR() to Subprocess
......................................................................

KUDU-2208 Add RETRY_ON_EINTR() to Subprocess

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Since the pthread_kill() signal is sent
asynchronously, sometimes the pthread_kill() raises error signal 3.
In that condition, the unit test logs the incident as an INFO.

Prior to this bug fix patch, the Subprocess throws an exception
because it cannot handle the kill signal in Wait() state. To fix the bug,
we add RETRY_ON_EINTR() inside Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 73 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/9
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 9
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#3).

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................

KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Prior to the bug fix in this patch,
the Subprocess throws an exception because it cannot handle
the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR()
at Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 57 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/3
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 3
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add RETRY ON EINTR() to Subprocess

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Jeffrey F. Lukman has posted comments on this change. ( http://gerrit.cloudera.org:8080/9015 )

Change subject: KUDU-2208 Add RETRY_ON_EINTR() to Subprocess
......................................................................


Patch Set 12:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/os-util.h
File src/kudu/util/os-util.h:

http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/os-util.h@27
PS10, Line 27: 
> should probably #include <errno.h> here since EINTR is defined there
Done


http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/subprocess-test.cc@338
PS10, Line 338:       ASSERT_TRUE(err == 0 || err == ESRCH);
> nit: I think this would read better as ASSERT_TRUE(err == 0 || err == ESRCH
Done


http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/subprocess-test.cc@340
PS10, Line 340:         LOG(INFO) << "Async kill signal failed with err=" << err <<
> can you add an ASSERT_TRUE(t_finished) here? we should only get ESRCH in th
Done


http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/subprocess-test.cc@344
PS10, Line 344:       // Add microseconds delay to make the unit test runs faster and more reliable
> can you change this to rand() % 1? It still runs quickly on my machine with
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 12
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 17 Jan 2018 06:44:32 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#6).

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................

KUDU-2208 Add unit test to detect Subprocess Interruption
Handling failure and patch to fix the bug

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Prior to the bug fix in this patch,
the Subprocess throws an exception because it cannot handle
the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR()
at Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 60 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/6
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 6
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#7).

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................

KUDU-2208 Add unit test to detect Subprocess Interruption
Handling failure and patch to fix the bug

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Since the pthread_kill() signal is sent
asynchronously, sometimes the pthread_kill() raises error signal 3.
In that condition, the unit test logs the incident as an INFO.

Prior to this bug fix patch, the Subprocess throws an exception
because it cannot handle the kill signal in Wait() state. To fix the bug,
we add RETRY_ON_EINTR() inside Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 65 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/7
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 7
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#4).

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................

KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Prior to the bug fix in this patch,
the Subprocess throws an exception because it cannot handle
the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR()
at Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 60 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/4
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 4
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#2).

Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
......................................................................

KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Prior to the bug fix in this patch,
the Subprocess throws an exception because it cannot handle
the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR()
at Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 57 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/2
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 2
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add RETRY ON EINTR() to Subprocess

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9015 )

Change subject: KUDU-2208 Add RETRY_ON_EINTR() to Subprocess
......................................................................


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/os-util.h
File src/kudu/util/os-util.h:

http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/os-util.h@27
PS10, Line 27: #include <string>
should probably #include <errno.h> here since EINTR is defined there


http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/subprocess-test.cc@338
PS10, Line 338:       ASSERT_FALSE(err != 0 && err != ESRCH);
nit: I think this would read better as ASSERT_TRUE(err == 0 || err == ESRCH)


http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/subprocess-test.cc@340
PS10, Line 340:         LOG(INFO) << "Async kill signal failed with err=" << err <<
can you add an ASSERT_TRUE(t_finished) here? we should only get ESRCH in the case where the thread finished


http://gerrit.cloudera.org:8080/#/c/9015/10/src/kudu/util/subprocess-test.cc@344
PS10, Line 344:       SleepFor(MonoDelta::FromMicroseconds(rand() % 100));
can you change this to rand() % 1? It still runs quickly on my machine with that setting and seems more likely to produce races



-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 10
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 17 Jan 2018 06:16:51 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2208 Add RETRY ON EINTR() to Subprocess

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#10).

Change subject: KUDU-2208 Add RETRY_ON_EINTR() to Subprocess
......................................................................

KUDU-2208 Add RETRY_ON_EINTR() to Subprocess

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Since the pthread_kill() signal is sent
asynchronously, sometimes the pthread_kill() raises error signal 3.
In that condition, the unit test logs the incident as an INFO.

Prior to this bug fix patch, the Subprocess throws an exception
because it cannot handle the kill signal in Wait() state. To fix the bug,
we add RETRY_ON_EINTR() inside Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h and it adds
alarm reset in the end of TestReadFromStdoutAndStderr in
Subprocess-test.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 74 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/10
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 10
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2208 Add RETRY ON EINTR() to Subprocess

Posted by "Jeffrey F. Lukman (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9015

to look at the new patch set (#12).

Change subject: KUDU-2208 Add RETRY_ON_EINTR() to Subprocess
......................................................................

KUDU-2208 Add RETRY_ON_EINTR() to Subprocess

This patch submits a unit test that create a Subprocess thread
and while it is starting and waiting, another thread sends kill signals
to the Subprocess thread. Since the pthread_kill() signal is sent
asynchronously, sometimes the pthread_kill() raises error signal 3.
In that condition, the unit test logs the incident as an INFO.

Prior to this bug fix patch, the Subprocess throws an exception
because it cannot handle the kill signal in Wait() state. To fix the bug,
we add RETRY_ON_EINTR() inside Subprocess::DoWait() function.

Furthermore, this patch also moves the RETRY_ON_EINTR() function
that occurs in many places across the code to os-util.h and it adds
alarm reset in the end of TestReadFromStdoutAndStderr in
Subprocess-test.

Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
---
M src/kudu/consensus/log_index.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.h
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
6 files changed, 77 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/12
-- 
To view, visit http://gerrit.cloudera.org:8080/9015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333
Gerrit-Change-Number: 9015
Gerrit-PatchSet: 12
Gerrit-Owner: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Jeffrey F. Lukman <je...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>