Posted to issues@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2015/01/10 01:15:34 UTC
[jira] [Commented] (MESOS-2079) IO.Write test is flaky on OS X 10.10.
[ https://issues.apache.org/jira/browse/MESOS-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272124#comment-14272124 ]
Benjamin Mahler commented on MESOS-2079:
----------------------------------------
It appears that on my laptop, we fairly consistently lose the race here:
{code}
// Do a write but ignore SIGPIPE so we can return an error when
// writing to a pipe or socket where the reading end is closed.
// TODO(benh): The 'suppress' macro failed to work on OS X as it
// appears that signal delivery was happening asynchronously.
// That is, the signal would not appear to be pending when the
// 'suppress' block was closed thus the destructor for
// 'Suppressor' was not waiting/removing the signal via 'sigwait'.
// It also appeared that the signal would be delivered to another
// thread even if it remained blocked in this thread. The
// workaround here is to check explicitly for EPIPE and then do
// 'sigwait' regardless of what 'os::signals::pending' returns. We
// don't have that luxury with 'Suppressor' and arbitrary signals
// because we don't always have something like EPIPE to tell us
// that a signal is (or will soon be) pending.
bool pending = os::signals::pending(SIGPIPE);
bool unblock = !pending ? os::signals::block(SIGPIPE) : false;
ssize_t length = ::write(fd, data, size);
// Save the errno so we can restore it after doing sig* functions
// below.
int errno_ = errno;
// XXX: We receive EPIPE, but before we can call sigwait to capture it
// per the TODO above, SIGPIPE is delivered to another thread.
if (length < 0 && errno == EPIPE && !pending) {
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGPIPE);
  int result;
  do {
    int ignored;
    // XXX: Too late!
    result = sigwait(&mask, &ignored);
  } while (result == -1 && errno == EINTR);
}
if (unblock) {
  os::signals::unblock(SIGPIPE);
}
{code}
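For anyone who wants to poke at this outside of libprocess, the block / write / drain sequence above can be approximated with plain POSIX calls. This is only a minimal sketch, not the libprocess code: the name 'writeWithoutSigpipe' is hypothetical, and it assumes the 'os::signals' helpers behave roughly like 'sigpending' and 'pthread_sigmask'. Unlike the snippet above, it calls 'sigwait' only when 'sigpending' reports SIGPIPE, so it cannot hang if the signal has already gone to another thread. In that case the signal is simply not drained here, which is the race described above.

```cpp
#include <cerrno>
#include <csignal>
#include <cstddef>
#include <pthread.h>
#include <unistd.h>

// Hypothetical helper (not part of libprocess): writes to 'fd' with
// SIGPIPE blocked in the calling thread. Returns 0 on success or the
// errno saved right after the write on failure.
int writeWithoutSigpipe(int fd, const void* data, size_t size)
{
  sigset_t mask, previous, before, after;
  sigemptyset(&mask);
  sigaddset(&mask, SIGPIPE);

  // Remember whether SIGPIPE was already pending so that we do not
  // consume a signal that this write did not cause.
  sigemptyset(&before);
  sigpending(&before);
  bool pending = sigismember(&before, SIGPIPE) == 1;

  // Block SIGPIPE in this thread only and keep the previous mask.
  pthread_sigmask(SIG_BLOCK, &mask, &previous);

  ssize_t length = ::write(fd, data, size);

  // Save errno before calling any further sig* functions.
  int error = (length < 0) ? errno : 0;

  if (error == EPIPE && !pending) {
    // Drain the signal only if it is visible as pending right now.
    sigemptyset(&after);
    sigpending(&after);
    if (sigismember(&after, SIGPIPE) == 1) {
      int ignored;
      sigwait(&mask, &ignored);
    }
  }

  // Restore the signal mask the caller had before.
  pthread_sigmask(SIG_SETMASK, &previous, nullptr);

  return error;
}
```

Calling this on the write end of a pipe whose read end has been closed should return EPIPE without killing the process, provided the signal is still pending for the writing thread when 'sigpending' runs.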
> IO.Write test is flaky on OS X 10.10.
> -------------------------------------
>
> Key: MESOS-2079
> URL: https://issues.apache.org/jira/browse/MESOS-2079
> Project: Mesos
> Issue Type: Task
> Components: libprocess, technical debt, test
> Environment: OS X 10.10
> {noformat}
> $ clang++ --version
> Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
> Target: x86_64-apple-darwin14.0.0
> Thread model: posix
> {noformat}
> Reporter: Benjamin Mahler
> Labels: flaky
>
> [~benjaminhindman]: If I recall correctly, this is related to MESOS-1658. Unfortunately, we don't have a stacktrace for SIGPIPE currently:
> {noformat}
> [ RUN ] IO.Write
> make[5]: *** [check-local] Broken pipe: 13
> {noformat}
> Running in gdb, seems to always occur here:
> {code}
> Program received signal SIGPIPE, Broken pipe.
> [Switching to process 56827 thread 0x60b]
> 0x00007fff9a011132 in __psynch_cvwait ()
> (gdb) where
> #0 0x00007fff9a011132 in __psynch_cvwait ()
> #1 0x00007fff903e7ea0 in _pthread_cond_wait ()
> #2 0x000000010062f27c in Gate::arrive (this=0x101908a10, old=14780) at gate.hpp:82
> #3 0x0000000100600888 in process::schedule (arg=0x0) at src/process.cpp:1373
> #4 0x00007fff903e72fc in _pthread_body ()
> #5 0x00007fff903e7279 in _pthread_start ()
> #6 0x00007fff903e54b1 in thread_start ()
> {code}