Posted to issues@mesos.apache.org by "Neil Conway (JIRA)" <ji...@apache.org> on 2015/12/03 22:02:10 UTC
[jira] [Updated] (MESOS-3760) Remove fragile sleep() from ProcessManager::settle()
[ https://issues.apache.org/jira/browse/MESOS-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neil Conway updated MESOS-3760:
-------------------------------
Component/s: libprocess
> Remove fragile sleep() from ProcessManager::settle()
> ----------------------------------------------------
>
> Key: MESOS-3760
> URL: https://issues.apache.org/jira/browse/MESOS-3760
> Project: Mesos
> Issue Type: Bug
> Components: libprocess
> Reporter: Neil Conway
> Priority: Minor
> Labels: mesosphere, tech-debt, testing
>
> From {{ProcessManager::settle()}}:
> {code}
> // While refactoring in order to isolate libev behind abstractions
> // it became evident that this os::sleep is vital for tests to
> // pass. In particular, there are certain tests that assume too
> // much before they attempt to do a settle. One such example is
> // tests doing http::get followed by Clock::settle, where they
> // expect the http::get will have properly enqueued a process on
> // the run queue but http::get is just sending bytes on a
> // socket. Without sleeping at the beginning of this function we
> // can get unlucky and appear settled when in actuality the
> // kernel just hasn't copied the bytes to a socket or we haven't
> // yet read the bytes and enqueued an event on a process (and the
> // process on the run queue).
> os::sleep(Milliseconds(10));
> {code}
> Sleeping for 10 milliseconds doesn't guarantee that the kernel has done anything at all; any test cases that depend on this behavior should be fixed to actually perform the necessary synchronization.
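
To illustrate what "perform the necessary synchronization" could look like, here is a minimal sketch in standard C++ only. It does not use libprocess, and the names Latch and deliverAndObserve are hypothetical, invented for this example. The idea is that the test blocks until the asynchronous side explicitly signals that its work is visible, rather than sleeping and hoping it has happened.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot latch standing in for "the event has been
// enqueued". Not a libprocess type.
class Latch {
public:
  // Called by the asynchronous side once its work is visible.
  void signal() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ready_ = true;
    }
    cv_.notify_all();
  }

  // Blocks until signal() has been called. The predicate guards
  // against spurious wakeups and against signal() running first.
  void await() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return ready_; });
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool ready_ = false;
};

// Hypothetical test helper: a worker thread "delivers" a value, and
// the caller reads it only after the worker has signalled.
int deliverAndObserve() {
  Latch delivered;
  int value = 0;

  std::thread worker([&] {
    value = 42;          // The asynchronous work.
    delivered.signal();  // Publish completion explicitly.
  });

  // No sleep here: wait for the explicit signal instead of guessing
  // how long the work might take.
  delivered.await();
  int observed = value;  // Ordered after the write by the mutex.

  worker.join();
  assert(observed == 42);
  return observed;
}
```

Unlike a fixed 10 millisecond sleep, the wait above returns as soon as the work is done and never returns before it, regardless of scheduling delays.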
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)