You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Benjamin Hindman (JIRA)" <ji...@apache.org> on 2015/05/20 02:43:00 UTC

[jira] [Issue Comment Deleted] (MESOS-2735) Change the interaction between the slave and the resource estimator from polling to pushing

     [ https://issues.apache.org/jira/browse/MESOS-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Hindman updated MESOS-2735:
------------------------------------
    Comment: was deleted

(was: There are definitely differences in message queue behavior, one of which is significantly safer than the other. There are two safety concerns that I can think of, one of which [~jieyu] has addressed here but I'll repeat to be sure I properly understood.

(1) Someone might write a ResourceEstimator that isn't asynchronous, causing the slave to "block" while the resource estimator estimates.

(2) The ResourceEstimator might cause a denial of service attack on the slave.

I understand the concern with (1) but I'm not too anxious about it. Why? It should be trivial to make a wrapper module which forces people to implement the ResourceEstimator to be asynchronous, either using `async` like you suggested or implementing a version of ResourceEstimator which wraps an actor (libprocess process). We'll only need to do this once and then other ResourceEstimator implementations can leverage this stuff.

On the other hand, I don't like the behavior of push because of (2). Fundamentally, if the slave can't keep up with the rate at which the ResourceEstimator is pushing then we could create a denial of service issue with the slave, i.e., it takes a long time to process non-ResourceEstimator messages because it's queue is full of just ResourceEstimator messages. I'm more anxious about (2) than (1) because it's harder to find bugs in (2) than with (1) since once you fix (1) it stays fixed forever but any time you updated the algorithm you impact the potential to cause (2).

Now, I acknowledge that implementing this as a pull versus push will make the implementation in the ResourceEstimator slightly more complicated, but not really. In particular, it should be trivial to always use a `Queue` to achieve the push semantics in any ResourceEstimator implementation, while still providing the pull semantics externally. Make sense?

Finally, one of the advantages of the pull model is that it's easier to reason about because we don't have "anonymous" lambdas that cause execution in some other random place in the code (i.e., you can easily see in the slave where the future that gets returned from `ResourceEstimator::estimate()` gets handled). In addition, the ResourceEstimator remains "functional" in the sense that it just has to return some value (or a future) from it's functions versus invoking some callback that causes something to get run some other place (and in fact, may also block, so isn't it safer for the ResourceEstimator to invoke the callback in it's own `async`?).

The invocation of the `ResourceEstimator::estimate()` followed by the `.then` is a nice pattern that let's us compose with other things as well, which is harder to do with the lambda style callbacks and why we've avoided it where we've been able (in fact, I'm curious which place in the code are you imitating here?).)

> Change the interaction between the slave and the resource estimator from polling to pushing 
> --------------------------------------------------------------------------------------------
>
>                 Key: MESOS-2735
>                 URL: https://issues.apache.org/jira/browse/MESOS-2735
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Jie Yu
>            Assignee: Jie Yu
>              Labels: twitter
>
> This will make the semantics more clear. The resource estimator can control the speed of sending resources estimation to the slave.
> To avoid cyclic dependency, slave will register a callback with the resource estimator and the resource estimator will simply invoke that callback when there's a new estimation ready. The callback will be a defer to the slave's main event queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)