You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Chun-Hung Hsiao <ch...@apache.org> on 2018/09/19 04:52:54 UTC

Review Request 68760: Added a `terminate` function to `ContainerDaemon`.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68760/
-----------------------------------------------------------

Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.


Bugs: MESOS-9228
    https://issues.apache.org/jira/browse/MESOS-9228


Repository: mesos


Description
-------

The `terminate` function will send a kill to its monitoring container
and cause the monitor loop to stop. The loop will be stopped after
the `postStopHook` is called.


Diffs
-----

  src/slave/container_daemon.hpp a58140dd787524b0438ba4719fa7e3365ab2fa17 
  src/slave/container_daemon.cpp e1b0812b8b467ee4061b23d39552e2417d65a14a 
  src/slave/container_daemon_process.hpp a5d19a04223f6f595e97ec83962558639fd51f7b 


Diff: https://reviews.apache.org/r/68760/diff/1/


Testing
-------

make check


Thanks,

Chun-Hung Hsiao


Re: Review Request 68760: Added a `terminate` function to `ContainerDaemon`.

Posted by Chun-Hung Hsiao <ch...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68760/
-----------------------------------------------------------

(Updated Sept. 20, 2018, 4:31 a.m.)


Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.


Changes
-------

Addressed Benjamin's comments and fixed an invariant violation. Also added more checks.


Bugs: MESOS-9228
    https://issues.apache.org/jira/browse/MESOS-9228


Repository: mesos


Description
-------

The `terminate` function will send a kill to its monitoring container
and cause the monitor loop to stop. The loop will be stopped after
the `postStopHook` is called.


Diffs (updated)
-----

  src/slave/container_daemon.hpp a58140dd787524b0438ba4719fa7e3365ab2fa17 
  src/slave/container_daemon.cpp e1b0812b8b467ee4061b23d39552e2417d65a14a 
  src/slave/container_daemon_process.hpp a5d19a04223f6f595e97ec83962558639fd51f7b 


Diff: https://reviews.apache.org/r/68760/diff/2/

Changes: https://reviews.apache.org/r/68760/diff/1-2/


Testing
-------

make check


Thanks,

Chun-Hung Hsiao


Re: Review Request 68760: Added a `terminate` function to `ContainerDaemon`.

Posted by Chun-Hung Hsiao <ch...@apache.org>.

> On Sept. 19, 2018, 3:04 p.m., Benjamin Bannier wrote:
> > src/slave/container_daemon.hpp
> > Lines 72 (patched)
> > <https://reviews.apache.org/r/68760/diff/1/?file=2090236#file2090236line72>
> >
> >     Missing comma after `completed`?

No there's no missing comma. Let me rephrase it to `The looper is stopped after the `postStopHook` function is completed if `terminate` has been called, or ...`.


> On Sept. 19, 2018, 3:04 p.m., Benjamin Bannier wrote:
> > src/slave/container_daemon.cpp
> > Lines 99 (patched)
> > <https://reviews.apache.org/r/68760/diff/1/?file=2090237#file2090237line99>
> >
> >     Toggling `terminated` already here doesn't look right to me as we might fail to `post` below; I believe we should only set it if we are certain that the agent API has received the call (is the API idempotent?).

The `terminating` is also a control for `launchContainer` to break the loop before launching the next container, so it seems more correct to me to set the flag here. This is also why I added a TODO for retry. I can go implementing the retry if you are concerned loosing connections between two actors in the same agent linux process.

The invariant I'm trying to maintain here is that once `terminated` is completed, no other dispatches will go to the actor. It seems I've made a mistake below ;)


> On Sept. 19, 2018, 3:04 p.m., Benjamin Bannier wrote:
> > src/slave/container_daemon.cpp
> > Lines 119 (patched)
> > <https://reviews.apache.org/r/68760/diff/1/?file=2090237#file2090237line119>
> >
> >     We currently don't need the `defer` here, but will need to once we set `terminated` here, see above.

We need to defer here because the lambda uses `this->containerId`. But actually we should not defer here to make sure that no dispatch will go to the actor once `terminated` is completed.


> On Sept. 19, 2018, 3:04 p.m., Benjamin Bannier wrote:
> > src/slave/container_daemon.cpp
> > Lines 130-131 (patched)
> > <https://reviews.apache.org/r/68760/diff/1/?file=2090237#file2090237line130>
> >
> >     The usual pattern elsewhere seems to be to use an `onAny` instead of a `then` above, and then check in the continutation whether `response.isReady` and conditionally return early there.

It seems we use both patterns in different places and I feel that this pattern reads slightly better to me, as we don't have the conditional return.


> On Sept. 19, 2018, 3:04 p.m., Benjamin Bannier wrote:
> > src/slave/container_daemon_process.hpp
> > Lines 57 (patched)
> > <https://reviews.apache.org/r/68760/diff/1/?file=2090238#file2090238line57>
> >
> >     Nit: Should this be named `terminateContainer`/`stopContainer` for symmetry with `launchContainer`? Maybe not as we likely would want this method to always remain `public`.

I'm following this pattern:
1. Spawn the actor in the constructor.
2. Provide a `void terminate()` for the callor to stop the component.
3. Provide a `Future<Nothing> wait()` for the caller to asnychronously waits for the termination (not necessarily triggered by 2).

Alternatively, we can have the following pattern instead (which is used by the libprocess http server):
1. Provide a `Future<Nothing> run()` to spawn the actor and return a future which will be completed upon the termination of the component.
2. PRovide a `Future<Nothing> stop()` for the caller to stop the component and wait for its termination.


- Chun-Hung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68760/#review208765
-----------------------------------------------------------


On Sept. 19, 2018, 4:52 a.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68760/
> -----------------------------------------------------------
> 
> (Updated Sept. 19, 2018, 4:52 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9228
>     https://issues.apache.org/jira/browse/MESOS-9228
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The `terminate` function will send a kill to its monitoring container
> and cause the monitor loop to stop. The loop will be stopped after
> the `postStopHook` is called.
> 
> 
> Diffs
> -----
> 
>   src/slave/container_daemon.hpp a58140dd787524b0438ba4719fa7e3365ab2fa17 
>   src/slave/container_daemon.cpp e1b0812b8b467ee4061b23d39552e2417d65a14a 
>   src/slave/container_daemon_process.hpp a5d19a04223f6f595e97ec83962558639fd51f7b 
> 
> 
> Diff: https://reviews.apache.org/r/68760/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>


Re: Review Request 68760: Added a `terminate` function to `ContainerDaemon`.

Posted by Benjamin Bannier <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68760/#review208765
-----------------------------------------------------------




src/slave/container_daemon.hpp
Lines 72 (patched)
<https://reviews.apache.org/r/68760/#comment292918>

    Missing comma after `completed`?



src/slave/container_daemon.hpp
Lines 74 (patched)
<https://reviews.apache.org/r/68760/#comment292919>

    `s/he/the/`, `s/reconstruct/construct/`



src/slave/container_daemon.cpp
Lines 86 (patched)
<https://reviews.apache.org/r/68760/#comment292924>

    Nit: Since this does not depend on any parameters from this constructor we could have set this up in the class declaration,
    ```
    bool terminating = false;
    ```



src/slave/container_daemon.cpp
Lines 99 (patched)
<https://reviews.apache.org/r/68760/#comment292920>

    Toggling `terminated` already here doesn't look right to me as we might fail to `post` below; I believe we should only set it if we are certain that the agent API has received the call (is the API idempotent?).



src/slave/container_daemon.cpp
Lines 119 (patched)
<https://reviews.apache.org/r/68760/#comment292921>

    We currently don't need the `defer` here, but will need to once we set `terminated` here, see above.



src/slave/container_daemon.cpp
Lines 130-131 (patched)
<https://reviews.apache.org/r/68760/#comment292922>

    The usual pattern elsewhere seems to be to use an `onAny` instead of a `then` above, and then check in the continutation whether `response.isReady` and conditionally return early there.



src/slave/container_daemon_process.hpp
Lines 57 (patched)
<https://reviews.apache.org/r/68760/#comment292923>

    Nit: Should this be named `terminateContainer`/`stopContainer` for symmetry with `launchContainer`? Maybe not as we likely would want this method to always remain `public`.


- Benjamin Bannier


On Sept. 19, 2018, 6:52 a.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68760/
> -----------------------------------------------------------
> 
> (Updated Sept. 19, 2018, 6:52 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9228
>     https://issues.apache.org/jira/browse/MESOS-9228
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The `terminate` function will send a kill to its monitoring container
> and cause the monitor loop to stop. The loop will be stopped after
> the `postStopHook` is called.
> 
> 
> Diffs
> -----
> 
>   src/slave/container_daemon.hpp a58140dd787524b0438ba4719fa7e3365ab2fa17 
>   src/slave/container_daemon.cpp e1b0812b8b467ee4061b23d39552e2417d65a14a 
>   src/slave/container_daemon_process.hpp a5d19a04223f6f595e97ec83962558639fd51f7b 
> 
> 
> Diff: https://reviews.apache.org/r/68760/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>