Posted to issues@mesos.apache.org by "Till Toenshoff (JIRA)" <ji...@apache.org> on 2018/06/28 11:16:00 UTC

[jira] [Assigned] (MESOS-8568) Command checks should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`

     [ https://issues.apache.org/jira/browse/MESOS-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Toenshoff reassigned MESOS-8568:
-------------------------------------

    Assignee:     (was: Benno Evers)

> Command checks should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`
> ------------------------------------------------------------------------------------------
>
>                 Key: MESOS-8568
>                 URL: https://issues.apache.org/jira/browse/MESOS-8568
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Andrei Budnik
>            Priority: Blocker
>              Labels: default-executor, health-check, mesosphere
>
> After the checker library successfully launches a nested container via `LAUNCH_NESTED_CONTAINER_SESSION`, it calls [waitNestedContainer|https://github.com/apache/mesos/blob/0a40243c6a35dc9dc41774d43ee3c19cdf9e54be/src/checks/checker_process.cpp#L657] for that container. Before launching a nested container for a subsequent check, the library [calls|https://github.com/apache/mesos/blob/0a40243c6a35dc9dc41774d43ee3c19cdf9e54be/src/checks/checker_process.cpp#L466-L487] `REMOVE_NESTED_CONTAINER` to remove the previous one. On the success path, the `REMOVE_NESTED_CONTAINER` call therefore follows `WAIT_NESTED_CONTAINER`, which ensures that the nested container has terminated and can be removed and cleaned up.
> If the launch fails, the library [doesn't call|https://github.com/apache/mesos/blob/0a40243c6a35dc9dc41774d43ee3c19cdf9e54be/src/checks/checker_process.cpp#L627-L636] `WAIT_NESTED_CONTAINER`. Despite the reported failure, the container might still have been launched, and the subsequent attempt to remove it without first calling `WAIT_NESTED_CONTAINER` leads to errors like:
> {code:java}
> W0202 20:03:08.895830 7 checker_process.cpp:503] Received '500 Internal Server Error' (Nested container has not terminated yet) while removing the nested container '2b0c542c-1f5f-42f7-b914-2c1cadb4aeca.da0a7cca-516c-4ec9-b215-b34412b670fa.check-49adc5f1-37a3-4f26-8708-e27d2d6cd125' used for the COMMAND check for task 'node-0-server__e26a82b0-fbab-46a0-a1ea-e7ac6cfa4c91
> {code}
> The checker library should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`.
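
The ordering requested above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the checker library's code and not the Mesos agent API: `FakeAgent`, `AgentError`, and `cleanup_previous_check` are hypothetical names, and the fake `wait_nested_container` marks the container as terminated immediately, whereas a real wait returns only once the container has exited.

```python
from dataclasses import dataclass, field


class AgentError(Exception):
    """Hypothetical stand-in for a non-200 response from the agent."""


@dataclass
class FakeAgent:
    """Hypothetical in-memory model of the agent; not the Mesos API."""

    running: set = field(default_factory=set)
    calls: list = field(default_factory=list)

    def launch_nested_container_session(self, container_id, fail_response=False):
        self.calls.append(("LAUNCH_NESTED_CONTAINER_SESSION", container_id))
        # The container may be launched even if the caller sees a failure.
        self.running.add(container_id)
        if fail_response:
            raise AgentError("launch reported as failed")

    def wait_nested_container(self, container_id):
        self.calls.append(("WAIT_NESTED_CONTAINER", container_id))
        # Simplification: treat the wait as returning after termination.
        self.running.discard(container_id)

    def remove_nested_container(self, container_id):
        self.calls.append(("REMOVE_NESTED_CONTAINER", container_id))
        if container_id in self.running:
            raise AgentError(
                "500 Internal Server Error: "
                "Nested container has not terminated yet"
            )


def cleanup_previous_check(agent, container_id):
    """Wait for termination before removing, whatever the launch outcome was."""
    agent.wait_nested_container(container_id)
    agent.remove_nested_container(container_id)
```

In this model, calling `remove_nested_container` directly after a launch that was reported as failed raises the "has not terminated yet" error, while `cleanup_previous_check` succeeds because the wait always comes first.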


