You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mesos.apache.org by vi...@apache.org on 2017/01/24 22:58:47 UTC

mesos git commit: Fixed bug allowing IOSwitchboard::connect() after container destruction.

Repository: mesos
Updated Branches:
  refs/heads/master 26923a1de -> 7f83b6e34


Fixed bug allowing IOSwitchboard::connect() after container destruction.

Review: https://reviews.apache.org/r/55810/


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/7f83b6e3
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/7f83b6e3
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/7f83b6e3

Branch: refs/heads/master
Commit: 7f83b6e34a2adef79a427e54f7a2dd4d5afbc1f5
Parents: 26923a1
Author: Kevin Klues <kl...@gmail.com>
Authored: Tue Jan 24 14:58:33 2017 -0800
Committer: Vinod Kone <vi...@gmail.com>
Committed: Tue Jan 24 14:58:33 2017 -0800

----------------------------------------------------------------------
 src/slave/containerizer/mesos/io/switchboard.cpp | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/7f83b6e3/src/slave/containerizer/mesos/io/switchboard.cpp
----------------------------------------------------------------------
diff --git a/src/slave/containerizer/mesos/io/switchboard.cpp b/src/slave/containerizer/mesos/io/switchboard.cpp
index 4b134f8..1b8f490 100644
--- a/src/slave/containerizer/mesos/io/switchboard.cpp
+++ b/src/slave/containerizer/mesos/io/switchboard.cpp
@@ -695,7 +695,7 @@ Future<http::Connection> IOSwitchboard::connect(
       })
     .then(defer(self(), [=]() -> Future<http::Connection> {
       if (!infos.contains(containerId)) {
-        return Failure("Container has or is being destroyed");
+        return Failure("I/O switchboard has shutdown");
       }
 
       // TODO(jieyu): We might still get a connection refused error
@@ -758,8 +758,6 @@ Future<Nothing> IOSwitchboard::cleanup(
   Option<pid_t> pid = infos[containerId]->pid;
   Future<Option<int>> status = infos[containerId]->status;
 
-  infos.erase(containerId);
-
   // If we have a pid, then we attempt to send it a SIGTERM to have it
   // shutdown gracefully. This is best effort, as it's likely that the
   // switchboard has already shutdown in the common case.
@@ -794,6 +792,19 @@ Future<Nothing> IOSwitchboard::cleanup(
   // DISCARDED cases as well.
   return await(list<Future<Option<int>>>{status}).then(
       defer(self(), [this, containerId]() -> Future<Nothing> {
+        // We only remove the 'containerId from our info struct once
+        // we are sure that the I/O switchboard has shutdown. If we
+        // removed it any earlier, attempts to connect to the I/O
+        // switchboard would fail.
+        //
+        // NOTE: One caveat of this approach is that this lambda will
+        // be invoked multiple times if `cleanup()` is called multiple
+        // times before the first instance of it is triggered. This is
+        // OK for now because the logic below has no side effects. If
+        // the logic below gets more complicated, we may need to
+        // revisit this approach.
+        infos.erase(containerId);
+
         // Best effort removal of the unix domain socket file created for
         // this container's `IOSwitchboardServer`. If it hasn't been
         // checkpointed yet, or the socket file itself hasn't been created,


Re: mesos git commit: Fixed bug allowing IOSwitchboard::connect() after container destruction.

Posted by Jie Yu <yu...@gmail.com>.
If 'cleanup' is invoked twice, that might cause signal being sent twice?

- Jie

On Tue, Jan 24, 2017 at 2:58 PM, <vi...@apache.org> wrote:

> Repository: mesos
> Updated Branches:
>   refs/heads/master 26923a1de -> 7f83b6e34
>
>
> Fixed bug allowing IOSwitchboard::connect() after container destruction.
>
> Review: https://reviews.apache.org/r/55810/
>
>
> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/7f83b6e3
> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/7f83b6e3
> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/7f83b6e3
>
> Branch: refs/heads/master
> Commit: 7f83b6e34a2adef79a427e54f7a2dd4d5afbc1f5
> Parents: 26923a1
> Author: Kevin Klues <kl...@gmail.com>
> Authored: Tue Jan 24 14:58:33 2017 -0800
> Committer: Vinod Kone <vi...@gmail.com>
> Committed: Tue Jan 24 14:58:33 2017 -0800
>
> ----------------------------------------------------------------------
>  src/slave/containerizer/mesos/io/switchboard.cpp | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> ----------------------------------------------------------------------
>
>
> http://git-wip-us.apache.org/repos/asf/mesos/blob/7f83b6e3/
> src/slave/containerizer/mesos/io/switchboard.cpp
> ----------------------------------------------------------------------
> diff --git a/src/slave/containerizer/mesos/io/switchboard.cpp
> b/src/slave/containerizer/mesos/io/switchboard.cpp
> index 4b134f8..1b8f490 100644
> --- a/src/slave/containerizer/mesos/io/switchboard.cpp
> +++ b/src/slave/containerizer/mesos/io/switchboard.cpp
> @@ -695,7 +695,7 @@ Future<http::Connection> IOSwitchboard::connect(
>        })
>      .then(defer(self(), [=]() -> Future<http::Connection> {
>        if (!infos.contains(containerId)) {
> -        return Failure("Container has or is being destroyed");
> +        return Failure("I/O switchboard has shutdown");
>        }
>
>        // TODO(jieyu): We might still get a connection refused error
> @@ -758,8 +758,6 @@ Future<Nothing> IOSwitchboard::cleanup(
>    Option<pid_t> pid = infos[containerId]->pid;
>    Future<Option<int>> status = infos[containerId]->status;
>
> -  infos.erase(containerId);
> -
>    // If we have a pid, then we attempt to send it a SIGTERM to have it
>    // shutdown gracefully. This is best effort, as it's likely that the
>    // switchboard has already shutdown in the common case.
> @@ -794,6 +792,19 @@ Future<Nothing> IOSwitchboard::cleanup(
>    // DISCARDED cases as well.
>    return await(list<Future<Option<int>>>{status}).then(
>        defer(self(), [this, containerId]() -> Future<Nothing> {
> +        // We only remove the 'containerId from our info struct once
> +        // we are sure that the I/O switchboard has shutdown. If we
> +        // removed it any earlier, attempts to connect to the I/O
> +        // switchboard would fail.
> +        //
> +        // NOTE: One caveat of this approach is that this lambda will
> +        // be invoked multiple times if `cleanup()` is called multiple
> +        // times before the first instance of it is triggered. This is
> +        // OK for now because the logic below has no side effects. If
> +        // the logic below gets more complicated, we may need to
> +        // revisit this approach.
> +        infos.erase(containerId);
> +
>          // Best effort removal of the unix domain socket file created for
>          // this container's `IOSwitchboardServer`. If it hasn't been
>          // checkpointed yet, or the socket file itself hasn't been
> created,
>
>