You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mesos.apache.org by al...@apache.org on 2018/09/21 13:01:02 UTC

[mesos] branch 1.5.x updated (ea056c0 -> 0216002)

This is an automated email from the ASF dual-hosted git repository.

alexr pushed a change to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


    from ea056c0  Added patch for PicoJSON.
     new 7b81956  Fixed disconnection while sending acknowledgment to IOSwitchboard.
     new 0216002  Fixed broken pipe error in IOSwitchboard.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 src/slave/containerizer/mesos/io/switchboard.cpp | 42 +++++++++++++-----------
 src/slave/http.cpp                               | 14 +++++++-
 2 files changed, 35 insertions(+), 21 deletions(-)


[mesos] 01/02: Fixed disconnection while sending acknowledgment to IOSwitchboard.

Posted by al...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

alexr pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 7b8195680104c2c5f61073a956f60ac961c37f45
Author: Andrei Budnik <ab...@mesosphere.com>
AuthorDate: Fri Sep 21 14:51:24 2018 +0200

    Fixed disconnection while sending acknowledgment to IOSwitchboard.
    
    Previously, an HTTP connection to the IOSwitchboard could be garbage
    collected before the agent sent an acknowledgment to the IOSwitchboard
    via this connection. This patch fixes the issue by keeping a reference
    count to the connection in a lambda callback until disconnection
    occurs.
    
    Review: https://reviews.apache.org/r/68768/
    (cherry picked from commit bfa2bd24780b5c49467b3c23260855e3d8b4c948)
---
 src/slave/http.cpp | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/slave/http.cpp b/src/slave/http.cpp
index 1c98cea..677958f 100644
--- a/src/slave/http.cpp
+++ b/src/slave/http.cpp
@@ -3238,7 +3238,12 @@ Future<Response> Http::_attachContainerInput(
               // so that the IOSwitchboard process can terminate itself. This is
               // a workaround for the problem with dropping outstanding HTTP
               // responses due to a lack of graceful shutdown in libprocess.
-              acknowledgeContainerInputResponse(containerId);
+              acknowledgeContainerInputResponse(containerId)
+                .onFailed([containerId](const string& failure) {
+                  LOG(ERROR) << "Failed to send an acknowledgment to the"
+                             << " IOSwitchboard for container '"
+                             << containerId << "': " << failure;
+              });
             }));
     }));
 }
@@ -3254,6 +3259,13 @@ Future<Response> Http::acknowledgeContainerInputResponse(
       request.url.domain = "";
       request.url.path = "/acknowledge_container_input_response";
 
+      // This is a non Keep-Alive request which means the connection
+      // will be closed when the response is received. Since the
+      // 'Connection' is reference-counted, we must maintain a copy
+      // until the disconnection occurs.
+      connection.disconnected()
+        .onAny([connection]() {});
+
       return connection.send(request);
     });
 }


[mesos] 02/02: Fixed broken pipe error in IOSwitchboard.

Posted by al...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

alexr pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 0216002744517a6053fd782b6b4dc3d6cf77dd5e
Author: Andrei Budnik <ab...@mesosphere.com>
AuthorDate: Fri Sep 21 14:51:59 2018 +0200

    Fixed broken pipe error in IOSwitchboard.
    
    Previous attempt to fix `HTTP 500` "broken pipe" in review /r/62187/
    was not correct: after IOSwitchboard sends a response to the agent for
    the `ATTACH_CONTAINER_INPUT` call, the socket is closed immediately,
    thus causing the error on the agent. This patch adds a delay after
    IO redirects are finished and before IOSwitchboard forcibly send a
    response.
    
    Review: https://reviews.apache.org/r/68784/
    (cherry picked from commit c3c77cbef818d497d8bd5e67fa72e55a7190e27a)
---
 src/slave/containerizer/mesos/io/switchboard.cpp | 42 +++++++++++++-----------
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/src/slave/containerizer/mesos/io/switchboard.cpp b/src/slave/containerizer/mesos/io/switchboard.cpp
index 91c14b1..df33b0d 100644
--- a/src/slave/containerizer/mesos/io/switchboard.cpp
+++ b/src/slave/containerizer/mesos/io/switchboard.cpp
@@ -1601,16 +1601,9 @@ IOSwitchboardServerProcess::acknowledgeContainerInputResponse()
   if (--numPendingAcknowledgments == 0) {
     // If IO redirects are finished or writing to `stdin` failed we want to
     // terminate ourselves (after flushing any outstanding messages from our
-    // message queue). Since IOSwitchboard might receive an acknowledgment for
-    // the `ATTACH_CONTAINER_INPUT` request before reading a final message from
-    // the corresponding connection, we need to delay our termination to give
-    // IOSwitchboard a chance to read the final message. Otherwise, the agent
-    // might get `HTTP 500` "broken pipe" while attempting to write the final
-    // message.
+    // message queue).
     if (!redirectFinished.future().isPending() || failure.isSome()) {
-      after(Seconds(1)).onAny(defer(self(), [=](const Future<Nothing>&) {
-        terminate(self(), false);
-      }));
+      terminate(self(), false);
     }
   }
   return http::OK();
@@ -1732,20 +1725,29 @@ Future<http::Response> IOSwitchboardServerProcess::attachContainerInput(
   // the read loop finishes or IO redirects finish. Once this promise is set,
   // we return a final response to the client.
   //
-  // TODO(abudnik): Ideally, we would have used `process::select()` to capture a
-  // transition into a terminal state for any of `{readLoop, redirectFinished}`.
-  // However, `select()` currently does not capture a future that has failed.
-  // Another alternative would be to allow `promise::associate()` to accept
-  // multiple source futures.
+  // We use `defer(self(), ...)` to use this process as a synchronization point
+  // when changing state of the promise.
   Owned<Promise<http::Response>> promise(new Promise<http::Response>());
 
-  auto setPromise = [promise](const Future<http::Response>& response) {
-    promise->set(response);
-  };
-
-  readLoop.onAny(defer(self(), setPromise));
+  readLoop.onAny(
+      defer(self(), [promise](const Future<http::Response>& response) {
+        promise->set(response);
+      }));
 
-  redirectFinished.future().onAny(defer(self(), setPromise));
+  // Since IOSwitchboard might receive an acknowledgment for the
+  // `ATTACH_CONTAINER_INPUT` request before reading a final message from
+  // the corresponding connection, we need to give IOSwitchboard a chance to
+  // read the final message. Otherwise, the agent might get `HTTP 500`
+  // "broken pipe" while attempting to write the final message.
+  redirectFinished.future().onAny(
+      defer(self(), [=](const Future<http::Response>& response) {
+        // TODO(abudnik): Ideally, we would have used `process::delay()` to
+        // delay a dispatch of the lambda to this process.
+        after(Seconds(1))
+          .onAny(defer(self(), [promise, response](const Future<Nothing>&) {
+            promise->set(response);
+          }));
+      }));
 
   // We explicitly specify the return type to avoid a type deduction
   // issue in some versions of clang. See MESOS-2943.