You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mesos.apache.org by jo...@apache.org on 2018/09/05 23:54:11 UTC

[mesos] branch 1.7.x updated (09ee686 -> 7000e72)

This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a change to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


    from 09ee686  Added MESOS-8568 to the 1.7.0 CHANGELOG.
     new f7416e7  Enabled garbage collection of terminated tasks' metadata.
     new a90e9da  Shared GC actor between Agent and MesosContainerizer.
     new 831598f  Added flag to control GC-ing of nested container sandboxes.
     new 4c44501  Modified MesosContainerizer to GC nested container sandboxes.
     new c03df89  Added tests for task metadata GC using the default executor.
     new 7000e72  Added MESOS-7947 to the 1.7.0 CHANGELOG.

The 6 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG                                          |   6 +
 docs/configuration/agent.md                        |  13 +
 docs/operator-http-api.md                          |  11 +
 docs/sandbox.md                                    |   3 +
 docs/upgrades.md                                   |   9 +
 src/local/local.cpp                                |   1 +
 src/slave/containerizer/containerizer.cpp          |   4 +-
 src/slave/containerizer/containerizer.hpp          |   3 +
 src/slave/containerizer/mesos/containerizer.cpp    | 119 +++++-
 src/slave/containerizer/mesos/containerizer.hpp    |  14 +
 src/slave/flags.cpp                                |   9 +
 src/slave/flags.hpp                                |   1 +
 src/slave/http.cpp                                 |   1 +
 src/slave/main.cpp                                 |   8 +-
 src/slave/slave.cpp                                |  13 +
 src/tests/cluster.cpp                              |  13 +-
 src/tests/containerizer/cni_isolator_tests.cpp     |   1 +
 .../containerizer/docker_volume_isolator_tests.cpp |   1 +
 .../environment_secret_isolator_tests.cpp          |   6 +-
 .../containerizer/mesos_containerizer_tests.cpp    |   7 +
 .../containerizer/volume_secret_isolator_tests.cpp |   1 +
 src/tests/gc_tests.cpp                             | 463 +++++++++++++++++++++
 src/tests/slave_recovery_tests.cpp                 |  13 +-
 23 files changed, 681 insertions(+), 39 deletions(-)


[mesos] 04/06: Modified MesosContainerizer to GC nested container sandboxes.

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 4c44501e881dcc3b41128cedd080e91531cd9761
Author: Joseph Wu <jo...@apache.org>
AuthorDate: Fri Jul 27 11:55:35 2018 -0700

    Modified MesosContainerizer to GC nested container sandboxes.
    
    When the --gc_non_executor_container_sandboxes agent flag is enabled,
    this commit changes the MesosContainerizer to schedule nested container
    sandboxes for garbage collection.  The GC policy is the same between
    the MesosContainerizer and the Agent.
    
    Review: https://reviews.apache.org/r/68095
---
 src/slave/containerizer/mesos/containerizer.cpp | 114 +++++++++++++++++++-----
 src/slave/containerizer/mesos/containerizer.hpp |   4 +
 2 files changed, 96 insertions(+), 22 deletions(-)

diff --git a/src/slave/containerizer/mesos/containerizer.cpp b/src/slave/containerizer/mesos/containerizer.cpp
index 9e9f1d3..a89474b 100644
--- a/src/slave/containerizer/mesos/containerizer.cpp
+++ b/src/slave/containerizer/mesos/containerizer.cpp
@@ -31,6 +31,7 @@
 #include <process/owned.hpp>
 #include <process/reap.hpp>
 #include <process/subprocess.hpp>
+#include <process/time.hpp>
 
 #include <process/metrics/metrics.hpp>
 
@@ -130,6 +131,7 @@ using process::Future;
 using process::Owned;
 using process::Shared;
 using process::Subprocess;
+using process::Time;
 
 using process::http::Connection;
 
@@ -870,6 +872,27 @@ Future<Nothing> MesosContainerizerProcess::recover(
       continue;
     }
 
+    // Determine the sandbox if this is a nested or standalone container.
+    const bool isStandaloneContainer =
+      containerizer::paths::isStandaloneContainer(
+          flags.runtime_dir, containerId);
+
+    const ContainerID& rootContainerId =
+      protobuf::getRootContainerId(containerId);
+
+    Option<string> directory;
+    if (containerId.has_parent()) {
+      CHECK(containers_.contains(rootContainerId));
+
+      if (containers_[rootContainerId]->directory.isSome()) {
+        directory = containerizer::paths::getSandboxPath(
+            containers_[rootContainerId]->directory.get(),
+            containerId);
+      }
+    } else if (isStandaloneContainer) {
+      directory = slave::paths::getContainerPath(flags.work_dir, containerId);
+    }
+
     // Nested containers may have already been destroyed, but we leave
     // their runtime directories around for the lifetime of their
     // top-level container. If they have already been destroyed, we
@@ -880,9 +903,28 @@ Future<Nothing> MesosContainerizerProcess::recover(
         containerizer::paths::TERMINATION_FILE);
 
     if (os::exists(terminationPath)) {
+      CHECK(containerId.has_parent());
+
+      // Schedule the sandbox of the terminated nested container for garbage
+      // collection. Containers that exited while the agent was offline
+      // (i.e. before the termination file was checkpointed) will be GC'd
+      // after recovery.
+      if (flags.gc_non_executor_container_sandboxes &&
+          directory.isSome() &&
+          os::exists(directory.get())) {
+        // TODO(josephw): Should we also GC the runtime directory?
+        // This has the downside of potentially wiping out the exit status
+        // of the container when disk space is low.
+        garbageCollect(directory.get());
+      }
+
       continue;
     }
 
+    // TODO(josephw): Schedule GC for standalone containers.
+    // We currently delete the runtime directory of standalone containers
+    // upon exit, which means there is no record of the sandbox directory to GC.
+
     // Attempt to read the pid from the container runtime directory.
     Result<pid_t> pid =
       containerizer::paths::getContainerPid(flags.runtime_dir, containerId);
@@ -899,27 +941,6 @@ Future<Nothing> MesosContainerizerProcess::recover(
       return Failure("Failed to get container config: " + config.error());
     }
 
-    // Determine the sandbox if this is a nested or standalone container.
-    const bool isStandaloneContainer =
-      containerizer::paths::isStandaloneContainer(
-          flags.runtime_dir, containerId);
-
-    const ContainerID& rootContainerId =
-      protobuf::getRootContainerId(containerId);
-
-    Option<string> directory;
-    if (containerId.has_parent()) {
-      CHECK(containers_.contains(rootContainerId));
-
-      if (containers_[rootContainerId]->directory.isSome()) {
-        directory = containerizer::paths::getSandboxPath(
-            containers_[rootContainerId]->directory.get(),
-            containerId);
-      }
-    } else if (isStandaloneContainer) {
-      directory = slave::paths::getContainerPath(flags.work_dir, containerId);
-    }
-
     Owned<Container> container(new Container());
     container->state = RUNNING;
     container->pid = pid.isSome() ? pid.get() : Option<pid_t>();
@@ -2694,7 +2715,7 @@ void MesosContainerizerProcess::______destroy(
   // its runtime directory. There are two cases to consider:
   //
   // (1) We are a nested container:
-  //     In this case we should defer deletion of the runtime directory
+  //     * In this case we should defer deletion of the runtime directory
   //     until the top-level container is destroyed. Instead, we
   //     checkpoint a file with the termination state indicating that
   //     the container has already been destroyed. This allows
@@ -2702,6 +2723,8 @@ void MesosContainerizerProcess::______destroy(
   //     termination state until the top-level container is destroyed.
   //     It also prevents subsequent `destroy()` calls from attempting
   //     to cleanup the container a second time.
+  //     * We also schedule the nested container's sandbox directory for
+  //     garbage collection, if this behavior is enabled.
   //
   // (2) We are a top-level container:
   //     We should simply remove the runtime directory. Since we build
@@ -2730,6 +2753,19 @@ void MesosContainerizerProcess::______destroy(
       LOG(ERROR) << "Failed to checkpoint nested container's termination state"
                  << " to '" << terminationPath << "': " << checkpointed.error();
     }
+
+    // Schedule the sandbox of the nested container for garbage collection.
+    if (flags.gc_non_executor_container_sandboxes) {
+      const ContainerID rootContainerId =
+        protobuf::getRootContainerId(containerId);
+
+      CHECK(containers_.contains(rootContainerId));
+
+      const string sandboxPath = containerizer::paths::getSandboxPath(
+        containers_[rootContainerId]->directory.get(), containerId);
+
+      garbageCollect(sandboxPath);
+    }
   } else if (os::exists(runtimePath)) {
     Try<Nothing> rmdir = os::rmdir(runtimePath);
     if (rmdir.isError()) {
@@ -2751,6 +2787,33 @@ void MesosContainerizerProcess::______destroy(
 }
 
 
+Future<Nothing> MesosContainerizerProcess::garbageCollect(const string& path)
+{
+  // Some tests do not pass the GC actor into the containerizer for
+  // convenience of test construction. Those tests should not exercise
+  // this code path.
+  CHECK_NOTNULL(gc);
+
+  Try<long> mtime = os::stat::mtime(path);
+  if (mtime.isError()) {
+    LOG(ERROR) << "Failed to find the mtime of '" << path
+               << "': " << mtime.error();
+    return Failure(mtime.error());
+  }
+
+  // It is unsafe for testing to use unix time directly, we must use
+  // Time::create to convert into a Time object that reflects the
+  // possibly advanced state of the libprocess Clock.
+  Try<Time> time = Time::create(mtime.get());
+  CHECK_SOME(time);
+
+  // GC based on the modification time.
+  Duration delay = flags.gc_delay - (Clock::now() - time.get());
+
+  return gc->schedule(delay, path);
+}
+
+
 Future<bool> MesosContainerizerProcess::kill(
     const ContainerID& containerId,
     int signal)
@@ -2818,6 +2881,13 @@ Future<Nothing> MesosContainerizerProcess::remove(
       containers_[rootContainerId]->directory.get(), containerId);
 
   if (os::exists(sandboxPath)) {
+    // Unschedule the nested container sandbox from garbage collection
+    // to prevent potential double-deletion in future.
+    if (flags.gc_non_executor_container_sandboxes) {
+      CHECK_NOTNULL(gc);
+      gc->unschedule(sandboxPath);
+    }
+
     Try<Nothing> rmdir = os::rmdir(sandboxPath);
     if (rmdir.isError()) {
       return Failure(
diff --git a/src/slave/containerizer/mesos/containerizer.hpp b/src/slave/containerizer/mesos/containerizer.hpp
index bccf8af..3102b87 100644
--- a/src/slave/containerizer/mesos/containerizer.hpp
+++ b/src/slave/containerizer/mesos/containerizer.hpp
@@ -282,6 +282,10 @@ private:
       const Option<mesos::slave::ContainerTermination>& termination,
       const process::Future<bool>& destroy);
 
+  // Schedules a path for garbage collection based on its modification time.
+  // Equivalent to the `Slave::garbageCollect` method.
+  process::Future<Nothing> garbageCollect(const std::string& path);
+
   // Call back for when an isolator limits a container and impacts the
   // processes. This will trigger container destruction.
   void limited(


[mesos] 03/06: Added flag to control GC-ing of nested container sandboxes.

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 831598f9a25c0c959d4fa799b37cf391b7ce00b0
Author: Joseph Wu <jo...@apache.org>
AuthorDate: Mon Jul 23 12:25:26 2018 -0700

    Added flag to control GC-ing of nested container sandboxes.
    
    This adds an agent flag which enables garbage collection of finished
    nested container sandboxes.
    
    This also updates some documentation and recommends enabling this flag
    when the user uses the same default executor to launch multiple tasks.
    
    Review: https://reviews.apache.org/r/68067
---
 CHANGELOG                   |  5 +++++
 docs/configuration/agent.md | 13 +++++++++++++
 docs/operator-http-api.md   | 11 +++++++++++
 docs/sandbox.md             |  3 +++
 docs/upgrades.md            |  9 +++++++++
 src/slave/flags.cpp         |  9 +++++++++
 src/slave/flags.hpp         |  1 +
 src/slave/http.cpp          |  1 +
 8 files changed, 52 insertions(+)

diff --git a/CHANGELOG b/CHANGELOG
index 3b55352..bd3d894 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -36,6 +36,11 @@ This release contains the following highlights:
       along with other cgroups related options
       (e.g., `cgroups/cpu`), those options will be just ignored.
 
+    * [MESOS-7947] - Added a new `--gc_non_executor_container_sandboxes`
+      option which tells the agent to garbage collect sandboxes created
+      via the LAUNCH_NESTED_CONTAINER API. The same flag will apply to
+      standalone container sandboxes in future.
+
     * [MESOS-8327] - Added container-specific cgroups mounts under
       `/sys/fs/cgroup` to containers with image launched by Mesos
       containerizer.
diff --git a/docs/configuration/agent.md b/docs/configuration/agent.md
index 0f0f910..4d209e6 100644
--- a/docs/configuration/agent.md
+++ b/docs/configuration/agent.md
@@ -918,6 +918,19 @@ be a value between 0.0 and 1.0 (default: 0.1)
   </td>
 </tr>
 
+<tr id="gc_non_executor_container_sandboxes">
+  <td>
+    --[no-]gc_non_executor_container_sandboxes
+  </td>
+  <td>
+Determines whether nested container sandboxes created via the
+<code>LAUNCH_CONTAINER</code> and <code>LAUNCH_NESTED_CONTAINER</code> APIs will be
+automatically garbage collected by the agent upon termination.
+The <code>REMOVE_(NESTED_)CONTAINER</code> API is unaffected by this flag
+and can still be used. (default: false).
+  </td>
+</tr>
+
 <tr id="hadoop_home">
   <td>
     --hadoop_home=VALUE
diff --git a/docs/operator-http-api.md b/docs/operator-http-api.md
index a571683..6edf361 100644
--- a/docs/operator-http-api.md
+++ b/docs/operator-http-api.md
@@ -2953,6 +2953,10 @@ Content-Type: application/json
         "value": "0.1"
       },
       {
+        "name": "gc_non_executor_container_sandboxes",
+        "value": "false"
+      },
+      {
         "name": "help",
         "value": "false"
       },
@@ -3588,6 +3592,13 @@ This call launches a nested container. Any authorized entity,
 including the executor itself, its tasks, or the operator can use this
 API to launch a nested container.
 
+**NOTE**: Successful invocation of this API will result in some metadata
+tracked by the agent and the creation of a container sandbox under the
+parent container. The REMOVE_NESTED_CONTAINER should be used to remove
+the metadata and sandbox. If the `--gc_non_executor_container_sandboxes`
+agent flag is enabled, the agent will garbage collect the sandboxes of
+any containers launched via this API.
+
 ```
 LAUNCH_NESTED_CONTAINER HTTP Request (JSON):
 
diff --git a/docs/sandbox.md b/docs/sandbox.md
index 3b44112..c536443 100644
--- a/docs/sandbox.md
+++ b/docs/sandbox.md
@@ -157,6 +157,9 @@ Sandbox files are scheduled for garbage collection when:
 * An executor is removed or terminated.
 * A framework is removed.
 * An executor is recovered unsuccessfully during agent recovery.
+* If the `--gc_non_executor_container_sandboxes` agent flag is enabled,
+  nested container sandboxes will also be garbage collected when the
+  container exits.
 
 **NOTE:** During agent recovery, all of the executor's runs, except for the
 latest run, are scheduled for garbage collection as well.
diff --git a/docs/upgrades.md b/docs/upgrades.md
index dc61f9b..1049384 100644
--- a/docs/upgrades.md
+++ b/docs/upgrades.md
@@ -60,6 +60,7 @@ We categorize the changes as follows:
   <td style="word-wrap: break-word; overflow-wrap: break-word;"><!--Flags-->
     <ul style="padding-left:10px;">
       <li>A <a href="#1-7-x-enforce-container-ports">enforce_container_ports</a></li>
+      <li>A <a href="#1-7-x-gc-non-executor-container-sandboxes">gc_non_executor_container_sandboxes</a></li>
     </ul>
   </td>
 
@@ -471,6 +472,14 @@ We categorize the changes as follows:
 
 * A new [`--enforce_container_ports`](configuration/agent.md#enforce_container_ports) flag has been added to toggle whether the [`network/ports`](isolators/network-ports.md) isolator should enforce TCP ports usage limits.
 
+<a name="1-7-x-gc-non-executor-container-sandboxes"></a>
+
+* A new [`--gc_non_executor_container_sandboxes`](configuration/agent.md#gc_non_executor_container_sandboxes)
+  agent flag has been added to garbage collect the sandboxes of nested
+  containers, which includes the tasks groups launched by the default executor.
+  We recommend enabling the flag if you have frameworks that launch multiple
+  task groups on the same default executor instance.
+
 <a name="1-7-x-container-logger"></a>
 
 * `ContainerLogger` module interface has been changed. The `prepare()` method now takes `ContainerID` and `ContainerConfig` instead.
diff --git a/src/slave/flags.cpp b/src/slave/flags.cpp
index 872e412..b87075a 100644
--- a/src/slave/flags.cpp
+++ b/src/slave/flags.cpp
@@ -479,6 +479,15 @@ mesos::internal::slave::Flags::Flags()
       "be a value between 0.0 and 1.0",
       GC_DISK_HEADROOM);
 
+  add(&Flags::gc_non_executor_container_sandboxes,
+      "gc_non_executor_container_sandboxes",
+      "Determines whether nested container sandboxes created via the\n"
+      "LAUNCH_CONTAINER and LAUNCH_NESTED_CONTAINER APIs will be\n"
+      "automatically garbage collected by the agent upon termination.\n"
+      "The REMOVE_(NESTED_)CONTAINER API is unaffected by this flag\n"
+      "and can still be used.",
+      false);
+
   add(&Flags::disk_watch_interval,
       "disk_watch_interval",
       "Periodic time interval (e.g., 10secs, 2mins, etc)\n"
diff --git a/src/slave/flags.hpp b/src/slave/flags.hpp
index bff194f..bbf1bf6 100644
--- a/src/slave/flags.hpp
+++ b/src/slave/flags.hpp
@@ -91,6 +91,7 @@ public:
 #endif // USE_SSL_SOCKET
   Duration gc_delay;
   double gc_disk_headroom;
+  bool gc_non_executor_container_sandboxes;
   Duration disk_watch_interval;
 
   Option<std::string> container_logger;
diff --git a/src/slave/http.cpp b/src/slave/http.cpp
index 0d253f0..000c067 100644
--- a/src/slave/http.cpp
+++ b/src/slave/http.cpp
@@ -1206,6 +1206,7 @@ string Http::STATE_HELP() {
         "         \"containerizers\" : \"mesos\",",
         "         \"docker_socket\" : \"/var/run/docker.sock\",",
         "         \"gc_delay\" : \"1weeks\",",
+        "         \"gc_non_executor_container_sandboxes\" : \"false\",",
         "         \"docker_remove_delay\" : \"6hrs\",",
         "         \"port\" : \"5051\",",
         "         \"systemd_runtime_directory\" : \"/run/systemd/system\",",


[mesos] 02/06: Shared GC actor between Agent and MesosContainerizer.

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit a90e9da72f89489b0d04db0e974e93bc95175fc9
Author: Joseph Wu <jo...@apache.org>
AuthorDate: Tue Jul 24 14:57:33 2018 -0700

    Shared GC actor between Agent and MesosContainerizer.
    
    This inserts the pointer to the GC actor into the MesosContainerizer,
    so the containerizer can schedule sandboxes it manages for GC according
    to the agent's GC policy.  The Composing/Docker Containerizers are
    unchanged as they do not directly deal with nested or standalone
    containers.
    
    To minimize test changes, the GC actor argument defaults to nullptr.
    Only tests that exercise sandbox GC will need to supply the actor.
    
    Review: https://reviews.apache.org/r/68066
---
 src/local/local.cpp                                         |  1 +
 src/slave/containerizer/containerizer.cpp                   |  4 +++-
 src/slave/containerizer/containerizer.hpp                   |  3 +++
 src/slave/containerizer/mesos/containerizer.cpp             |  5 +++++
 src/slave/containerizer/mesos/containerizer.hpp             | 10 ++++++++++
 src/slave/main.cpp                                          |  8 ++++----
 src/tests/cluster.cpp                                       | 13 +++++++------
 src/tests/containerizer/cni_isolator_tests.cpp              |  1 +
 src/tests/containerizer/docker_volume_isolator_tests.cpp    |  1 +
 .../containerizer/environment_secret_isolator_tests.cpp     |  6 ++++--
 src/tests/containerizer/mesos_containerizer_tests.cpp       |  7 +++++++
 src/tests/containerizer/volume_secret_isolator_tests.cpp    |  1 +
 12 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/src/local/local.cpp b/src/local/local.cpp
index 5b7bb59..6087068 100644
--- a/src/local/local.cpp
+++ b/src/local/local.cpp
@@ -508,6 +508,7 @@ PID<Master> launch(const Flags& flags, Allocator* _allocator)
         slaveFlags,
         true,
         fetchers->back(),
+        garbageCollectors->back(),
         secretResolver.get());
 
     if (containerizer.isError()) {
diff --git a/src/slave/containerizer/containerizer.cpp b/src/slave/containerizer/containerizer.cpp
index 59f107a..c6b5e64 100644
--- a/src/slave/containerizer/containerizer.cpp
+++ b/src/slave/containerizer/containerizer.cpp
@@ -34,6 +34,7 @@
 #include "hook/manager.hpp"
 
 #include "slave/flags.hpp"
+#include "slave/gc.hpp"
 #include "slave/slave.hpp"
 
 #include "slave/containerizer/composing.hpp"
@@ -216,6 +217,7 @@ Try<Containerizer*> Containerizer::create(
     const Flags& flags,
     bool local,
     Fetcher* fetcher,
+    GarbageCollector* gc,
     SecretResolver* secretResolver)
 {
   // Get the set of containerizer types.
@@ -286,7 +288,7 @@ Try<Containerizer*> Containerizer::create(
   foreach (const string& type, containerizerTypes) {
     if (type == "mesos") {
       Try<MesosContainerizer*> containerizer = MesosContainerizer::create(
-          flags, local, fetcher, secretResolver, nvidia);
+          flags, local, fetcher, gc, secretResolver, nvidia);
       if (containerizer.isError()) {
         return Error("Could not create MesosContainerizer: " +
                      containerizer.error());
diff --git a/src/slave/containerizer/containerizer.hpp b/src/slave/containerizer/containerizer.hpp
index 675dfad..66f73a3 100644
--- a/src/slave/containerizer/containerizer.hpp
+++ b/src/slave/containerizer/containerizer.hpp
@@ -36,6 +36,8 @@
 #include <stout/option.hpp>
 #include <stout/try.hpp>
 
+#include "slave/gc.hpp"
+
 #include "slave/containerizer/fetcher.hpp"
 
 namespace mesos {
@@ -69,6 +71,7 @@ public:
       const Flags& flags,
       bool local,
       Fetcher* fetcher,
+      GarbageCollector* gc,
       SecretResolver* secretResolver = nullptr);
 
   // Determine slave resources from flags, probing the system or
diff --git a/src/slave/containerizer/mesos/containerizer.cpp b/src/slave/containerizer/mesos/containerizer.cpp
index c38bfac..9e9f1d3 100644
--- a/src/slave/containerizer/mesos/containerizer.cpp
+++ b/src/slave/containerizer/mesos/containerizer.cpp
@@ -56,6 +56,7 @@
 
 #include "module/manager.hpp"
 
+#include "slave/gc.hpp"
 #include "slave/paths.hpp"
 #include "slave/slave.hpp"
 
@@ -162,6 +163,7 @@ Try<MesosContainerizer*> MesosContainerizer::create(
     const Flags& flags,
     bool local,
     Fetcher* fetcher,
+    GarbageCollector* gc,
     SecretResolver* secretResolver,
     const Option<NvidiaComponents>& nvidia)
 {
@@ -541,6 +543,7 @@ Try<MesosContainerizer*> MesosContainerizer::create(
       flags,
       local,
       fetcher,
+      gc,
       Owned<Launcher>(launcher.get()),
       provisioner,
       isolators);
@@ -551,6 +554,7 @@ Try<MesosContainerizer*> MesosContainerizer::create(
     const Flags& flags,
     bool local,
     Fetcher* fetcher,
+    GarbageCollector* gc,
     const Owned<Launcher>& launcher,
     const Shared<Provisioner>& provisioner,
     const vector<Owned<Isolator>>& isolators)
@@ -575,6 +579,7 @@ Try<MesosContainerizer*> MesosContainerizer::create(
       new MesosContainerizerProcess(
           flags,
           fetcher,
+          gc,
           ioSwitchboard.get(),
           launcher,
           provisioner,
diff --git a/src/slave/containerizer/mesos/containerizer.hpp b/src/slave/containerizer/mesos/containerizer.hpp
index 7711d46..bccf8af 100644
--- a/src/slave/containerizer/mesos/containerizer.hpp
+++ b/src/slave/containerizer/mesos/containerizer.hpp
@@ -34,6 +34,7 @@
 #include <stout/multihashmap.hpp>
 #include <stout/os/int_fd.hpp>
 
+#include "slave/gc.hpp"
 #include "slave/state.hpp"
 
 #include "slave/containerizer/containerizer.hpp"
@@ -68,6 +69,7 @@ public:
       const Flags& flags,
       bool local,
       Fetcher* fetcher,
+      GarbageCollector* gc = nullptr,
       SecretResolver* secretResolver = nullptr,
       const Option<NvidiaComponents>& nvidia = None());
 
@@ -75,6 +77,7 @@ public:
       const Flags& flags,
       bool local,
       Fetcher* fetcher,
+      GarbageCollector* gc,
       const process::Owned<Launcher>& launcher,
       const process::Shared<Provisioner>& provisioner,
       const std::vector<process::Owned<mesos::slave::Isolator>>& isolators);
@@ -135,6 +138,7 @@ public:
   MesosContainerizerProcess(
       const Flags& _flags,
       Fetcher* _fetcher,
+      GarbageCollector* _gc,
       IOSwitchboard* _ioSwitchboard,
       const process::Owned<Launcher>& _launcher,
       const process::Shared<Provisioner>& _provisioner,
@@ -142,6 +146,7 @@ public:
     : ProcessBase(process::ID::generate("mesos-containerizer")),
       flags(_flags),
       fetcher(_fetcher),
+      gc(_gc),
       ioSwitchboard(_ioSwitchboard),
       launcher(_launcher),
       provisioner(_provisioner),
@@ -299,6 +304,11 @@ private:
 
   const Flags flags;
   Fetcher* fetcher;
+
+  // NOTE: This actor may be nullptr in tests, as not all tests need to
+  // share this actor with the agent.
+  GarbageCollector* gc;
+
   IOSwitchboard* ioSwitchboard;
   const process::Owned<Launcher> launcher;
   const process::Shared<Provisioner> provisioner;
diff --git a/src/slave/main.cpp b/src/slave/main.cpp
index 55615e7..e774092 100644
--- a/src/slave/main.cpp
+++ b/src/slave/main.cpp
@@ -465,6 +465,7 @@ int main(int argc, char** argv)
 #endif // __linux__
 
   Fetcher* fetcher = new Fetcher(flags);
+  GarbageCollector* gc = new GarbageCollector(flags.work_dir);
 
   // Initialize SecretResolver.
   Try<SecretResolver*> secretResolver =
@@ -476,7 +477,7 @@ int main(int argc, char** argv)
   }
 
   Try<Containerizer*> containerizer =
-    Containerizer::create(flags, false, fetcher, secretResolver.get());
+    Containerizer::create(flags, false, fetcher, gc, secretResolver.get());
 
   if (containerizer.isError()) {
     EXIT(EXIT_FAILURE)
@@ -530,7 +531,6 @@ int main(int argc, char** argv)
   }
 
   Files* files = new Files(READONLY_HTTP_AUTHENTICATION_REALM, authorizer_);
-  GarbageCollector* gc = new GarbageCollector(flags.work_dir);
   TaskStatusUpdateManager* taskStatusUpdateManager =
     new TaskStatusUpdateManager(flags);
 
@@ -605,8 +605,6 @@ int main(int argc, char** argv)
 
   delete taskStatusUpdateManager;
 
-  delete gc;
-
   delete files;
 
   if (authorizer_.isSome()) {
@@ -617,6 +615,8 @@ int main(int argc, char** argv)
 
   delete containerizer.get();
 
+  delete gc;
+
   delete fetcher;
 
   // NOTE: We need to finalize libprocess, on Windows especially,
diff --git a/src/tests/cluster.cpp b/src/tests/cluster.cpp
index cb7d3f0..a7226c7 100644
--- a/src/tests/cluster.cpp
+++ b/src/tests/cluster.cpp
@@ -421,6 +421,11 @@ Try<process::Owned<Slave>> Slave::create(
   slave->flags = flags;
   slave->detector = detector;
 
+  // If the garbage collector is not provided, create a default one.
+  if (gc.isNone()) {
+    slave->gc.reset(new slave::GarbageCollector(flags.work_dir));
+  }
+
   // If the containerizer is not provided, create a default one.
   if (containerizer.isSome()) {
     slave->containerizer = containerizer.get();
@@ -429,7 +434,8 @@ Try<process::Owned<Slave>> Slave::create(
     slave->fetcher.reset(new slave::Fetcher(flags));
 
     Try<slave::Containerizer*> _containerizer =
-      slave::Containerizer::create(flags, true, slave->fetcher.get());
+      slave::Containerizer::create(
+          flags, true, slave->fetcher.get(), gc.getOrElse(slave->gc.get()));
 
     if (_containerizer.isError()) {
       return Error("Failed to create containerizer: " + _containerizer.error());
@@ -503,11 +509,6 @@ Try<process::Owned<Slave>> Slave::create(
     slave->setAuthorizationCallbacks(providedAuthorizer.get());
   }
 
-  // If the garbage collector is not provided, create a default one.
-  if (gc.isNone()) {
-    slave->gc.reset(new slave::GarbageCollector(flags.work_dir));
-  }
-
   // If the resource estimator is not provided, create a default one.
   if (resourceEstimator.isNone()) {
     Try<mesos::slave::ResourceEstimator*> _resourceEstimator =
diff --git a/src/tests/containerizer/cni_isolator_tests.cpp b/src/tests/containerizer/cni_isolator_tests.cpp
index 63b109b..45c81e5 100644
--- a/src/tests/containerizer/cni_isolator_tests.cpp
+++ b/src/tests/containerizer/cni_isolator_tests.cpp
@@ -570,6 +570,7 @@ TEST_F(CniIsolatorTest, ROOT_DestroyWhilePreparing)
       flags,
       true,
       &fetcher,
+      nullptr,
       launcher,
       provisioner->share(),
       {Owned<Isolator>(cniIsolator.get()),
diff --git a/src/tests/containerizer/docker_volume_isolator_tests.cpp b/src/tests/containerizer/docker_volume_isolator_tests.cpp
index 553f344..1503290 100644
--- a/src/tests/containerizer/docker_volume_isolator_tests.cpp
+++ b/src/tests/containerizer/docker_volume_isolator_tests.cpp
@@ -202,6 +202,7 @@ protected:
         flags,
         true,
         fetcher.get(),
+        nullptr,
         std::move(launcher),
         provisioner->share(),
         {std::move(linuxIsolator),
diff --git a/src/tests/containerizer/environment_secret_isolator_tests.cpp b/src/tests/containerizer/environment_secret_isolator_tests.cpp
index 7a516e5..350700a 100644
--- a/src/tests/containerizer/environment_secret_isolator_tests.cpp
+++ b/src/tests/containerizer/environment_secret_isolator_tests.cpp
@@ -59,7 +59,8 @@ TEST_F(EnvironmentSecretIsolatorTest, ResolveSecret)
   EXPECT_SOME(secretResolver);
 
   Try<MesosContainerizer*> containerizer =
-    MesosContainerizer::create(flags, false, &fetcher, secretResolver.get());
+    MesosContainerizer::create(
+        flags, false, &fetcher, nullptr, secretResolver.get());
   EXPECT_SOME(containerizer);
 
   Owned<MasterDetector> detector = master.get()->createDetector();
@@ -144,7 +145,8 @@ TEST_F(EnvironmentSecretIsolatorTest, ResolveSecretDefaultExecutor)
   EXPECT_SOME(secretResolver);
 
   Try<MesosContainerizer*> containerizer =
-    MesosContainerizer::create(flags, true, &fetcher, secretResolver.get());
+    MesosContainerizer::create(
+        flags, true, &fetcher, nullptr, secretResolver.get());
   EXPECT_SOME(containerizer);
 
   Owned<MasterDetector> detector = master.get()->createDetector();
diff --git a/src/tests/containerizer/mesos_containerizer_tests.cpp b/src/tests/containerizer/mesos_containerizer_tests.cpp
index 1b8e53c..449928c 100644
--- a/src/tests/containerizer/mesos_containerizer_tests.cpp
+++ b/src/tests/containerizer/mesos_containerizer_tests.cpp
@@ -340,6 +340,7 @@ public:
         flags,
         true,
         fetcher.get(),
+        nullptr,
         std::move(launcher),
         provisioner->share(),
         isolators);
@@ -726,6 +727,7 @@ TEST_F(MesosContainerizerDestroyTest, DestroyWhileFetching)
       flags,
       true,
       &fetcher,
+      nullptr,
       launcher,
       provisioner->share(),
       vector<Owned<Isolator>>());
@@ -803,6 +805,7 @@ TEST_F(MesosContainerizerDestroyTest, DestroyWhilePreparing)
       flags,
       true,
       &fetcher,
+      nullptr,
       launcher,
       provisioner->share(),
       {Owned<Isolator>(isolator)});
@@ -930,6 +933,7 @@ TEST_F(MesosContainerizerProvisionerTest, ProvisionFailed)
       flags,
       true,
       &fetcher,
+      nullptr,
       launcher,
       Shared<Provisioner>(provisioner),
       vector<Owned<Isolator>>());
@@ -1013,6 +1017,7 @@ TEST_F(MesosContainerizerProvisionerTest, DestroyWhileProvisioning)
       flags,
       true,
       &fetcher,
+      nullptr,
       launcher,
       Shared<Provisioner>(provisioner),
       vector<Owned<Isolator>>());
@@ -1105,6 +1110,7 @@ TEST_F(MesosContainerizerProvisionerTest, IsolatorCleanupBeforePrepare)
       flags,
       true,
       &fetcher,
+      nullptr,
       launcher,
       Shared<Provisioner>(provisioner),
       {Owned<Isolator>(isolator)});
@@ -1194,6 +1200,7 @@ TEST_F(MesosContainerizerDestroyTest, LauncherDestroyFailure)
       flags,
       true,
       &fetcher,
+      nullptr,
       launcher,
       provisioner->share(),
       vector<Owned<Isolator>>());
diff --git a/src/tests/containerizer/volume_secret_isolator_tests.cpp b/src/tests/containerizer/volume_secret_isolator_tests.cpp
index 8b11c6f..b6c43c3 100644
--- a/src/tests/containerizer/volume_secret_isolator_tests.cpp
+++ b/src/tests/containerizer/volume_secret_isolator_tests.cpp
@@ -177,6 +177,7 @@ TEST_P(VolumeSecretIsolatorTest, ROOT_SecretInVolumeWithRootFilesystem)
       flags,
       true,
       &fetcher,
+      nullptr,
       secretResolver.get());
 
   ASSERT_SOME(create);


[mesos] 05/06: Added tests for task metadata GC using the default executor.

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit c03df894d9f58d6b632e9f56fa1bf0a1b6152bf3
Author: Joseph Wu <jo...@apache.org>
AuthorDate: Tue Jul 24 13:59:07 2018 -0700

    Added tests for task metadata GC using the default executor.
    
    This adds two tests that launch a long-lived and short-lived task
    on the same executor.  The tests expect the short-lived task's
    metadata and sandbox to be garbage collected.
    
    One test restarts the agent before GC to ensure recovery will
    schedule completed tasks for GC too.
    
    One existing test is modified due to the extra GC event from
    task metadata.
    
    Review: https://reviews.apache.org/r/68068
---
 src/tests/gc_tests.cpp             | 463 +++++++++++++++++++++++++++++++++++++
 src/tests/slave_recovery_tests.cpp |  13 +-
 2 files changed, 472 insertions(+), 4 deletions(-)

diff --git a/src/tests/gc_tests.cpp b/src/tests/gc_tests.cpp
index 4f288cf..55cba06 100644
--- a/src/tests/gc_tests.cpp
+++ b/src/tests/gc_tests.cpp
@@ -25,6 +25,11 @@
 #include <mesos/resources.hpp>
 #include <mesos/scheduler.hpp>
 
+#include <mesos/v1/executor.hpp>
+#include <mesos/v1/mesos.hpp>
+#include <mesos/v1/scheduler.hpp>
+
+#include <process/collect.hpp>
 #include <process/dispatch.hpp>
 #include <process/future.hpp>
 #include <process/gmock.hpp>
@@ -84,7 +89,9 @@ using std::string;
 using std::vector;
 
 using testing::_;
+using testing::AllOf;
 using testing::AtMost;
+using testing::DoAll;
 using testing::Return;
 using testing::SaveArg;
 
@@ -646,6 +653,462 @@ TEST_F(GarbageCollectorIntegrationTest, ExitedExecutor)
 }
 
 
+// This test verifies that task metadata and sandboxes are scheduled for GC
+// when a task finishes, but the executor is still running.
+TEST_F(GarbageCollectorIntegrationTest, LongLivedDefaultExecutor)
+{
+  Try<Owned<cluster::Master>> master = StartMaster();
+  ASSERT_SOME(master);
+
+  // We need this for the agent's work directory and GC policy.
+  slave::Flags flags = CreateSlaveFlags();
+
+  // Turn on GC of nested container sandboxes by default.
+  flags.gc_non_executor_container_sandboxes = true;
+
+  Owned<MasterDetector> detector = master.get()->createDetector();
+  Try<Owned<cluster::Slave>> slave = StartSlave(detector.get(), flags);
+  ASSERT_SOME(slave);
+
+  auto scheduler = std::make_shared<v1::MockHTTPScheduler>();
+
+  // Enable checkpointing, otherwise there will be no metadata to GC.
+  v1::FrameworkInfo frameworkInfo = v1::DEFAULT_FRAMEWORK_INFO;
+  frameworkInfo.set_checkpoint(true);
+
+  EXPECT_CALL(*scheduler, connected(_))
+    .WillOnce(v1::scheduler::SendSubscribe(frameworkInfo));
+
+  Future<v1::scheduler::Event::Subscribed> subscribed;
+  EXPECT_CALL(*scheduler, subscribed(_, _))
+    .WillOnce(FutureArg<1>(&subscribed));
+
+  Future<v1::scheduler::Event::Offers> offers;
+  EXPECT_CALL(*scheduler, offers(_, _))
+    .WillOnce(FutureArg<1>(&offers))
+    .WillRepeatedly(Return());
+
+  EXPECT_CALL(*scheduler, heartbeat(_))
+    .WillRepeatedly(Return()); // Ignore heartbeats.
+
+  v1::scheduler::TestMesos mesos(
+      master.get()->pid,
+      ContentType::PROTOBUF,
+      scheduler);
+
+  AWAIT_READY(subscribed);
+  v1::FrameworkID frameworkId(subscribed->framework_id());
+
+  AWAIT_READY(offers);
+  ASSERT_FALSE(offers->offers().empty());
+
+  const v1::Offer& offer = offers->offers(0);
+  const v1::AgentID& agentId = offer.agent_id();
+
+  v1::Resources resources =
+    v1::Resources::parse("cpus:0.1;mem:32;disk:32").get();
+
+  v1::ExecutorInfo executorInfo = v1::createExecutorInfo(
+      v1::DEFAULT_EXECUTOR_ID,
+      None(),
+      resources,
+      v1::ExecutorInfo::DEFAULT,
+      frameworkId);
+
+  // We launch two tasks for this test:
+  //   * One will be a long-lived task to keep the executor alive.
+  //   * One will be a short-lived task to exercise task metadata/sandbox GC.
+  v1::TaskInfo longLivedTaskInfo =
+    v1::createTask(agentId, resources, SLEEP_COMMAND(1000));
+
+  v1::TaskInfo shortLivedTaskInfo =
+    v1::createTask(agentId, resources, "exit 0");
+
+  // There should be a total of 5 updates:
+  //   * TASK_STARTING/RUNNING from the long-lived task,
+  //   * TASK_STARTING/RUNNING/FINISHED from the short-lived task.
+  testing::Sequence longTask;
+  Future<v1::scheduler::Event::Update> longStartingUpdate;
+  Future<v1::scheduler::Event::Update> longRunningUpdate;
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(longLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_STARTING))))
+    .InSequence(longTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&longStartingUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(longLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_RUNNING))))
+    .InSequence(longTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&longRunningUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  testing::Sequence shortTask;
+  Future<v1::scheduler::Event::Update> shortStartingUpdate;
+  Future<v1::scheduler::Event::Update> shortRunningUpdate;
+  Future<v1::scheduler::Event::Update> shortFinishedUpdate;
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(shortLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_STARTING))))
+    .InSequence(shortTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&shortStartingUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(shortLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_RUNNING))))
+    .InSequence(shortTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&shortRunningUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(shortLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_FINISHED))))
+    .InSequence(shortTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&shortFinishedUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  // There should be two directories scheduled for GC:
+  // the short-lived task's metadata and sandbox.
+  vector<Future<Nothing>> schedules = {
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule)
+  };
+
+  mesos.send(
+      v1::createCallAccept(
+          frameworkId,
+          offer,
+          {
+            v1::LAUNCH_GROUP(
+                executorInfo, v1::createTaskGroupInfo({longLivedTaskInfo})),
+            v1::LAUNCH_GROUP(
+                executorInfo, v1::createTaskGroupInfo({shortLivedTaskInfo}))
+          }));
+
+  AWAIT_READY(collect(schedules));
+  AWAIT_READY(longStartingUpdate);
+  AWAIT_READY(longRunningUpdate);
+  AWAIT_READY(shortStartingUpdate);
+  AWAIT_READY(shortRunningUpdate);
+  AWAIT_READY(shortFinishedUpdate);
+
+  // Check that the short-lived task's metadata and sandbox exist.
+  string shortLivedTaskPath = slave::paths::getTaskPath(
+      slave::paths::getMetaRootDir(flags.work_dir),
+      devolve(agentId),
+      devolve(frameworkId),
+      devolve(executorInfo.executor_id()),
+      devolve(
+          shortStartingUpdate->status()
+            .container_status().container_id().parent()),
+      devolve(shortLivedTaskInfo.task_id()));
+
+  ASSERT_TRUE(os::exists(shortLivedTaskPath));
+
+  string shortLivedSandboxPath = path::join(
+      slave::paths::getExecutorRunPath(
+          flags.work_dir,
+            devolve(agentId),
+            devolve(frameworkId),
+            devolve(executorInfo.executor_id()),
+            devolve(
+                shortStartingUpdate->status()
+                  .container_status().container_id().parent())),
+      "containers",
+      shortStartingUpdate->status().container_status().container_id().value());
+
+  ASSERT_TRUE(os::exists(shortLivedSandboxPath));
+
+  // Check another metadata directory that should only be GC'd after the
+  // executor exits.
+  string executorMetaPath = slave::paths::getExecutorPath(
+      slave::paths::getMetaRootDir(flags.work_dir),
+      devolve(agentId),
+      devolve(frameworkId),
+      devolve(executorInfo.executor_id()));
+
+  ASSERT_TRUE(os::exists(executorMetaPath));
+
+  // Trigger garbage collection on the short-lived task's directories
+  // and check that those are properly deleted.
+  Clock::pause();
+  Clock::advance(flags.gc_delay);
+  Clock::settle();
+  Clock::resume();
+
+  ASSERT_FALSE(os::exists(shortLivedTaskPath));
+  ASSERT_FALSE(os::exists(shortLivedSandboxPath));
+  ASSERT_TRUE(os::exists(executorMetaPath));
+
+  // Kill the remaining task and trigger garbage collection again.
+  Future<v1::scheduler::Event::Update> killedUpdate;
+  EXPECT_CALL(*scheduler, update(_, _))
+    .WillOnce(DoAll(
+        FutureArg<1>(&killedUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  // Since this is the last executor belonging to the framework, we expect
+  // multiple directories to be scheduled for GC:
+  //   * Task, Executor container, Executor, and Framework metadata directories.
+  //   * Executor sandbox and run directories.
+  //   * Framework work directory.
+  schedules = {
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule)
+  };
+
+  mesos.send(v1::createCallKill(frameworkId, longLivedTaskInfo.task_id()));
+
+  AWAIT_READY(killedUpdate);
+  EXPECT_EQ(v1::TASK_KILLED, killedUpdate->status().state());
+  EXPECT_EQ(longLivedTaskInfo.task_id(), killedUpdate->status().task_id());
+
+  AWAIT_READY(collect(schedules));
+
+  // Trigger GC and then check one of the directories above.
+  Clock::pause();
+  Clock::advance(flags.gc_delay);
+  Clock::settle();
+  Clock::resume();
+
+  ASSERT_FALSE(os::exists(executorMetaPath));
+}
+
+
+// This test verifies that task metadata and sandboxes are scheduled for GC
+// when a task finishes, but the executor is still running. This version of
+// the test restarts the agent to ensure recovered tasks are also scheduled
+// for GC.
+TEST_F(GarbageCollectorIntegrationTest, LongLivedDefaultExecutorRestart)
+{
+  Try<Owned<cluster::Master>> master = StartMaster();
+  ASSERT_SOME(master);
+
+  // We need this for the agent's work directory and GC policy.
+  slave::Flags flags = CreateSlaveFlags();
+
+  // Turn on GC of nested container sandboxes by default.
+  flags.gc_non_executor_container_sandboxes = true;
+
+  Owned<MasterDetector> detector = master.get()->createDetector();
+  Try<Owned<cluster::Slave>> slave = StartSlave(detector.get(), flags);
+  ASSERT_SOME(slave);
+
+  auto scheduler = std::make_shared<v1::MockHTTPScheduler>();
+
+  // Enable checkpointing, otherwise there will be no metadata to GC.
+  v1::FrameworkInfo frameworkInfo = v1::DEFAULT_FRAMEWORK_INFO;
+  frameworkInfo.set_checkpoint(true);
+
+  EXPECT_CALL(*scheduler, connected(_))
+    .WillOnce(v1::scheduler::SendSubscribe(frameworkInfo));
+
+  Future<v1::scheduler::Event::Subscribed> subscribed;
+  EXPECT_CALL(*scheduler, subscribed(_, _))
+    .WillOnce(FutureArg<1>(&subscribed));
+
+  Future<v1::scheduler::Event::Offers> offers;
+  EXPECT_CALL(*scheduler, offers(_, _))
+    .WillOnce(FutureArg<1>(&offers))
+    .WillRepeatedly(Return());
+
+  EXPECT_CALL(*scheduler, heartbeat(_))
+    .WillRepeatedly(Return()); // Ignore heartbeats.
+
+  v1::scheduler::TestMesos mesos(
+      master.get()->pid,
+      ContentType::PROTOBUF,
+      scheduler);
+
+  AWAIT_READY(subscribed);
+  v1::FrameworkID frameworkId(subscribed->framework_id());
+
+  AWAIT_READY(offers);
+  ASSERT_FALSE(offers->offers().empty());
+
+  const v1::Offer& offer = offers->offers(0);
+  const v1::AgentID& agentId = offer.agent_id();
+
+  v1::Resources resources =
+    v1::Resources::parse("cpus:0.1;mem:32;disk:32").get();
+
+  v1::ExecutorInfo executorInfo = v1::createExecutorInfo(
+      v1::DEFAULT_EXECUTOR_ID,
+      None(),
+      resources,
+      v1::ExecutorInfo::DEFAULT,
+      frameworkId);
+
+  // We launch two tasks for this test:
+  //   * One will be a long-lived task to keep the executor alive.
+  //   * One will be a short-lived task to exercise task metadata/sandbox GC.
+  v1::TaskInfo longLivedTaskInfo =
+    v1::createTask(agentId, resources, SLEEP_COMMAND(1000));
+
+  v1::TaskInfo shortLivedTaskInfo =
+    v1::createTask(agentId, resources, "exit 0");
+
+  // There should be a total of 5 updates:
+  //   * TASK_STARTING/RUNNING from the long-lived task,
+  //   * TASK_STARTING/RUNNING/FINISHED from the short-lived task.
+  testing::Sequence longTask;
+  Future<v1::scheduler::Event::Update> longStartingUpdate;
+  Future<v1::scheduler::Event::Update> longRunningUpdate;
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(longLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_STARTING))))
+    .InSequence(longTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&longStartingUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(longLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_RUNNING))))
+    .InSequence(longTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&longRunningUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  testing::Sequence shortTask;
+  Future<v1::scheduler::Event::Update> shortStartingUpdate;
+  Future<v1::scheduler::Event::Update> shortRunningUpdate;
+  Future<v1::scheduler::Event::Update> shortFinishedUpdate;
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(shortLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_STARTING))))
+    .InSequence(shortTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&shortStartingUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(shortLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_RUNNING))))
+    .InSequence(shortTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&shortRunningUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  EXPECT_CALL(
+      *scheduler,
+      update(_, AllOf(
+          TaskStatusUpdateTaskIdEq(shortLivedTaskInfo),
+          TaskStatusUpdateStateEq(v1::TASK_FINISHED))))
+    .InSequence(shortTask)
+    .WillOnce(DoAll(
+        FutureArg<1>(&shortFinishedUpdate),
+        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+  // There should be two directories scheduled for GC:
+  // the short-lived task's metadata and sandbox.
+  vector<Future<Nothing>> schedules = {
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule)
+  };
+
+  mesos.send(
+      v1::createCallAccept(
+          frameworkId,
+          offer,
+          {
+            v1::LAUNCH_GROUP(
+                executorInfo, v1::createTaskGroupInfo({longLivedTaskInfo})),
+            v1::LAUNCH_GROUP(
+                executorInfo, v1::createTaskGroupInfo({shortLivedTaskInfo}))
+          }));
+
+  AWAIT_READY(collect(schedules));
+  AWAIT_READY(longStartingUpdate);
+  AWAIT_READY(longRunningUpdate);
+  AWAIT_READY(shortStartingUpdate);
+  AWAIT_READY(shortRunningUpdate);
+  AWAIT_READY(shortFinishedUpdate);
+
+  // Check that the short-lived task's metadata and sandbox exist.
+  string shortLivedTaskPath = slave::paths::getTaskPath(
+      slave::paths::getMetaRootDir(flags.work_dir),
+      devolve(agentId),
+      devolve(frameworkId),
+      devolve(executorInfo.executor_id()),
+      devolve(
+          shortStartingUpdate->status()
+            .container_status().container_id().parent()),
+      devolve(shortLivedTaskInfo.task_id()));
+
+  ASSERT_TRUE(os::exists(shortLivedTaskPath));
+
+  string shortLivedSandboxPath = path::join(
+      slave::paths::getExecutorRunPath(
+          flags.work_dir,
+            devolve(agentId),
+            devolve(frameworkId),
+            devolve(executorInfo.executor_id()),
+            devolve(
+                shortStartingUpdate->status()
+                  .container_status().container_id().parent())),
+      "containers",
+      shortStartingUpdate->status().container_status().container_id().value());
+
+  ASSERT_TRUE(os::exists(shortLivedSandboxPath));
+
+  // Restart the agent to wipe out any scheduled GC.
+  Future<SlaveReregisteredMessage> slaveReregisteredMessage =
+    FUTURE_PROTOBUF(SlaveReregisteredMessage(), _, _);
+  slave.get()->terminate();
+
+  // The agent should reregister once recovery is complete, which also means
+  // that any finished tasks metadata/sandboxes should be rescheduled for GC.
+  slave = StartSlave(detector.get(), flags);
+  ASSERT_SOME(slave);
+  AWAIT_READY(slaveReregisteredMessage);
+
+  // Trigger garbage collection on the short-lived task's directories
+  // and check that those are properly deleted.
+  Clock::pause();
+  Clock::advance(flags.gc_delay);
+  Clock::settle();
+  Clock::resume();
+
+  ASSERT_FALSE(os::exists(shortLivedTaskPath));
+  ASSERT_FALSE(os::exists(shortLivedSandboxPath));
+}
+
+
 TEST_F(GarbageCollectorIntegrationTest, DiskUsage)
 {
   Try<Owned<cluster::Master>> master = StartMaster();
diff --git a/src/tests/slave_recovery_tests.cpp b/src/tests/slave_recovery_tests.cpp
index 9710518..5842ccf 100644
--- a/src/tests/slave_recovery_tests.cpp
+++ b/src/tests/slave_recovery_tests.cpp
@@ -33,6 +33,7 @@
 
 #include <mesos/scheduler/scheduler.hpp>
 
+#include <process/collect.hpp>
 #include <process/dispatch.hpp>
 #include <process/gmock.hpp>
 #include <process/gtest.hpp>
@@ -1864,8 +1865,13 @@ TYPED_TEST(SlaveRecoveryTest, RecoverCompletedExecutor)
   Future<RegisterExecutorMessage> registerExecutor =
     FUTURE_PROTOBUF(RegisterExecutorMessage(), _, _);
 
-  Future<Nothing> schedule = FUTURE_DISPATCH(
-      _, &GarbageCollectorProcess::schedule);
+  // We use 'gc.schedule' as a proxy for the cleanup of the executor.
+  // The first event will correspond with the finished task, whereas
+  // the second is associated with the exited executor.
+  vector<Future<Nothing>> schedules = {
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule),
+    FUTURE_DISPATCH(_, &GarbageCollectorProcess::schedule)
+  };
 
   driver.launchTasks(offers1.get()[0].id(), {task});
 
@@ -1873,8 +1879,7 @@ TYPED_TEST(SlaveRecoveryTest, RecoverCompletedExecutor)
   AWAIT_READY(registerExecutor);
   ExecutorID executorId = registerExecutor->executor_id();
 
-  // We use 'gc.schedule' as a proxy for the cleanup of the executor.
-  AWAIT_READY(schedule);
+  AWAIT_READY(collect(schedules));
 
   slave.get()->terminate();
 


[mesos] 06/06: Added MESOS-7947 to the 1.7.0 CHANGELOG.

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 7000e7242864ff99f409d6391a34a3fbf13176d0
Author: Joseph Wu <jo...@apache.org>
AuthorDate: Wed Sep 5 16:52:40 2018 -0700

    Added MESOS-7947 to the 1.7.0 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index bd3d894..588eb82 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -258,6 +258,7 @@ All Resolved Issues:
 ** Improvement
   * [MESOS-6451] - Add timer and percentile for docker pull latency distribution.
   * [MESOS-7691] - Support local enabled cgroups subsystems automatically.
+  * [MESOS-7947] - Add GC capability to nested containers
   * [MESOS-8064] - Add capability so mesos can programmatically decode .zip, .tar, .gzip, and other common file compression schemes
   * [MESOS-8106] - Docker fetcher plugin unsupported scheme failure message is not accurate.
   * [MESOS-8340] - Add a no-enforce option to the `network/ports` isolator.


[mesos] 01/06: Enabled garbage collection of terminated tasks' metadata.

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f7416e79f593bfc2e3a7658697f1d0c104e2c2bd
Author: Joseph Wu <jo...@apache.org>
AuthorDate: Tue Jul 24 13:43:29 2018 -0700

    Enabled garbage collection of terminated tasks' metadata.
    
    This commit schedules tasks' metadata, whose terminal statuses have been
    acknowledged, for garbage collection.  GC occurs according to the
    existing GC policy (controlled by agent flags --gc_delay and
    --gc_disk_headroom).  This change helps mitigate potential accumulation
    of directories for long-lived, multi-task executors, such as the
    default executor.
    
    Review: https://reviews.apache.org/r/68065
---
 src/slave/slave.cpp | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 8255f0a..7425cfb 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -9667,6 +9667,19 @@ void Executor::completeTask(const TaskID& taskId)
     slave->detachTaskVolumeDirectories(info, containerId, {*firstTask});
   }
 
+  // Mark the task metadata (TaskInfo and status updates) for garbage
+  // collection. This is important for keeping the metadata of long-lived,
+  // multi-task executors within reasonable levels.
+  if (checkpoint) {
+    slave->garbageCollect(paths::getTaskPath(
+        slave->metaDir,
+        slave->info.id(),
+        frameworkId,
+        id,
+        containerId,
+        taskId));
+  }
+
   Task* task = terminatedTasks[taskId];
   completedTasks.push_back(shared_ptr<Task>(task));
   terminatedTasks.erase(taskId);