You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@aurora.apache.org by zm...@apache.org on 2016/11/04 20:42:10 UTC

aurora git commit: Send SIGTERM to daemonized processes on shutdown.

Repository: aurora
Updated Branches:
  refs/heads/master fb936b545 -> 5410c229f


Send SIGTERM to daemonized processes on shutdown.

Problem

Processes can deamonize and escape the supervision of a coordinator. Using the
Docker Containerizer or the Mesos Containerizer with pid isolation means that
the processes will be come reparented to the sh process that launches the
executor. For example:

```
root@aurora:/# ps xf
  PID TTY      STAT   TIME COMMAND
   48 ?        Ss     0:00 /bin/bash
   86 ?        R+     0:00  _ ps xf
    1 ?        Ss     0:00 /bin/sh -c ${MESOS_SANDBOX=.}/thermos_executor.pex --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config /home/vagrant/aurora/examples/va
    5 ?        Sl     0:02 python2.7 /mnt/mesos/sandbox/thermos_executor.pex --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config /home/vagrant/aurora/examples/vag
   23 ?        S      0:00  _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be152 --
   29 ?        Ss     0:00      _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be15
   32 ?        S      0:00      |   _ /bin/bash -c      while true; do       echo hello world       sleep 10     done
   81 ?        S      0:00      |       _ sleep 10
   31 ?        Ss     0:00      _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be15
   33 ?        S      0:00          _ /bin/bash -c      while true; do       echo hello world       sleep 10     done
   82 ?        S      0:00              _ sleep 10
   47 ?        S      0:00 python ./daemon.py
```

Solution

Ensure processes that escape the supervision of the coordinator reparent to the
runner who can send signals to them on task tear down. We do this by using the
`PR_SET_CHILD_SUBREAPER` flag of `prctl(2)`.

After this change the process tree looks like:
```
root@aurora:/# ps xf
  PID TTY      STAT   TIME COMMAND
   66 ?        Ss     0:00 /bin/bash
   70 ?        R+     0:00  _ ps xf
    1 ?        Ss     0:00 /bin/sh -c ${MESOS_SANDBOX=.}/thermos_executor.pex --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config /home/vagrant/aurora/examples/va
    5 ?        Sl     0:02 python2.7 /mnt/mesos/sandbox/thermos_executor.pex --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config /home/vagrant/aurora/examples/vag
   23 ?        S      0:00  _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b849 --
   33 ?        Ss     0:00      _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b84
   40 ?        S      0:00      |   _ /bin/bash -c      while true; do       echo hello world       sleep 10     done
   63 ?        S      0:00      |       _ sleep 10
   36 ?        Ss     0:00      _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b84
   37 ?        S      0:00      |   _ /bin/bash -c      while true; do       echo hello world       sleep 10     done
   62 ?        S      0:00      |       _ sleep 10
   55 ?        S      0:00      _ python ./daemon.py

```

Now the runner is aware of the reparented procesess can can tear it down cleanly
with a `SIGTERM`.

Testing Done:
src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

Bugs closed: AURORA-1808

Reviewed at https://reviews.apache.org/r/53418/


Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/5410c229
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/5410c229
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/5410c229

Branch: refs/heads/master
Commit: 5410c229f30d6d8e331cdddc5c84b9b2b5313c01
Parents: fb936b5
Author: Zameer Manji <zm...@apache.org>
Authored: Fri Nov 4 13:41:25 2016 -0700
Committer: Zameer Manji <zm...@apache.org>
Committed: Fri Nov 4 13:41:25 2016 -0700

----------------------------------------------------------------------
 RELEASE-NOTES.md                                |  2 +
 .../apache/thermos/common/process_util.py       | 30 +++++++
 src/main/python/apache/thermos/core/helper.py   | 20 +++++
 src/main/python/apache/thermos/core/process.py  | 22 +++++-
 src/main/python/apache/thermos/core/runner.py   |  2 +
 .../aurora/e2e/test_daemonizing_process.aurora  | 83 ++++++++++++++++++++
 .../sh/org/apache/aurora/e2e/test_end_to_end.sh | 29 ++++++-
 7 files changed, 182 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/RELEASE-NOTES.md
----------------------------------------------------------------------
diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md
index d89ef2f..94224be 100644
--- a/RELEASE-NOTES.md
+++ b/RELEASE-NOTES.md
@@ -10,6 +10,8 @@
 - The Aurora Scheduler API supports volume mounts per task for the Mesos
   Containerizer if the scheduler is running with the `-allow_container_volumes`
   flag.
+* The executor will send SIGTERM to processes that self daemonize via double forking.
+* The executor now requires Linux kernel 3.4 or later.
 
 ### Deprecations and removals:
 

http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/main/python/apache/thermos/common/process_util.py
----------------------------------------------------------------------
diff --git a/src/main/python/apache/thermos/common/process_util.py b/src/main/python/apache/thermos/common/process_util.py
index abd2c0e..c63b9af 100644
--- a/src/main/python/apache/thermos/common/process_util.py
+++ b/src/main/python/apache/thermos/common/process_util.py
@@ -12,8 +12,11 @@
 # limitations under the License.
 #
 
+import ctypes
 import os
 
+from twitter.common import log
+
 from gen.apache.aurora.api.constants import TASK_FILESYSTEM_MOUNT_POINT
 
 
@@ -42,3 +45,30 @@ def wrap_with_mesos_containerizer(cmdline, user, cwd, mesos_containerizer_path):
               os.path.join(os.environ['MESOS_DIRECTORY'], TASK_FILESYSTEM_MOUNT_POINT),
               user,
               bash_wrapper % cmdline))
+
+
+def setup_child_subreaping():
+  """
+  This uses the prctl(2) syscall to set the `PR_SET_CHILD_SUBREAPER` flag. This
+  means if any children processes need to be reparented, they will be reparented
+  to this process.
+
+  More documentation here: http://man7.org/linux/man-pages/man2/prctl.2.html
+  and here: https://lwn.net/Articles/474787/
+
+  Callers should reap terminal children to prevent zombies.
+
+  raises OSError if the underlying prctl call fails.
+  raises RuntimeError if libc cannot be found.
+  """
+  log.debug("Calling prctl(2) with PR_SET_CHILD_SUBREAPER")
+  # This constant is taken from prctl.h
+  PR_SET_CHILD_SUBREAPER = 36
+  library_name = ctypes.util.find_library('c')
+  if library_name is None:
+    raise RuntimeError("libc not found")
+  libc = ctypes.CDLL(library_name, use_errno=True)
+  ret = libc.prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0)
+  if ret != 0:
+    errno = ctypes.get_errno()
+    raise OSError(errno, os.strerror(errno))

http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/main/python/apache/thermos/core/helper.py
----------------------------------------------------------------------
diff --git a/src/main/python/apache/thermos/core/helper.py b/src/main/python/apache/thermos/core/helper.py
index 68855e1..0811e84 100644
--- a/src/main/python/apache/thermos/core/helper.py
+++ b/src/main/python/apache/thermos/core/helper.py
@@ -219,6 +219,26 @@ class TaskRunnerHelper(object):
     return state.processes[process_name][-1].coordinator_pid
 
   @classmethod
+  def terminate_orphans(cls, state):
+    """
+    Given the state, send SIGTERM to children that are orphaned processes.
+
+    The direct children of the runner will always be coordinators or orphans.
+    """
+    log.debug('TaskRunnerHelper.terminate_orphans()')
+    process_tree = cls.scan_tree(state)
+
+    coordinator_pids = {p[0] for p in process_tree.values() if p[0]}
+    children_pids = {c.pid for c in psutil.Process().children()}
+    orphaned_pids = children_pids - coordinator_pids
+
+    if len(orphaned_pids) > 0:
+      log.info("Orphaned pids detected: %s", orphaned_pids)
+      for p in orphaned_pids:
+        log.debug("SIGTERM pid %s", p)
+        cls.terminate_pid(p)
+
+  @classmethod
   def terminate_process(cls, state, process_name):
     log.debug('TaskRunnerHelper.terminate_process(%s)' % process_name)
     _, pid, _ = cls._get_process_tuple(state, process_name)

http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/main/python/apache/thermos/core/process.py
----------------------------------------------------------------------
diff --git a/src/main/python/apache/thermos/core/process.py b/src/main/python/apache/thermos/core/process.py
index 3ec43e2..13f9ad5 100644
--- a/src/main/python/apache/thermos/core/process.py
+++ b/src/main/python/apache/thermos/core/process.py
@@ -39,7 +39,7 @@ from twitter.common.lang import Interface
 from twitter.common.quantity import Amount, Data, Time
 from twitter.common.recordio import ThriftRecordReader, ThriftRecordWriter
 
-from apache.thermos.common.process_util import wrap_with_mesos_containerizer
+from apache.thermos.common.process_util import setup_child_subreaping, wrap_with_mesos_containerizer
 
 from gen.apache.aurora.api.constants import TASK_FILESYSTEM_MOUNT_POINT
 from gen.apache.thermos.ttypes import ProcessState, ProcessStatus, RunnerCkpt
@@ -94,6 +94,7 @@ class ProcessBase(object):
   class CheckpointError(Error): pass
   class UnspecifiedSandbox(Error): pass
   class PermissionError(Error): pass
+  class ForkError(Error): pass
 
   CONTROL_WAIT_CHECK_INTERVAL = Amount(100, Time.MILLISECONDS)
   MAXIMUM_CONTROL_WAIT = Amount(1, Time.MINUTES)
@@ -151,8 +152,9 @@ class ProcessBase(object):
       if self._rotate_log_backups <= 0:
         raise ValueError('Log backups cannot be less than one.')
 
-  def _log(self, msg):
-    log.debug('[process:%5s=%s]: %s' % (self._pid, self.name(), msg))
+  def _log(self, msg, exc_info=None):
+    log.debug('[process:%5s=%s]: %s' % (self._pid, self.name(), msg),
+            exc_info=exc_info)
 
   def _getpwuid(self):
     """Returns a tuple of the user (i.e. --user) and current user."""
@@ -283,7 +285,16 @@ class ProcessBase(object):
                           # calls _getpwuid which can raise:
                           #    UnknownUserError
                           #    PermissionError
-    self._pid = self._platform.fork()
+    try:
+      self._pid = self._platform.fork()  # calls setup_child_subreaping which can
+                                         # raise OSError or RuntimeError
+    except (OSError, RuntimeError) as e:
+      # Reraise the exceptions possible from the fork as Process.Error
+      # Note only Python 3 has nice exception chaining, so we do our best here
+      # by logging the original exception and raising ForkError
+      msg = 'Error trying to fork process %s'.format(self._name)
+      self._log(msg, exc_info=True)
+      raise self.ForkError(msg)
     if self._pid == 0:
       self._pid = self._platform.getpid()
       self._wait_for_control()  # can raise CheckpointError
@@ -312,6 +323,9 @@ class RealPlatform(Platform):
     self._fork = fork
 
   def fork(self):
+    # Before we fork, ensure we become the parent of any processes that escape
+    # the cordinator.
+    setup_child_subreaping()
     pid = self._fork()
     if pid == 0:
       self._sanitize()

http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/main/python/apache/thermos/core/runner.py
----------------------------------------------------------------------
diff --git a/src/main/python/apache/thermos/core/runner.py b/src/main/python/apache/thermos/core/runner.py
index 7b9013d..1b63c08 100644
--- a/src/main/python/apache/thermos/core/runner.py
+++ b/src/main/python/apache/thermos/core/runner.py
@@ -873,6 +873,8 @@ class TaskRunner(object):
     return len(launched) > 0
 
   def _terminate_plan(self, plan):
+    TaskRunnerHelper.terminate_orphans(self.state)
+
     for process in plan.running:
       last_run = self._current_process_run(process)
       if last_run and last_run.state in (ProcessState.FORKED, ProcessState.RUNNING):

http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora
----------------------------------------------------------------------
diff --git a/src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora b/src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora
new file mode 100644
index 0000000..3a204f6
--- /dev/null
+++ b/src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora
@@ -0,0 +1,83 @@
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+
+hello_loop = Process(
+  name = 'hello',
+  cmdline = """
+    while true; do
+      echo hello world
+      sleep 10
+    done
+  """)
+
+# Write out a python program that self daemonizes via double forking.
+# Asserts that thermos doesn't lose track of double forking proceses. On task
+# tear down it should be given a SIGTERM so it can shut down cleanly.
+write_program = Process(
+  name = "write_program",
+  cmdline = """
+    cat >./daemon.py <<EOL
+import os
+import signal
+import sys
+import time
+
+def handler(signum, frame):
+    os.remove("{{term_file}}")
+    sys.exit(0)
+
+def main():
+    pid = os.fork()
+    if pid > 0:
+        sys.exit(0)
+
+    os.setsid()
+    os.umask(0)
+
+    pid = os.fork()
+    if pid > 0:
+        sys.exit(0)
+
+    signal.signal(signal.SIGTERM, handler)
+    while True:
+        time.sleep(1)
+
+if __name__ == '__main__':
+    main()
+
+EOL
+  """
+)
+
+run_daemon = Process(
+  name = 'run_daemon',
+  cmdline = 'python ./daemon.py'
+)
+
+task = Task(
+  processes = [hello_loop, write_program, run_daemon],
+  constraints = order(write_program, run_daemon),
+  resources = Resources(cpu=1, ram=1*MB, disk=100*MB)
+)
+
+jobs = [
+  Service(
+    cluster = 'devcluster',
+    environment = 'test',
+    role = 'vagrant',
+    name = 'daemonize',
+    task = task
+  )
+]

http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
----------------------------------------------------------------------
diff --git a/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh b/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
index 67702d2..f014b28 100755
--- a/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
+++ b/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
@@ -58,7 +58,7 @@ check_url_live() {
 test_file_removed() {
   local _file=$1
   local _success=0
-  for i in $(seq 1 10); do
+  for i in {1..10}; do
     if [[ ! -e $_file ]]; then
       _success=1
       break
@@ -66,7 +66,7 @@ test_file_removed() {
     sleep 1
   done
 
-  if [[ "$_success" -ne "1" ]]; then
+  if [[ $_success -ne 1 ]]; then
     echo "File was not removed."
     exit 1
   fi
@@ -391,6 +391,19 @@ test_ephemeral_daemon_with_final() {
   test_file_removed $_stop_file  # Removed by 'final_process'.
 }
 
+test_daemonizing_process() {
+  local _cluster=$1 _role=$2 _env=$3 _job=$4 _config=$5
+  local _jobkey="$_cluster/$_role/$_env/$_job"
+  local _term_file=$(mktemp)
+  local _extra_args="--bind term_file=$_term_file"
+
+  test_create $_jobkey $_config $_extra_args
+  test_observer_ui $_cluster $_role $_job
+  test_job_status $_cluster $_role $_env $_job
+  test_kill $_jobkey
+  test_file_removed $_term_file
+}
+
 restore_netrc() {
   mv ~/.netrc.bak ~/.netrc >/dev/null 2>&1 || true
 }
@@ -481,6 +494,8 @@ TEST_CONFIG_UPDATED_FILE=$EXAMPLE_DIR/http_example_updated.aurora
 TEST_BAD_HEALTHCHECK_CONFIG_UPDATED_FILE=$EXAMPLE_DIR/http_example_bad_healthcheck.aurora
 TEST_EPHEMERAL_DAEMON_WITH_FINAL_JOB=ephemeral_daemon_with_final
 TEST_EPHEMERAL_DAEMON_WITH_FINAL_CONFIG_FILE=$TEST_ROOT/ephemeral_daemon_with_final.aurora
+TEST_DAEMONIZING_PROCESS_JOB=daemonize
+TEST_DAEMONIZING_PROCESS_CONFIG_FILE=$TEST_ROOT/test_daemonizing_process.aurora
 
 BASE_ARGS=(
   $TEST_CLUSTER
@@ -509,6 +524,14 @@ TEST_JOB_EPHEMERAL_DAEMON_WITH_FINAL_ARGS=(
   $TEST_EPHEMERAL_DAEMON_WITH_FINAL_CONFIG_FILE
 )
 
+TEST_DAEMONIZING_PROCESS_ARGS=(
+  $TEST_CLUSTER
+  $TEST_ROLE
+  $TEST_ENV
+  $TEST_DAEMONIZING_PROCESS_JOB
+  $TEST_DAEMONIZING_PROCESS_CONFIG_FILE
+)
+
 trap collect_result EXIT
 
 aurorabuild all
@@ -535,6 +558,8 @@ test_basic_auth_unauthenticated  "${TEST_JOB_ARGS[@]}"
 
 test_ephemeral_daemon_with_final "${TEST_JOB_EPHEMERAL_DAEMON_WITH_FINAL_ARGS[@]}"
 
+test_daemonizing_process "${TEST_DAEMONIZING_PROCESS_ARGS[@]}"
+
 /vagrant/src/test/sh/org/apache/aurora/e2e/test_kerberos_end_to_end.sh
 /vagrant/src/test/sh/org/apache/aurora/e2e/test_bypass_leader_redirect_end_to_end.sh
 RETCODE=0