Posted to commits@mesos.apache.org by bm...@apache.org on 2018/01/09 02:59:10 UTC

mesos git commit: Abort libprocess when a Process throws an uncaught exception.

Repository: mesos
Updated Branches:
  refs/heads/master cc1a33642 -> 3290b401d


Abort libprocess when a Process throws an uncaught exception.

Previously, we logged the exception, terminated the throwing
Process, and continued to run. However, I'm not aware of any
user-level code that handles the unexpected termination of a
Process due to an uncaught exception.

Generally, this means that when an exception is thrown (e.g.
by a bad call to std::map::at), the Process terminates with a
log message, but things get "stuck" and the user has to debug
what is wrong and/or kill the OS process.

Libprocess would likely need to provide some primitives to
better support handling unexpected termination of a Process
in order for us to provide a strategy where we continue running.
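
The failure mode described above can be illustrated with a
minimal standalone sketch. It does not use libprocess or glog:
describeFailure, serveOrAbort, and lookupMissingKey are
hypothetical names invented for this illustration, and
std::cerr followed by std::abort() stands in for LOG(FATAL).

```cpp
#include <cstdlib>
#include <exception>
#include <functional>
#include <iostream>
#include <map>
#include <string>

// Hypothetical helper (not part of libprocess): runs `f` and
// returns a description of the exception it threw, or an empty
// string if `f` returned normally.
std::string describeFailure(const std::function<void()>& f)
{
  try {
    f();
  } catch (const std::exception& e) {
    return std::string("threw exception: ") + e.what();
  } catch (...) {
    return "threw unknown exception";
  }
  return "";
}

// Hypothetical stand-in for the abort-on-exception policy:
// report the failure and abort the whole program instead of
// quietly terminating only the throwing handler.
void serveOrAbort(const std::string& pid, const std::function<void()>& f)
{
  const std::string failure = describeFailure(f);
  if (!failure.empty()) {
    std::cerr << "Aborting: '" << pid << "' " << failure << std::endl;
    std::abort();
  }
}

// Example handler with the kind of bug mentioned above.
void lookupMissingKey()
{
  const std::map<std::string, int> values = {{"a", 1}};
  (void) values.at("b"); // Throws std::out_of_range.
}
```

Calling serveOrAbort("example(1)", lookupMissingKey) would print
the message and abort, which makes the bug visible immediately
rather than leaving the program "stuck".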

Review: https://reviews.apache.org/r/64939


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/3290b401
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/3290b401
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/3290b401

Branch: refs/heads/master
Commit: 3290b401d20f2db2933294470ea8a2356a47c305
Parents: cc1a336
Author: Benjamin Mahler <bm...@apache.org>
Authored: Wed Jan 3 17:09:22 2018 -0800
Committer: Benjamin Mahler <bm...@apache.org>
Committed: Mon Jan 8 18:56:27 2018 -0800

----------------------------------------------------------------------
 3rdparty/libprocess/src/process.cpp | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/3290b401/3rdparty/libprocess/src/process.cpp
----------------------------------------------------------------------
diff --git a/3rdparty/libprocess/src/process.cpp b/3rdparty/libprocess/src/process.cpp
index 75cf1d3..53969a7 100644
--- a/3rdparty/libprocess/src/process.cpp
+++ b/3rdparty/libprocess/src/process.cpp
@@ -2816,8 +2816,19 @@ void ProcessManager::resume(ProcessBase* process)
         state == ProcessBase::State::READY);
 
   if (state == ProcessBase::State::BOTTOM) {
-    try { process->initialize(); }
-    catch (...) { terminate = true; }
+    // In the event that the process throws an exception,
+    // we will abort the program.
+    //
+    // TODO(bmahler): Consider providing recovery mechanisms.
+    try {
+      process->initialize();
+    } catch (const std::exception& e) {
+      LOG(FATAL) << "Aborting libprocess: '" << process->pid << "'"
+                 << " threw exception during initialization: " << e.what();
+    } catch (...) {
+      LOG(FATAL) << "Aborting libprocess: '" << process->pid << "'"
+                 << " threw exception during initialization: unknown";
+    }
 
     state = ProcessBase::State::READY;
     process->state.store(state);
@@ -2915,17 +2926,18 @@ void ProcessManager::resume(ProcessBase* process)
       // Determine if we should terminate.
       terminate = event->is<TerminateEvent>();
 
-      // Now service the event.
+      // Now service the event. In the event that the process
+      // throws an exception, we will abort the program.
+      //
+      // TODO(bmahler): Consider providing recovery mechanisms.
       try {
         process->serve(std::move(*event));
       } catch (const std::exception& e) {
-        LOG(ERROR) << "libprocess: " << process->pid
-                   << " terminating due to " << e.what();
-        terminate = true;
+        LOG(FATAL) << "Aborting libprocess: '" << process->pid << "'"
+                   << " threw exception: " << e.what();
       } catch (...) {
-        LOG(ERROR) << "libprocess: " << process->pid
-                   << " terminating due to unknown exception";
-        terminate = true;
+        LOG(FATAL) << "Aborting libprocess: '" << process->pid << "'"
+                   << " threw unknown exception";
       }
 
       delete event;