Posted to issues@mesos.apache.org by "Joseph Wu (JIRA)" <ji...@apache.org> on 2015/11/12 20:12:11 UTC

[jira] [Commented] (MESOS-3863) Investigate the requirements of programmatically re-initializing libprocess

    [ https://issues.apache.org/jira/browse/MESOS-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002673#comment-15002673 ] 

Joseph Wu commented on MESOS-3863:
----------------------------------

The interdependency between {{process_manager}} and {{socket_manager}} will complicate things:

* {{process_manager}} holds the {{gc}} and various {{HttpProxy}} processes.
* {{socket_manager}} spawns {{HttpProxy}} processes and relies on {{gc}} to clean them up.
* {{gc}} relies on {{socket_manager}} links to clean up processes.

{{process::finalize}} should:
# Clean up all processes other than {{gc}}.  This will clear all links and delete all {{HttpProxy}} s while {{socket_manager}} still exists.
# Close all sockets via {{SocketManager::close}}.  This cleans up all of {{socket_manager}} 's state, including terminating each {{HttpProxy}}.  Termination is idempotent, so it is safe that the {{HttpProxy}} s were already killed via {{process_manager}} in the previous step.
# At this point, {{socket_manager}} should be empty and only the {{gc}} process should be running.  (Since we're finalizing, assume there are no threads trying to spawn processes.)  {{socket_manager}} can be deleted.
# {{gc}} can be deleted.  This is currently a leaked pointer, so we'll also need to track and delete that.
# {{process_manager}} should be devoid of processes, so we can proceed with cleanup (join threads, stop the {{EventLoop}}, etc).
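
The ordering above can be sketched with a toy model.  This is only an illustration of the proposed sequence, not libprocess code: {{ToyRuntime}}, {{toyFinalize}}, and the step names are hypothetical, and processes and sockets are reduced to plain sets so the invariants between steps can be asserted.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model only: these are NOT the real libprocess types.
struct ToyRuntime {
  std::set<std::string> processes;  // Running processes, including "gc".
  std::set<int> sockets;            // Sockets tracked by the socket manager.
  bool socketManagerAlive = true;
  bool processManagerAlive = true;
  std::vector<std::string> steps;   // Records the order of teardown steps.
};

void toyFinalize(ToyRuntime& rt) {
  // 1. Terminate every process except gc while the socket manager
  //    still exists, so links can be cleared normally.
  for (auto it = rt.processes.begin(); it != rt.processes.end();) {
    if (*it != "gc") {
      it = rt.processes.erase(it);
    } else {
      ++it;
    }
  }
  rt.steps.push_back("terminate non-gc processes");

  // 2. Close all sockets, which drops the socket manager's state.
  rt.sockets.clear();
  rt.steps.push_back("close sockets");

  // 3. The socket manager must be empty and at most gc may remain
  //    before the socket manager is deleted.
  assert(rt.sockets.empty());
  assert(rt.processes.size() <= 1);
  rt.socketManagerAlive = false;
  rt.steps.push_back("delete socket_manager");

  // 4. Delete gc last among the processes.
  rt.processes.erase("gc");
  rt.steps.push_back("delete gc");

  // 5. The process manager holds no processes, so the rest of the
  //    cleanup (joining threads, stopping the event loop) can proceed.
  assert(rt.processes.empty());
  rt.processManagerAlive = false;
  rt.steps.push_back("finalize process_manager");
}
```

The assertions in steps 3 and 5 encode the "should be empty" expectations from the list; in a real implementation they would guard against a process or socket that survived an earlier step.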

> Investigate the requirements of programmatically re-initializing libprocess
> ---------------------------------------------------------------------------
>
>                 Key: MESOS-3863
>                 URL: https://issues.apache.org/jira/browse/MESOS-3863
>             Project: Mesos
>          Issue Type: Task
>          Components: libprocess, test
>            Reporter: Joseph Wu
>            Assignee: Joseph Wu
>              Labels: mesosphere
>
> This issue is for investigating what needs to be added/changed in {{process::finalize}} such that {{process::initialize}} will start on a clean slate.  Additional issues will be created once done.  Also see [the parent issue|MESOS-3820].
> {{process::finalize}} should cover the following components:
> * {{__s__}} (the server socket)
> ** {{delete}} should be sufficient.  This closes the socket and thereby prevents any further interaction from it.
> * {{process_manager}}
> ** Related prior work: [MESOS-3158]
> ** Cleans up the garbage collector, help, logging, profiler, statistics, route processes (including [this one|https://github.com/apache/mesos/blob/3bda55da1d0b580a1b7de43babfdc0d30fbc87ea/3rdparty/libprocess/src/process.cpp#L963], which currently leaks a pointer).
> ** Cleans up any other {{spawn}} 'd process.
> ** Manages the {{EventLoop}}.
> * {{Clock}}
> ** The goal here is to clear any timers so that nothing can dereference {{process_manager}} while we're finalizing/finalized.  It's probably not important to execute any remaining timers, since we're "shutting down" libprocess.  This means:
> *** The clock should be {{paused}} and {{settled}} before the cleanup of {{process_manager}}.
> *** Processes, which might interact with the {{Clock}}, should be cleaned up next.
> *** A new {{Clock::finalize}} method would then clear timers, process-specific clocks, and {{tick}} s; and then {{resume}} the clock.
> * {{__address__}} (the advertised IP and port)
> ** Needs to be cleared after {{process_manager}} has been cleaned up.  Processes use this to communicate events.  If cleared prematurely, {{TerminateEvents}} will not be sent correctly, leading to infinite waits.
> * {{socket_manager}}
> ** The idea here is to close all sockets and deallocate any existing {{HttpProxy}} or {{Encoder}} objects.
> ** All sockets are created via {{__s__}}, so cleaning up the server socket first will prevent any new activity.
> * {{mime}}
> ** This is effectively a static map.
> ** It should be possible to statically initialize it.
> * Synchronization atomics {{initialized}} & {{initializing}}.
> ** Once cleanup is done, these should be reset.
> *Summary*:
> * Implement {{Clock::finalize}}.  [MESOS-3882]
> * Implement {{~SocketManager}}.
> * Clean up {{mime}}.
> * Wrap everything up in {{process::finalize}}.
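
For the "synchronization atomics" item in the description, a minimal sketch of a resettable initialization guard is shown below.  It assumes two atomic flags named after the ones in the description; the {{toy}} namespace, {{initCount}}, and the function bodies are hypothetical and do not reflect the actual libprocess implementation.  It also assumes, as in the comment above, that no other thread calls into the library while finalization runs.

```cpp
#include <atomic>
#include <cassert>

namespace toy {

std::atomic<bool> initialized(false);
std::atomic<bool> initializing(false);
int initCount = 0;  // How many times the setup work actually ran.

// Returns true if this call performed the initialization.
bool initialize() {
  bool expected = false;
  if (!initializing.compare_exchange_strong(expected, true)) {
    // Another caller won the race (or already finished); wait for it.
    while (!initialized.load()) {}
    return false;
  }

  ++initCount;  // Stand-in for the real setup work.
  initialized.store(true);
  return true;
}

void finalize() {
  // ... tear down all other state first ...

  // Reset the guards last so the next initialize() starts clean.
  initialized.store(false);
  initializing.store(false);
}

}  // namespace toy
```

Calling {{toy::initialize()}} twice runs the setup once; after {{toy::finalize()}}, the next {{toy::initialize()}} runs it again, which is the "clean slate" behavior the issue asks for.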


