You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Greg Mann <gr...@mesosphere.io> on 2017/01/10 21:21:55 UTC

Re: Review Request 54081: Added `--pidfile` option to master and agent binaries.


> On Dec. 6, 2016, 10:42 p.m., Benjamin Mahler wrote:
> > Thanks Ilya!
> > 
> > Have you looked at other pidfile related libraries? Looks like BSD provides some functions for this (they're also available on Linux):
> > https://www.freebsd.org/cgi/man.cgi?query=pidfile&sektion=3&manpath=FreeBSD+6.1-RELEASE
> > https://linux.die.net/man/3/pidfile
> > 
> > It would be great to provide an os-agnostic library within stout for this, e.g.:
> > 
> > ```
> > pidfile::open()
> > pidfile::close()
> > pidfile::remove()
> > ```
> > 
> > I haven't looked too deeply at what we should have these take and return, but I noticed the use of RAII seemed unnecessary in your patch (only the .cpp implementation uses the exposed RAII class? If so, perhaps we can just leave RAII out for now since the callers do not use it). Also it would be great to which process is already running (which is provided by the BSD/Linux functions). Also, another suggestion as a first step here is to avoid Windows support entirely for now if we don't implement the locking aspect of this (as that seems necessary for this to be safely usable?)
> > 
> > We could structure the patches like this:
> > 
> > (1) Addition of pidfile utilities to stout.
> > (2) Tests of the pidfile utilities in stout (should be possible?)
> > (3) Integration of the pidfile utilities into mesos (manually test since the logic will reside in the mains?)
> 
> Ilya Pronin wrote:
>     Thanks for the feedback!
>     
>     > Looks like BSD provides some functions for this (they're also available on Linux)
>     
>     Yes, I saw `libbsd` but since it's not part of POSIX I decided to make things portable.
>     
>     > Also it would be great to which process is already running
>     
>     Good idea! I will add it.
>     
>     > I noticed the use of RAII seemed unnecessary in your patch (only the .cpp implementation uses the exposed RAII class? If so, perhaps we can just leave RAII out for now since the callers do not use it).
>     
>     My intention was to try to be nice and clean up the PID file when daemon exits through `std::exit()` or terminates on `SIGTERM`. For handling the first case I created a static RAII object so its destructor will be called during `std::exit()` cleanup. We could place this object in `main()`. But here comes the second case. Our daemons perform default `SIGTERM` action which is `term`. There's a signal handler in `logging/logging.cpp` where we can unlink the file. For that I introduced a helper function and placed the RAII object inside `common/pid_file.cpp`.
>     
>     So I think a RAII class is a good thing for implementing such thing. I'm OK with moving it to stout and restructuring patches.
>     
>     Also I just thought that I should check if the destructor is executing inside the process that created the PID file to protect from unlinking by forked processes.
>     
>     What do you think?
> 
> Alexander Rukletsov wrote:
>     A problem with the RAII class here is that we sometimes manually have to release the lock (in case of a signal). When I look at this patch I read something like "Hey, you don't have to close & remove the pidfile explicitly, we'll do that for you... well, unless your process is signaled". I'm inclined to require users call `removePidFile()` explicitly to avoid creaing a false feeling of safety.
>     
>     Another question: Why do you want to expose `PIDFile` class in the `.hpp`?
> 
> Ilya Pronin wrote:
>     Well, I wrote it more like "you can remove it manually if you want but if you just forget about it nothing bad will happen" :) The lock will be released automatically when the fd is closed (e.g. on process termination).
>     
>     I think current `SIGTERM` handling is an exceptional case here (even though I assume it is the most frequent way the daemon terminates). Another such case could be calling `std::_Exit()`. These are situations when usual cleanup is not performed and some manual actions may be required.
>     
>     Do we want to always do cleanup manually and explicitly? If this is the accepted way in Mesos then of course RAII is not appliable here.
>     
>     > Why do you want to expose PIDFile class in the .hpp?
>     
>     For testing. Although there's only 1 simple test.
> 
> Benjamin Mahler wrote:
>     Ah ok, I see all of the cases you were looking to cover now!
>     
>     While we figure out if we can leverage RAII and a static global smart pointer, we can start with a simple set of function utilities in stout (such that one could build the RAII wrapper on top of them, if needed).
>     
>     For the static smart pointer, we currently follow Google's style guide rule to ban "static non-POD variables" in order to avoid destructor ordering issues and access during/after destruction issues:
>     https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables
>     
>     We could try to justify an exception here, but it seems that we could just acheive the same behavior using `std::atexit` and `std::at_quick_exit`? Either way, unlinking seems like an "optimization" in that the tooling will have to handle the case where SIGKILL is used and the pidfile remains but has been unlocked. So maybe we could add the `std::atexit` and `std::at_quick_exit` as a distinct patch on top of the "non-optimized" version.
> 
> Ilya Pronin wrote:
>     I agree. Using `std::at_quick_exit()` will even cover an additional case of `std::quick_exit()`.
>     
>     So with free utility functions we will have a design like this (everything is in a namespace):
>     ```c++
>     // Contains path, FD, PID.
>     struct Handle;
>     
>     // Open, lock and write a file at the specified path.
>     Try<Handle> open(const string&);
>     
>     // Close the file.
>     Try<Nothing> close(const Handle&);
>     
>     // Close and remove the file.
>     Try<Nothing> remove(const Handle&);
>     ```
>     
>     Seems like we could still use a class with similar member functions here instead of a C-style handle-struct?

A class with member functions would seem to me to imply that the lifecycle of the pid file is linked to the lifecycle of the class, and it sounds like that's something we want to avoid at the moment?


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54081/#review158240
-----------------------------------------------------------


On Nov. 28, 2016, 12:16 p.m., Ilya Pronin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54081/
> -----------------------------------------------------------
> 
> (Updated Nov. 28, 2016, 12:16 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Vinod Kone.
> 
> 
> Bugs: MESOS-1648
>     https://issues.apache.org/jira/browse/MESOS-1648
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The PID file is created and kept locked while master / agent is running to prevent other instances with the same PID file location from starting.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt d6e213686a44fbca0ed841594ce803151224a5a7 
>   src/Makefile.am dd1626d177b38a6613f18f32bb0668abbb5100e0 
>   src/common/pid_file.hpp PRE-CREATION 
>   src/common/pid_file.cpp PRE-CREATION 
>   src/logging/logging.cpp 70d66a5c396f709e8f27ad0d51315ed6d257f73b 
>   src/master/main.cpp fa7ba1310142a3bef71379ba37fded9b8390aae9 
>   src/slave/main.cpp 8010f8e229e2d820649750c9db0456ecd1b854d3 
>   src/tests/common/pid_file_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/54081/diff/
> 
> 
> Testing
> -------
> 
> Added test to verify that PID file is created upon `PIDFile` object creation and deleted upon its destruction. Ran `make check`.
> 
> 
> Thanks,
> 
> Ilya Pronin
> 
>