You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Andrew Schwartzmeyer <an...@schwartzmeyer.com> on 2017/02/08 21:11:26 UTC

Re: Review Request 56362: Windows: Create the Windows container launcher.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/
-----------------------------------------------------------

(Updated Feb. 8, 2017, 9:11 p.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Changes
-------

Squash the copy of the `PosixLauncher` and the refactor into one patch; address some review comments.


Summary (updated)
-----------------

Windows: Create the Windows container launcher.


Bugs: MESOS-6892
    https://issues.apache.org/jira/browse/MESOS-6892


Repository: mesos


Description (updated)
-------

Instead of deriving `WindowsLauncher` from `PosixLauncher`,
this commit implements a separate `WindowsLauncher` derived
from `Launcher` parallel to the `PosixLauncher`.

This launcher is then refactored to use Windows' "Job Objects," which
are similar to Linux's cgroups, and enable us to reason about a group
of processes associated with a task/container as a "Job Object"
instead of a root PID and the tree containing its children. The latter
is not a reasonable approach on Windows, and has been the source of
subtle bugs.

The Job Object approach creates a named job with a one-to-one mapping
to the containerizer, and assigns the first process started for the
task to the job object. After being assigned, the Windows kernel
ensures all spawned child processes
(specifically those made with the `CreateProcess` system call) are
also assigned to the named job object. Thus this job object can then
be used to query the process group's resource usage, kill the process
group, and set limits on the process group.

So instead of seeing a process group as a tree and referring to it by
the root process's PID, the `WindowsLauncher` sees a process group as
a named Job Object, the same way the Windows kernel sees it. However,
the containerizer code which interacts with the launcher still refers
to a task group by the singular PID, and so we have a sort of shim
which maps the initial PID to the name of the job object. This is a an
unfortunate consequence of the shared containerizer code being
originally written for POSIX-like systems.

This abstraction sets us up for implementing resource usage limits on
the process group as a whole.


Diffs (updated)
-----

  src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
  src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
  src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
  src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
  src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
  src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/56362/diff/


Testing
-------


Thanks,

Andrew Schwartzmeyer


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Joseph Wu <jo...@mesosphere.io>.

> On Feb. 9, 2017, 4:24 p.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, line 185
> > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line185>
> >
> >     s/container.get().handle/container->handle/
> 
> Andrew Schwartzmeyer wrote:
>     Ah thanks, didn't know `Option` had `operator->()` defined this way. Why don't we use this elsewhere?

The operator wasn't always available.  We first added it to `Try<>` and then to `Option<>`; for the sake of readability.

We do use it in a couple places, and we like new code to use it where appropriate.


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review165027
-----------------------------------------------------------


On Feb. 9, 2017, 10:50 a.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2017, 10:50 a.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.

> On Feb. 10, 2017, 12:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, line 148
> > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line148>
> >
> >     Ditto about the constructor.

Ditto on the fix.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review165027
-----------------------------------------------------------


On Feb. 9, 2017, 6:50 p.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2017, 6:50 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.

> On Feb. 10, 2017, 12:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, lines 59-60
> > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line59>
> >
> >     I recall some talk about dis-allowing certain `std::` helpers on the `hashmap` and `hashset` classes.  In which case, we'd need to rethink https://reviews.apache.org/r/56363
> >     
> >     Anyway, for this method, I'd recommend constructing a `hashset` of PIDs and checking/updating this set while recovering.  Because O(N).

Is this really a CPU bottleneck on the agent here? The `PosixLauncher` does the [same O(n) operation](https://github.com/andschwa/mesos/blob/master/src/slave/containerizer/mesos/launcher.cpp#L62) (I point it out because this code is pretty much just a derivation of the former).


> On Feb. 10, 2017, 12:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, line 78
> > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line78>
> >
> >     Since you are assigning the `Container` is this way, you don't actually need to specify the constructor in the header.
> >     
> >     So either remove the constructor, or use it in this file.

I'm not sure about this. I didn't add the constructor until I compiled and _couldn't_ brace initialize without it. That said, I have some default member initializers. Jo, are we using C++11 or C++17?

Oh, I forgot I needed to click publish. This is the error you get with MSVC if you use the brace initializer without a constructor:

```
17>C:\Users\andschwa\src\mesos\src\slave\containerizer\mesos\windows_launcher.cpp(78): error C2440: 'initializing': cannot convert from 'initializer list' to 'mesos::internal::slave::WindowsLauncher::Container'
```

Now trying without member initializers...


> On Feb. 10, 2017, 12:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, line 162
> > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line162>
> >
> >     Can you add a TODO here about how executors will die when the agent dies?

Sure, it has been a subject of much debate :D


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review165027
-----------------------------------------------------------


On Feb. 9, 2017, 6:50 p.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2017, 6:50 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.

> On Feb. 10, 2017, 12:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, line 185
> > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line185>
> >
> >     s/container.get().handle/container->handle/

Ah thanks, didn't know `Option` had `operator->()` defined this way. Why don't we use this elsewhere?


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review165027
-----------------------------------------------------------


On Feb. 9, 2017, 6:50 p.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2017, 6:50 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.

> On Feb. 10, 2017, 12:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, lines 59-60
> > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line59>
> >
> >     I recall some talk about dis-allowing certain `std::` helpers on the `hashmap` and `hashset` classes.  In which case, we'd need to rethink https://reviews.apache.org/r/56363
> >     
> >     Anyway, for this method, I'd recommend constructing a `hashset` of PIDs and checking/updating this set while recovering.  Because O(N).
> 
> Andrew Schwartzmeyer wrote:
>     Is this really a CPU bottleneck on the agent here? The `PosixLauncher` does the [same O(n) operation](https://github.com/andschwa/mesos/blob/master/src/slave/containerizer/mesos/launcher.cpp#L62) (I point it out because this code is pretty much just a derivation of the former).

(I don't mind a better operation; where `n` is `containers.size()` and `p` is number of PIDs to check, one time `O(n)` cache of container PIDs to a hashmap upfront, followed by `O(p)` lookups afterward for `O(p)` instead of `O(p*n)`. But then we should do this for the `PosixLauncher` too.)


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review165027
-----------------------------------------------------------


On Feb. 9, 2017, 6:50 p.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2017, 6:50 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.

> On Feb. 10, 2017, 12:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, line 78
> > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line78>
> >
> >     Since you are assigning the `Container` is this way, you don't actually need to specify the constructor in the header.
> >     
> >     So either remove the constructor, or use it in this file.
> 
> Andrew Schwartzmeyer wrote:
>     I'm not sure about this. I didn't add the constructor until I compiled and _couldn't_ brace initialize without it. That said, I have some default member initializers. Jo, are we using C++11 or C++17?
>     
>     Oh, I forgot I needed to click publish. This is the error you get with MSVC if you use the brace initializer without a constructor:
>     
>     ```
>     17>C:\Users\andschwa\src\mesos\src\slave\containerizer\mesos\windows_launcher.cpp(78): error C2440: 'initializing': cannot convert from 'initializer list' to 'mesos::internal::slave::WindowsLauncher::Container'
>     ```
>     
>     Now trying without member initializers...

Yup. I guess we're strictly C++11, and not say C++14. If I'm understanding [this](http://en.cppreference.com/w/cpp/language/aggregate_initialization) correctly, until C++14 you can't aggregate initialize with default memeber initializers. Fixing.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review165027
-----------------------------------------------------------


On Feb. 9, 2017, 6:50 p.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2017, 6:50 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review165027
-----------------------------------------------------------




src/slave/containerizer/mesos/windows_launcher.cpp (lines 19 - 26)
<https://reviews.apache.org/r/56362/#comment236868>

    Of these, I think only `process/process.hpp` is actually used.



src/slave/containerizer/mesos/windows_launcher.cpp (lines 59 - 60)
<https://reviews.apache.org/r/56362/#comment236867>

    I recall some talk about dis-allowing certain `std::` helpers on the `hashmap` and `hashset` classes.  In which case, we'd need to rethink https://reviews.apache.org/r/56363
    
    Anyway, for this method, I'd recommend constructing a `hashset` of PIDs and checking/updating this set while recovering.  Because O(N).



src/slave/containerizer/mesos/windows_launcher.cpp (line 78)
<https://reviews.apache.org/r/56362/#comment236864>

    Since you are assigning the `Container` is this way, you don't actually need to specify the constructor in the header.
    
    So either remove the constructor, or use it in this file.



src/slave/containerizer/mesos/windows_launcher.cpp (line 148)
<https://reviews.apache.org/r/56362/#comment236865>

    Ditto about the constructor.



src/slave/containerizer/mesos/windows_launcher.cpp (line 162)
<https://reviews.apache.org/r/56362/#comment236869>

    Can you add a TODO here about how executors will die when the agent dies?



src/slave/containerizer/mesos/windows_launcher.cpp (line 185)
<https://reviews.apache.org/r/56362/#comment236866>

    s/container.get().handle/container->handle/


- Joseph Wu


On Feb. 9, 2017, 10:50 a.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2017, 10:50 a.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/
-----------------------------------------------------------

(Updated March 27, 2017, 3:44 p.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Bugs: MESOS-6892
    https://issues.apache.org/jira/browse/MESOS-6892


Repository: mesos


Description
-------

Instead of deriving `WindowsLauncher` from `PosixLauncher`,
this commit implements a separate `WindowsLauncher` derived
from `Launcher` parallel to the `PosixLauncher`.

This launcher is then refactored to use Windows' "Job Objects," which
are similar to Linux's cgroups, and enable us to reason about a group
of processes associated with a task/container as a "Job Object"
instead of a root PID and the tree containing its children. The latter
is not a reasonable approach on Windows, and has been the source of
subtle bugs.

The Job Object approach creates a named job with a one-to-one mapping
to the containerizer, and assigns the first process started for the
task to the job object. After being assigned, the Windows kernel
ensures all spawned child processes
(specifically those made with the `CreateProcess` system call) are
also assigned to the named job object. Thus this job object can then
be used to query the process group's resource usage, kill the process
group, and set limits on the process group.

So instead of seeing a process group as a tree and referring to it by
the root process's PID, the `WindowsLauncher` sees a process group as
a named Job Object, the same way the Windows kernel sees it. However,
the containerizer code which interacts with the launcher still refers
to a task group by the singular PID, and so we have a sort of shim
which maps the initial PID to the name of the job object. This is a an
unfortunate consequence of the shared containerizer code being
originally written for POSIX-like systems.

This abstraction sets us up for implementing resource usage limits on
the process group as a whole.


Diffs
-----

  src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
  src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
  src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
  src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
  src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
  src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 


Diff: https://reviews.apache.org/r/56362/diff/4/


Testing
-------


Thanks,

Andrew Schwartzmeyer


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.

> On Feb. 25, 2017, 1:24 a.m., Joseph Wu wrote:
> > This is definitely the direction we want to head towards (Job objects, without leaking handles).
> > 
> > But, due to the partial duplication of logic in: https://reviews.apache.org/r/56366/
> > I think we should move much of this logic into libprocess.
> > 
> > Here's the gist of what should be done:
> > 
> > * We'll want a new, Windows-only actor that is created in `process::initialize`.  This will look very similar to the `GarbageCollector` process in `include/process/gc.hpp`.  Something like:
> > ```
> > class JobObjectManager : public Process<JobObjectManager>
> > {
> > public:
> >   JobObjectManager() : ProcessBase("__job_object_manager__") {}
> >   virtual ~JobObjectManager() {}
> > 
> >   // NOTE: Maybe you'll want the actor to create the JobObject itself.
> >   // You could even move the CREATE_JOB parent hook into this file.
> >   void manage(const pid_t pid, const std::string& name, const SharedHandle& handle)
> >   {
> >     objects[pid] = {name, handle};
> >     
> >     process::reap(pid)
> >       .onAny(lambda::bind(&Self::cleanup, lambda::_1, pid));
> >   }
> > 
> > protected:
> >   virtual void cleanup(Future<Option<int>> exit_code, const pid_t pid)
> >   {
> >     CHECK(!exit_code.isPending());
> >     CHECK(!exit_code.isDiscarded());
> > 
> >     Try<Nothing> killJobResult = os::kill_job(objects[pid].handle.get());
> >     CHECK(!killJobResult.isError())
> >       << "Failed to kill job object: " + killJobResult.error();
> > 
> >     objects.erase(pid);
> >   }
> > 
> > private:
> >   struct JobData {
> >     std::string name;
> >     SharedHandle handle;
> >   }
> >   std::map<pid, JobData> objects;
> > };
> > ```
> > * You can either expose the actor's methods via `process.hpp` or via moving the ParentHook into the new actor.
> > * The WindowsLauncher would become a thin wrapper around the new actor.
> > * https://reviews.apache.org/r/56366/ should become much simpler.

Hm, okay, I like your idea. Let's chat and go over the design you're proposing.


> On Feb. 25, 2017, 1:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, lines 50-62
> > <https://reviews.apache.org/r/56362/diff/4/?file=1630626#file1630626line50>
> >
> >     Erm... saving the set of PIDs in a prior loop guarantees that the `pids.contains(pid)` condition will be met.

containers != containerStates

The containers is what's already tracked in the launcher; the states is what's being recovered from disk.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review166640
-----------------------------------------------------------


On Feb. 11, 2017, 2:04 a.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2017, 2:04 a.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.

> On Feb. 25, 2017, 1:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, lines 50-62
> > <https://reviews.apache.org/r/56362/diff/4/?file=1630626#file1630626line50>
> >
> >     Erm... saving the set of PIDs in a prior loop guarantees that the `pids.contains(pid)` condition will be met.
> 
> Andrew Schwartzmeyer wrote:
>     containers != containerStates
>     
>     The containers is what's already tracked in the launcher; the states is what's being recovered from disk.

s/containerStates/states/g


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review166640
-----------------------------------------------------------


On Feb. 11, 2017, 2:04 a.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2017, 2:04 a.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review166640
-----------------------------------------------------------



This is definitely the direction we want to head towards (Job objects, without leaking handles).

But, due to the partial duplication of logic in: https://reviews.apache.org/r/56366/
I think we should move much of this logic into libprocess.

Here's the gist of what should be done:

* We'll want a new, Windows-only actor that is created in `process::initialize`.  This will look very similar to the `GarbageCollector` process in `include/process/gc.hpp`.  Something like:
```
class JobObjectManager : public Process<JobObjectManager>
{
public:
  JobObjectManager() : ProcessBase("__job_object_manager__") {}
  virtual ~JobObjectManager() {}

  // NOTE: Maybe you'll want the actor to create the JobObject itself.
  // You could even move the CREATE_JOB parent hook into this file.
  void manage(const pid_t pid, const std::string& name, const SharedHandle& handle)
  {
    objects[pid] = {name, handle};
    
    process::reap(pid)
      .onAny(lambda::bind(&Self::cleanup, lambda::_1, pid));
  }

protected:
  virtual void cleanup(Future<Option<int>> exit_code, const pid_t pid)
  {
    CHECK(!exit_code.isPending());
    CHECK(!exit_code.isDiscarded());

    Try<Nothing> killJobResult = os::kill_job(objects[pid].handle.get());
    CHECK(!killJobResult.isError())
      << "Failed to kill job object: " + killJobResult.error();

    objects.erase(pid);
  }

private:
  struct JobData {
    std::string name;
    SharedHandle handle;
  }
  std::map<pid, JobData> objects;
};
```
* You can either expose the actor's methods via `process.hpp` or via moving the ParentHook into the new actor.
* The WindowsLauncher would become a thin wrapper around the new actor.
* https://reviews.apache.org/r/56366/ should become much simpler.


src/slave/containerizer/mesos/windows_launcher.cpp (lines 50 - 62)
<https://reviews.apache.org/r/56362/#comment238647>

    Erm... saving the set of PIDs in a prior loop guarantees that the `pids.contains(pid)` condition will be met.


- Joseph Wu


On Feb. 10, 2017, 6:04 p.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2017, 6:04 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/
-----------------------------------------------------------

(Updated Feb. 11, 2017, 2:04 a.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Changes
-------

Requested changes.


Bugs: MESOS-6892
    https://issues.apache.org/jira/browse/MESOS-6892


Repository: mesos


Description
-------

Instead of deriving `WindowsLauncher` from `PosixLauncher`,
this commit implements a separate `WindowsLauncher` derived
from `Launcher` parallel to the `PosixLauncher`.

This launcher is then refactored to use Windows' "Job Objects," which
are similar to Linux's cgroups, and enable us to reason about a group
of processes associated with a task/container as a "Job Object"
instead of a root PID and the tree containing its children. The latter
is not a reasonable approach on Windows, and has been the source of
subtle bugs.

The Job Object approach creates a named job with a one-to-one mapping
to the containerizer, and assigns the first process started for the
task to the job object. After being assigned, the Windows kernel
ensures all spawned child processes
(specifically those made with the `CreateProcess` system call) are
also assigned to the named job object. Thus this job object can then
be used to query the process group's resource usage, kill the process
group, and set limits on the process group.

So instead of seeing a process group as a tree and referring to it by
the root process's PID, the `WindowsLauncher` sees a process group as
a named Job Object, the same way the Windows kernel sees it. However,
the containerizer code which interacts with the launcher still refers
to a task group by the singular PID, and so we have a sort of shim
which maps the initial PID to the name of the job object. This is a an
unfortunate consequence of the shared containerizer code being
originally written for POSIX-like systems.

This abstraction sets us up for implementing resource usage limits on
the process group as a whole.


Diffs (updated)
-----

  src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
  src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
  src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
  src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
  src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
  src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/56362/diff/


Testing
-------


Thanks,

Andrew Schwartzmeyer


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/
-----------------------------------------------------------

(Updated Feb. 9, 2017, 6:50 p.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Changes
-------

Remove a required dependency because Review Bot can't handle it.


Bugs: MESOS-6892
    https://issues.apache.org/jira/browse/MESOS-6892


Repository: mesos


Description
-------

Instead of deriving `WindowsLauncher` from `PosixLauncher`,
this commit implements a separate `WindowsLauncher` derived
from `Launcher` parallel to the `PosixLauncher`.

This launcher is then refactored to use Windows' "Job Objects," which
are similar to Linux's cgroups, and enable us to reason about a group
of processes associated with a task/container as a "Job Object"
instead of a root PID and the tree containing its children. The latter
is not a reasonable approach on Windows, and has been the source of
subtle bugs.

The Job Object approach creates a named job with a one-to-one mapping
to the containerizer, and assigns the first process started for the
task to the job object. After being assigned, the Windows kernel
ensures all spawned child processes
(specifically those made with the `CreateProcess` system call) are
also assigned to the named job object. Thus this job object can then
be used to query the process group's resource usage, kill the process
group, and set limits on the process group.

So instead of seeing a process group as a tree and referring to it by
the root process's PID, the `WindowsLauncher` sees a process group as
a named Job Object, the same way the Windows kernel sees it. However,
the containerizer code which interacts with the launcher still refers
to a task group by the singular PID, and so we have a sort of shim
which maps the initial PID to the name of the job object. This is a an
unfortunate consequence of the shared containerizer code being
originally written for POSIX-like systems.

This abstraction sets us up for implementing resource usage limits on
the process group as a whole.


Diffs
-----

  src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
  src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
  src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
  src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
  src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
  src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/56362/diff/


Testing
-------


Thanks,

Andrew Schwartzmeyer


Re: Review Request 56362: Windows: Create the Windows container launcher.

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/
-----------------------------------------------------------

(Updated Feb. 8, 2017, 9:28 p.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Changes
-------

Fix dependencies.


Bugs: MESOS-6892
    https://issues.apache.org/jira/browse/MESOS-6892


Repository: mesos


Description
-------

Instead of deriving `WindowsLauncher` from `PosixLauncher`,
this commit implements a separate `WindowsLauncher` derived
from `Launcher` parallel to the `PosixLauncher`.

This launcher is then refactored to use Windows' "Job Objects," which
are similar to Linux's cgroups, and enable us to reason about a group
of processes associated with a task/container as a "Job Object"
instead of a root PID and the tree containing its children. The latter
is not a reasonable approach on Windows, and has been the source of
subtle bugs.

The Job Object approach creates a named job with a one-to-one mapping
to the containerizer, and assigns the first process started for the
task to the job object. After being assigned, the Windows kernel
ensures all spawned child processes
(specifically those made with the `CreateProcess` system call) are
also assigned to the named job object. Thus this job object can then
be used to query the process group's resource usage, kill the process
group, and set limits on the process group.

So instead of seeing a process group as a tree and referring to it by
the root process's PID, the `WindowsLauncher` sees a process group as
a named Job Object, the same way the Windows kernel sees it. However,
the containerizer code which interacts with the launcher still refers
to a task group by the singular PID, and so we have a sort of shim
which maps the initial PID to the name of the job object. This is a an
unfortunate consequence of the shared containerizer code being
originally written for POSIX-like systems.

This abstraction sets us up for implementing resource usage limits on
the process group as a whole.


Diffs
-----

  src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
  src/slave/containerizer/mesos/containerizer.cpp d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
  src/slave/containerizer/mesos/launcher.hpp 79f6eea0ee8e564e90b36208672df150dbc5d540 
  src/slave/containerizer/mesos/launcher.cpp 5114c130efbfb252dde1e85c081f5174e66f57af 
  src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
  src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/56362/diff/


Testing
-------


Thanks,

Andrew Schwartzmeyer