You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Klaus Ma <kl...@cguru.net> on 2015/09/14 19:41:57 UTC

Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/
-----------------------------------------------------------

(Updated Sept. 14, 2015, 10:41 a.m.)


Review request for mesos and Niklas Nielsen.


Summary (updated)
-----------------

Add containerId to ResourceUsage to enable QoS controller to target a container


Bugs: MESOS-2875
    https://issues.apache.org/jira/browse/MESOS-2875


Repository: mesos


Description
-------

We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.

This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.


Diffs
-----

  include/mesos/mesos.proto b1deed4720cab4a89db76a48bc9563bba4f5bf1c 
  include/mesos/slave/oversubscription.proto fa69a95689c8c75765f0800692655e8dde7dde33 
  src/slave/slave.cpp 5e5522e1254a5ed6084de36782753f9aad2894c4 
  src/tests/oversubscription_tests.cpp 0c5edafc139d9bfb6806d007a0af85e80893bb1a 

Diff: https://reviews.apache.org/r/38253/diff/


Testing
-------

make
make check


Thanks,

Klaus Ma


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Niklas Nielsen <ni...@qni.dk>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/#review99340
-----------------------------------------------------------



include/mesos/mesos.proto (line 820)
<https://reviews.apache.org/r/38253/#comment156232>

    The comment doesn't add a ton of context; let's kill this one or specify that it is the container id for the executor specified in the executor_info field.



include/mesos/slave/oversubscription.proto (lines 41 - 42)
<https://reviews.apache.org/r/38253/#comment156233>

    How about something like:
    
    NOTE: Framework ID and either executor id or executor id **and** container_id must be set to kill a running container. In the latter case, the caller can be assured that the kill request doesn't target a new executor reusing an old executor id.



src/slave/slave.cpp (line 4380)
<https://reviews.apache.org/r/38253/#comment156234>

    Great! Let's include the container id's as well for debugging :)


- Niklas Nielsen


On Sept. 15, 2015, 6:08 a.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38253/
> -----------------------------------------------------------
> 
> (Updated Sept. 15, 2015, 6:08 a.m.)
> 
> 
> Review request for mesos and Niklas Nielsen.
> 
> 
> Bugs: MESOS-2875
>     https://issues.apache.org/jira/browse/MESOS-2875
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.
> 
> This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto b1deed4 
>   include/mesos/slave/oversubscription.proto fa69a95 
>   src/slave/slave.cpp 44865bd 
>   src/tests/oversubscription_tests.cpp 0c5edaf 
> 
> Diff: https://reviews.apache.org/r/38253/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/#review99338
-----------------------------------------------------------


Patch looks great!

Reviews applied: [38253]

All tests passed.

- Mesos ReviewBot


On Sept. 15, 2015, 1:08 p.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38253/
> -----------------------------------------------------------
> 
> (Updated Sept. 15, 2015, 1:08 p.m.)
> 
> 
> Review request for mesos and Niklas Nielsen.
> 
> 
> Bugs: MESOS-2875
>     https://issues.apache.org/jira/browse/MESOS-2875
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.
> 
> This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto b1deed4 
>   include/mesos/slave/oversubscription.proto fa69a95 
>   src/slave/slave.cpp 44865bd 
>   src/tests/oversubscription_tests.cpp 0c5edaf 
> 
> Diff: https://reviews.apache.org/r/38253/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/#review99420
-----------------------------------------------------------


Patch looks great!

Reviews applied: [38253]

All tests passed.

- Mesos ReviewBot


On Sept. 17, 2015, 5:41 p.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38253/
> -----------------------------------------------------------
> 
> (Updated Sept. 17, 2015, 5:41 p.m.)
> 
> 
> Review request for mesos and Niklas Nielsen.
> 
> 
> Bugs: MESOS-2875
>     https://issues.apache.org/jira/browse/MESOS-2875
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.
> 
> This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto 899d52f 
>   include/mesos/slave/oversubscription.proto fa69a95 
>   src/slave/slave.cpp 93353a1 
>   src/tests/oversubscription_tests.cpp 0c5edaf 
> 
> Diff: https://reviews.apache.org/r/38253/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Klaus Ma <kl...@gmail.com>.

> On Dec. 8, 2015, 6:34 a.m., Till Toenshoff wrote:
> > include/mesos/mesos.proto, line 821
> > <https://reviews.apache.org/r/38253/diff/3/?file=1076086#file1076086line821>
> >
> >     We missed to update the V1 API with this change, it seems.

Posted a new RR#41066 to address.


- Klaus


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/#review109207
-----------------------------------------------------------


On Sept. 18, 2015, 1:41 a.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38253/
> -----------------------------------------------------------
> 
> (Updated Sept. 18, 2015, 1:41 a.m.)
> 
> 
> Review request for mesos and Niklas Nielsen.
> 
> 
> Bugs: MESOS-2875
>     https://issues.apache.org/jira/browse/MESOS-2875
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.
> 
> This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto 899d52f 
>   include/mesos/slave/oversubscription.proto fa69a95 
>   src/slave/slave.cpp 93353a1 
>   src/tests/oversubscription_tests.cpp 0c5edaf 
> 
> Diff: https://reviews.apache.org/r/38253/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Till Toenshoff <to...@me.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/#review109207
-----------------------------------------------------------



include/mesos/mesos.proto (line 821)
<https://reviews.apache.org/r/38253/#comment168684>

    We missed to update the V1 API with this change, it seems.


- Till Toenshoff


On Sept. 17, 2015, 5:41 p.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38253/
> -----------------------------------------------------------
> 
> (Updated Sept. 17, 2015, 5:41 p.m.)
> 
> 
> Review request for mesos and Niklas Nielsen.
> 
> 
> Bugs: MESOS-2875
>     https://issues.apache.org/jira/browse/MESOS-2875
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.
> 
> This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto 899d52f 
>   include/mesos/slave/oversubscription.proto fa69a95 
>   src/slave/slave.cpp 93353a1 
>   src/tests/oversubscription_tests.cpp 0c5edaf 
> 
> Diff: https://reviews.apache.org/r/38253/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Klaus Ma <kl...@cguru.net>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/
-----------------------------------------------------------

(Updated Sept. 17, 2015, 5:41 p.m.)


Review request for mesos and Niklas Nielsen.


Changes
-------

Address comments


Bugs: MESOS-2875
    https://issues.apache.org/jira/browse/MESOS-2875


Repository: mesos


Description
-------

We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.

This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.


Diffs (updated)
-----

  include/mesos/mesos.proto 899d52f 
  include/mesos/slave/oversubscription.proto fa69a95 
  src/slave/slave.cpp 93353a1 
  src/tests/oversubscription_tests.cpp 0c5edaf 

Diff: https://reviews.apache.org/r/38253/diff/


Testing
-------

make
make check


Thanks,

Klaus Ma


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Klaus Ma <kl...@cguru.net>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/
-----------------------------------------------------------

(Updated Sept. 15, 2015, 1:08 p.m.)


Review request for mesos and Niklas Nielsen.


Changes
-------

Address comments


Bugs: MESOS-2875
    https://issues.apache.org/jira/browse/MESOS-2875


Repository: mesos


Description
-------

We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.

This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.


Diffs (updated)
-----

  include/mesos/mesos.proto b1deed4 
  include/mesos/slave/oversubscription.proto fa69a95 
  src/slave/slave.cpp 44865bd 
  src/tests/oversubscription_tests.cpp 0c5edaf 

Diff: https://reviews.apache.org/r/38253/diff/


Testing
-------

make
make check


Thanks,

Klaus Ma


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Klaus Ma <kl...@cguru.net>.

> On Sept. 14, 2015, 6:07 p.m., Niklas Nielsen wrote:
> > include/mesos/slave/oversubscription.proto, lines 46-47
> > <https://reviews.apache.org/r/38253/diff/1/?file=1067132#file1067132line46>
> >
> >     We need to add a comment about the semantics of executor_id vs container_id. For example, what happens if both is set? Or do they need to be set together? (Loooks like it from the code below)

Addressed; add comments that executor id must be set if kill specified container.


> On Sept. 14, 2015, 6:07 p.m., Niklas Nielsen wrote:
> > src/slave/slave.cpp, lines 4367-4368
> > <https://reviews.apache.org/r/38253/diff/1/?file=1067133#file1067133line4367>
> >
> >     Style nit:
> >     
> >     ```
> >           const ContainerID& containerId =
> >               kill.has_container_id() ? kill.container_id() : executor->containerId;
> >     ```
> >     
> >     Also, what happens if the container id is wrong? Think we need to do some validation here.

Addressed; if containerId was wrong, log warn message and ignore the kill request.


> On Sept. 14, 2015, 6:07 p.m., Niklas Nielsen wrote:
> > src/tests/oversubscription_tests.cpp, line 859
> > <https://reviews.apache.org/r/38253/diff/1/?file=1067134#file1067134line859>
> >
> >     Mind adding a TODO about doing a full blown object comparison?

Do you mean full comparison on metrics?


- Klaus


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/#review98880
-----------------------------------------------------------


On Sept. 15, 2015, 1:08 p.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38253/
> -----------------------------------------------------------
> 
> (Updated Sept. 15, 2015, 1:08 p.m.)
> 
> 
> Review request for mesos and Niklas Nielsen.
> 
> 
> Bugs: MESOS-2875
>     https://issues.apache.org/jira/browse/MESOS-2875
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.
> 
> This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto b1deed4 
>   include/mesos/slave/oversubscription.proto fa69a95 
>   src/slave/slave.cpp 44865bd 
>   src/tests/oversubscription_tests.cpp 0c5edaf 
> 
> Diff: https://reviews.apache.org/r/38253/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Niklas Nielsen <ni...@qni.dk>.

> On Sept. 14, 2015, 11:07 a.m., Niklas Nielsen wrote:
> > src/tests/oversubscription_tests.cpp, line 859
> > <https://reviews.apache.org/r/38253/diff/1/?file=1067134#file1067134line859>
> >
> >     Mind adding a TODO about doing a full blown object comparison?
> 
> Klaus Ma wrote:
>     Do you mean full comparison on metrics?

Scratch that :) Think we are fine. Will drop the issue


- Niklas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/#review98880
-----------------------------------------------------------


On Sept. 15, 2015, 6:08 a.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38253/
> -----------------------------------------------------------
> 
> (Updated Sept. 15, 2015, 6:08 a.m.)
> 
> 
> Review request for mesos and Niklas Nielsen.
> 
> 
> Bugs: MESOS-2875
>     https://issues.apache.org/jira/browse/MESOS-2875
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.
> 
> This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto b1deed4 
>   include/mesos/slave/oversubscription.proto fa69a95 
>   src/slave/slave.cpp 44865bd 
>   src/tests/oversubscription_tests.cpp 0c5edaf 
> 
> Diff: https://reviews.apache.org/r/38253/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>


Re: Review Request 38253: Add containerId to ResourceUsage to enable QoS controller to target a container

Posted by Niklas Nielsen <ni...@qni.dk>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38253/#review98880
-----------------------------------------------------------



include/mesos/mesos.proto (line 820)
<https://reviews.apache.org/r/38253/#comment155535>

    Let's add a comment about this field



include/mesos/slave/oversubscription.proto (lines 46 - 47)
<https://reviews.apache.org/r/38253/#comment155536>

    We need to add a comment about the semantics of executor_id vs container_id. For example, what happens if both is set? Or do they need to be set together? (Loooks like it from the code below)



src/slave/slave.cpp (lines 4367 - 4368)
<https://reviews.apache.org/r/38253/#comment155538>

    Style nit:
    
    ```
          const ContainerID& containerId =
              kill.has_container_id() ? kill.container_id() : executor->containerId;
    ```
    
    Also, what happens if the container id is wrong? Think we need to do some validation here.



src/tests/oversubscription_tests.cpp (line 807)
<https://reviews.apache.org/r/38253/#comment155551>

    s/to QoS Controller/to the QoS Controller/



src/tests/oversubscription_tests.cpp (line 859)
<https://reviews.apache.org/r/38253/#comment155555>

    Mind adding a TODO about doing a full blown object comparison?


- Niklas Nielsen


On Sept. 14, 2015, 10:41 a.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38253/
> -----------------------------------------------------------
> 
> (Updated Sept. 14, 2015, 10:41 a.m.)
> 
> 
> Review request for mesos and Niklas Nielsen.
> 
> 
> Bugs: MESOS-2875
>     https://issues.apache.org/jira/browse/MESOS-2875
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We should ensure that we are addressing the container which the QoS controller intended to kill. Without this check, we may run into a scenario where the executor has terminated and one with the same id has started in the interim i.e. running in a different container than the one the QoS controller targeted.
> 
> This most likely requires us to add containerId to the ResourceUsage message and encode the containerID in the QoS Correction message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto b1deed4720cab4a89db76a48bc9563bba4f5bf1c 
>   include/mesos/slave/oversubscription.proto fa69a95689c8c75765f0800692655e8dde7dde33 
>   src/slave/slave.cpp 5e5522e1254a5ed6084de36782753f9aad2894c4 
>   src/tests/oversubscription_tests.cpp 0c5edafc139d9bfb6806d007a0af85e80893bb1a 
> 
> Diff: https://reviews.apache.org/r/38253/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>