You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Yifan Gu <yi...@mesosphere.io> on 2014/06/06 20:40:19 UTC

Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

Review request for mesos, Adam B, Niklas Nielsen, and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Adam B <ad...@mesosphere.io>.

> On June 11, 2014, 1:18 a.m., Yifan Gu wrote:
> > src/slave/slave.cpp, line 1204
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1204>
> >
> >     I found that these kind of checkers 
> >     
> >     (check framework/executor/slave state, then do some action)
> >     
> >     appears in several other places. I think I can try to put them in a separate module to make the code more reusable and more readable.
> >     Does anyone know that if there is any JIRA related?
> >     I would like to think about this problem if nobody else is interested, because I really don't like copy-pasting, although I did some here :P
> >
> 
> Yifan Gu wrote:
>     Forgot to say thanks!

Agreed, there are a lot of repeated "checkers" for these kinds of things.
I can't find anything in JIRA for this, so feel free to open up a new ticket.


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45347
-----------------------------------------------------------


On June 11, 2014, 2:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 2:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 11, 2014, 8:18 a.m., Yifan Gu wrote:
> > src/slave/slave.cpp, line 1204
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1204>
> >
> >     I found that these kind of checkers 
> >     
> >     (check framework/executor/slave state, then do some action)
> >     
> >     appears in several other places. I think I can try to put them in a separate module to make the code more reusable and more readable.
> >     Does anyone know that if there is any JIRA related?
> >     I would like to think about this problem if nobody else is interested, because I really don't like copy-pasting, although I did some here :P
> >

Forgot to say thanks!


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45347
-----------------------------------------------------------


On June 11, 2014, 8:06 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 8:06 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 11, 2014, 8:18 a.m., Yifan Gu wrote:
> > src/slave/slave.cpp, line 1204
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1204>
> >
> >     I found that these kind of checkers 
> >     
> >     (check framework/executor/slave state, then do some action)
> >     
> >     appears in several other places. I think I can try to put them in a separate module to make the code more reusable and more readable.
> >     Does anyone know that if there is any JIRA related?
> >     I would like to think about this problem if nobody else is interested, because I really don't like copy-pasting, although I did some here :P
> >
> 
> Yifan Gu wrote:
>     Forgot to say thanks!
> 
> Adam B wrote:
>     Agreed, there are a lot of repeated "checkers" for these kinds of things.
>     I can't find anything in JIRA for this, so feel free to open up a new ticket.

New issue created:
https://issues.apache.org/jira/browse/MESOS-1484


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45347
-----------------------------------------------------------


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45347
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80162>

    I found that these kind of checkers 
    
    (check framework/executor/slave state, then do some action)
    
    appears in several other places. I think I can try to put them in a separate module to make the code more reusable and more readable.
    Does anyone know that if there is any JIRA related?
    I would like to think about this problem if nobody else is interested, because I really don't like copy-pasting, although I did some here :P
    


- Yifan Gu


On June 11, 2014, 8:06 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 8:06 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45437
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22313]

All tests passed.

- Mesos ReviewBot


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Ben Mahler <be...@gmail.com>.

> On June 13, 2014, 12:56 a.m., Ben Mahler wrote:
> > Hey Yifan, is this the right diff?

Oh sorry, I was looking at the wrong review.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45563
-----------------------------------------------------------


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45563
-----------------------------------------------------------


Hey Yifan, is this the right diff?

- Ben Mahler


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, lines 864-865
> > <https://reviews.apache.org/r/22313/diff/7/?file=607038#file607038line864>
> >
> >     You should set these expectations just before you launch the second task. Also, why "WillRepeatedly"? Are you expecting more than 2 updates?
> 
> Yifan Gu wrote:
>     I forgot why I added the WillRepeatedly here, but I guess this is because the first task will send a TASK_LOST at the end. I will try to verify this in the new version. Thanks! Vinod!
>

Update:
I think we need to keep WillRepeatedly here.
Since in the cleaning up stage, we are shutting the down the 1, scheduler. 2, master. 3, slave.
However the slave can be terminated before the other two, in which case it will send a TASK_LOST to the master.

Same situation can be found at https://issues.apache.org/jira/browse/MESOS-1460


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45858
-----------------------------------------------------------


On June 18, 2014, 5:14 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 18, 2014, 5:14 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1204-1206
> > <https://reviews.apache.org/r/22313/diff/7/?file=607037#file607037line1204>
> >
> >     The slave cannot be in RECOVERING. See the comment on line #1099.

I see! Thank you!


> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1219-1224
> > <https://reviews.apache.org/r/22313/diff/7/?file=607037#file607037line1219>
> >
> >     Is this possible? AFAICT, since the task was added to the executor the executor shouldn't be removed between _runTask() and __runTask(). Even if the executor terminates in between, this task should've been marked 'terminated' but not 'completed' (i.e., waiting for an ACK) and hence the executor won't be removed from the framework's map. Since there is a pending executor, the framework shouldn't be removed.
> >     
> >     So this can be a CHECK_NONTULL(framework) with a comment on why it can be a check.

Good point! I am currently trying to add a test to kill the framework before the containerizer->update() returns to test this. Thanks for pointing out!


> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, lines 803-804
> > <https://reviews.apache.org/r/22313/diff/7/?file=607038#file607038line803>
> >
> >     I don't see where this test is ensuring that resources are isolated before launching tasks?
> >     
> >     I would recommend splitting this into 2 tests
> >     
> >     1) ensure resources are updated before task is launched
> >     
> >     2) containerizer failed causes task lost to be sent.
> >     
> >     i would also recommend writing a test that ensures that framework and executor are not removed between _runTask and __runTask().

Doing!


> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, lines 864-865
> > <https://reviews.apache.org/r/22313/diff/7/?file=607038#file607038line864>
> >
> >     You should set these expectations just before you launch the second task. Also, why "WillRepeatedly"? Are you expecting more than 2 updates?

I forgot why I added the WillRepeatedly here, but I guess this is because the first task will send a TASK_LOST at the end. I will try to verify this in the new version. Thanks! Vinod!


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45858
-----------------------------------------------------------


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.

> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1219-1224
> > <https://reviews.apache.org/r/22313/diff/7/?file=607037#file607037line1219>
> >
> >     Is this possible? AFAICT, since the task was added to the executor the executor shouldn't be removed between _runTask() and __runTask(). Even if the executor terminates in between, this task should've been marked 'terminated' but not 'completed' (i.e., waiting for an ACK) and hence the executor won't be removed from the framework's map. Since there is a pending executor, the framework shouldn't be removed.
> >     
> >     So this can be a CHECK_NONTULL(framework) with a comment on why it can be a check.
> 
> Yifan Gu wrote:
>     Good point! I am currently trying to add a test to kill the framework before the containerizer->update() returns to test this. Thanks for pointing out!
> 
> Yifan Gu wrote:
>     Hi Vinod, I found that the framework does have a chance to be NULL here. Seems that since the executor is not in framework->pending() at this time (it is removed from the pending queue at the beginning of _runTask()), so the executor can be removed.
>     
>     These shutdown executor/framework logic is really not easy to tell from a single glance, so I have done an experiment.
>     
>     I have uploaded a test and logs, it shows that the framework can be removed before __runTask() is called. 
>     I really hope you could take a look to see if I missed some stuff. Thank you!
>     
>     I think maybe I can add the task to the pending queue again before calling the containerizer->update().

You are right. The executor/framework will not be removed in the normal course of things if the executor has a pending task. But they will be removed if the framework/executor was explicitly asked to shutdown by the master. This is the case in your test. So yes, keep the 'if' block instead of CHECK.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45858
-----------------------------------------------------------


On June 19, 2014, 1:30 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 1:30 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1219-1224
> > <https://reviews.apache.org/r/22313/diff/7/?file=607037#file607037line1219>
> >
> >     Is this possible? AFAICT, since the task was added to the executor the executor shouldn't be removed between _runTask() and __runTask(). Even if the executor terminates in between, this task should've been marked 'terminated' but not 'completed' (i.e., waiting for an ACK) and hence the executor won't be removed from the framework's map. Since there is a pending executor, the framework shouldn't be removed.
> >     
> >     So this can be a CHECK_NONTULL(framework) with a comment on why it can be a check.
> 
> Yifan Gu wrote:
>     Good point! I am currently trying to add a test to kill the framework before the containerizer->update() returns to test this. Thanks for pointing out!

Hi Vinod, I found that the framework does have a chance to be NULL here. Seems that since the executor is not in framework->pending() at this time (it is removed from the pending queue at the beginning of _runTask()), so the executor can be removed.

These shutdown executor/framework logic is really not easy to tell from a single glance, so I have done an experiment.

I have uploaded a test and logs, it shows that the framework can be removed before __runTask() is called. 
I really hope you could take a look to see if I missed some stuff. Thank you!

I think maybe I can add the task to the pending queue again before calling the containerizer->update().


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45858
-----------------------------------------------------------


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, lines 803-804
> > <https://reviews.apache.org/r/22313/diff/7/?file=607038#file607038line803>
> >
> >     I don't see where this test is ensuring that resources are isolated before launching tasks?
> >     
> >     I would recommend splitting this into 2 tests
> >     
> >     1) ensure resources are updated before task is launched
> >     
> >     2) containerizer failed causes task lost to be sent.
> >     
> >     i would also recommend writing a test that ensures that framework and executor are not removed between _runTask and __runTask().
> 
> Yifan Gu wrote:
>     Doing!

Update:

In test one, I tried to catch the update(_, task.resources()) before launching, and wait for a while to make sure that the task won't get launched if the future is not satisfied. And will return TASK_LOST if the future is failed.

In test two, I tried to shutdown the framework between _runTask and __runTask, and make sure that the when the update(_, task.resources()) returns, it will figure out that the framework is terminated and will return directly without sending TASK_LOST.


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45858
-----------------------------------------------------------


On June 19, 2014, 12:38 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 12:38 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45858
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80887>

    The slave cannot be in RECOVERING. See the comment on line #1099.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80889>

    Kill state == RECOVERING.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80912>

    Is this possible? AFAICT, since the task was added to the executor the executor shouldn't be removed between _runTask() and __runTask(). Even if the executor terminates in between, this task should've been marked 'terminated' but not 'completed' (i.e., waiting for an ACK) and hence the executor won't be removed from the framework's map. Since there is a pending executor, the framework shouldn't be removed.
    
    So this can be a CHECK_NONTULL(framework) with a comment on why it can be a check.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80917>

    See my comments above. An executor cannot be removed when it has a pending task.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80919>

    also log the executor id, task id and framework id.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80920>

    include the "(future.isFailed() ? future.failure() : "discarded")" here. though, "discarded" here might not be meaningful for frameworks. hmmm.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment80930>

    I don't see where this test is ensuring that resources are isolated before launching tasks?
    
    I would recommend splitting this into 2 tests
    
    1) ensure resources are updated before task is launched
    
    2) containerizer failed causes task lost to be sent.
    
    i would also recommend writing a test that ensures that framework and executor are not removed between _runTask and __runTask().



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment80928>

    You should set these expectations just before you launch the second task. Also, why "WillRepeatedly"? Are you expecting more than 2 updates?


- Vinod Kone


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 23, 2014, 6:48 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1279-1280
> > <https://reviews.apache.org/r/22313/diff/7-11/?file=607037#file607037line1279>
> >
> >     On second thought lets not expose the future failure/discarded to frameworks. "Failed to update resources of the container" should be enough.

Got it! Will do next week.


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46425
-----------------------------------------------------------


On June 19, 2014, 1:30 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 1:30 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46425
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment81810>

    Lets not do this. It is weird that a task is in framework->pending AND executor->launchedTasks. It should be in one place or the other.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment81813>

    Revert this to 'if'. 



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment81811>

    kill this.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment81815>

    kill this.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment81816>

    s/since the framework is still RUNNING/since it has a pending task/
    
    But as mentioned above, lets revert this to an if.
    
    Also, what happens if the executor is "TERMINATING" or "TERMINATED" here? You should probably just return because executorTerminated() will send a TASK_LOST for this task.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment81817>

    On second thought lets not expose the future failure/discarded to frameworks. "Failed to update resources of the container" should be enough.


- Vinod Kone


On June 19, 2014, 1:30 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 1:30 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.

> On June 23, 2014, 6:50 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1185
> > <https://reviews.apache.org/r/22313/diff/11/?file=612826#file612826line1185>
> >
> >     Also, what about launching the tasks after updating resources in registerExecutor()?
> 
> Yifan Gu wrote:
>     Sounds good, Should I do it in this patch or open a new JIRA?
> 
> Vinod Kone wrote:
>     this one is fine. in fact i would prefer to registerExecutor() one first because that is going to affect all executors. the one in launchTask() when executor is RUNNING can be done latter (maybe in a subsequent patch) because that only affects executors that have multiple tasks.
> 
> Yifan Gu wrote:
>     Ok. Will do, I just reverted this patch.
> 
> Yifan Gu wrote:
>     And I found we need to do this too in reregisterExecutor()...

What did you revert? I still see container update() being called in runTask().


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------


On July 1, 2014, 8:33 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 1, 2014, 8:33 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 605ee4a 
>   src/slave/slave.cpp f42ab60 
>   src/tests/slave_tests.cpp 371a5b8 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.

> On June 23, 2014, 6:50 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1185
> > <https://reviews.apache.org/r/22313/diff/11/?file=612826#file612826line1185>
> >
> >     Also, what about launching the tasks after updating resources in registerExecutor()?
> 
> Yifan Gu wrote:
>     Sounds good, Should I do it in this patch or open a new JIRA?
> 
> Vinod Kone wrote:
>     this one is fine. in fact i would prefer to registerExecutor() one first because that is going to affect all executors. the one in launchTask() when executor is RUNNING can be done latter (maybe in a subsequent patch) because that only affects executors that have multiple tasks.
> 
> Yifan Gu wrote:
>     Ok. Will do, I just reverted this patch.
> 
> Yifan Gu wrote:
>     And I found we need to do this too in reregisterExecutor()...
> 
> Vinod Kone wrote:
>     What did you revert? I still see container update() being called in runTask().
> 
> Yifan Gu wrote:
>     Sorry for confusion. By 'revert' I mean now, I don't put the executor in pending anymore, and I do a check on NULL framework and NULL executor as before.
>     I haven't done the (re)registerExecutor() stuff. Will try to do it tomorrow.
>     
>     But still, I will need some review on this one, since the approach on (re)registerExecutor will almost be the same as this one.
>     Thanks Vinod!
> 
> Vinod Kone wrote:
>     OK. Can you resolve or drop (with comments) the issues that were raised?

Can you add a TODO and open a ticket to track this?


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------


On Aug. 6, 2014, 1:11 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 1:11 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp c12cd0a 
>   src/slave/slave.cpp c56cac8 
>   src/tests/slave_tests.cpp 3a7fee6 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 23, 2014, 6:50 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1185
> > <https://reviews.apache.org/r/22313/diff/11/?file=612826#file612826line1185>
> >
> >     Also, what about launching the tasks after updating resources in registerExecutor()?
> 
> Yifan Gu wrote:
>     Sounds good, Should I do it in this patch or open a new JIRA?
> 
> Vinod Kone wrote:
>     this one is fine. in fact i would prefer to registerExecutor() one first because that is going to affect all executors. the one in launchTask() when executor is RUNNING can be done latter (maybe in a subsequent patch) because that only affects executors that have multiple tasks.

Ok. Will do, I just reverted this patch.


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------


On July 1, 2014, 8:33 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 1, 2014, 8:33 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 605ee4a 
>   src/slave/slave.cpp f42ab60 
>   src/tests/slave_tests.cpp 371a5b8 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 23, 2014, 6:50 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1185
> > <https://reviews.apache.org/r/22313/diff/11/?file=612826#file612826line1185>
> >
> >     Also, what about launching the tasks after updating resources in registerExecutor()?

Sounds good, Should I do it in this patch or open a new JIRA?


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------


On June 19, 2014, 1:30 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 1:30 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.

> On June 23, 2014, 6:50 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1185
> > <https://reviews.apache.org/r/22313/diff/11/?file=612826#file612826line1185>
> >
> >     Also, what about launching the tasks after updating resources in registerExecutor()?
> 
> Yifan Gu wrote:
>     Sounds good, Should I do it in this patch or open a new JIRA?

this one is fine. in fact i would prefer to registerExecutor() one first because that is going to affect all executors. the one in launchTask() when executor is RUNNING can be done latter (maybe in a subsequent patch) because that only affects executors that have multiple tasks.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------


On June 19, 2014, 1:30 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 1:30 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.

> On June 23, 2014, 6:50 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1185
> > <https://reviews.apache.org/r/22313/diff/11/?file=612826#file612826line1185>
> >
> >     Also, what about launching the tasks after updating resources in registerExecutor()?
> 
> Yifan Gu wrote:
>     Sounds good, Should I do it in this patch or open a new JIRA?
> 
> Vinod Kone wrote:
>     this one is fine. in fact i would prefer to registerExecutor() one first because that is going to affect all executors. the one in launchTask() when executor is RUNNING can be done latter (maybe in a subsequent patch) because that only affects executors that have multiple tasks.
> 
> Yifan Gu wrote:
>     Ok. Will do, I just reverted this patch.
> 
> Yifan Gu wrote:
>     And I found we need to do this too in reregisterExecutor()...
> 
> Vinod Kone wrote:
>     What did you revert? I still see container update() being called in runTask().
> 
> Yifan Gu wrote:
>     Sorry for confusion. By 'revert' I mean now, I don't put the executor in pending anymore, and I do a check on NULL framework and NULL executor as before.
>     I haven't done the (re)registerExecutor() stuff. Will try to do it tomorrow.
>     
>     But still, I will need some review on this one, since the approach on (re)registerExecutor will almost be the same as this one.
>     Thanks Vinod!

OK. Can you resolve or drop (with comments) the issues that were raised?


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------


On July 1, 2014, 8:33 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 1, 2014, 8:33 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 605ee4a 
>   src/slave/slave.cpp f42ab60 
>   src/tests/slave_tests.cpp 371a5b8 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 23, 2014, 6:50 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1185
> > <https://reviews.apache.org/r/22313/diff/11/?file=612826#file612826line1185>
> >
> >     Also, what about launching the tasks after updating resources in registerExecutor()?
> 
> Yifan Gu wrote:
>     Sounds good, Should I do it in this patch or open a new JIRA?
> 
> Vinod Kone wrote:
>     this one is fine. in fact i would prefer to registerExecutor() one first because that is going to affect all executors. the one in launchTask() when executor is RUNNING can be done latter (maybe in a subsequent patch) because that only affects executors that have multiple tasks.
> 
> Yifan Gu wrote:
>     Ok. Will do, I just reverted this patch.

And I found we need to do this too in reregisterExecutor()...


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------


On July 1, 2014, 8:33 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 1, 2014, 8:33 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 605ee4a 
>   src/slave/slave.cpp f42ab60 
>   src/tests/slave_tests.cpp 371a5b8 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 23, 2014, 6:50 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1185
> > <https://reviews.apache.org/r/22313/diff/11/?file=612826#file612826line1185>
> >
> >     Also, what about launching the tasks after updating resources in registerExecutor()?
> 
> Yifan Gu wrote:
>     Sounds good, Should I do it in this patch or open a new JIRA?
> 
> Vinod Kone wrote:
>     this one is fine. in fact i would prefer to registerExecutor() one first because that is going to affect all executors. the one in launchTask() when executor is RUNNING can be done latter (maybe in a subsequent patch) because that only affects executors that have multiple tasks.
> 
> Yifan Gu wrote:
>     Ok. Will do, I just reverted this patch.
> 
> Yifan Gu wrote:
>     And I found we need to do this too in reregisterExecutor()...
> 
> Vinod Kone wrote:
>     What did you revert? I still see container update() being called in runTask().

Sorry for confusion. By 'revert' I mean now, I don't put the executor in pending anymore, and I do a check on NULL framework and NULL executor as before.
I haven't done the (re)registerExecutor() stuff. Will try to do it tomorrow.

But still, I will need some review on this one, since the approach on (re)registerExecutor will almost be the same as this one.
Thanks Vinod!


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------


On July 1, 2014, 8:33 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 1, 2014, 8:33 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 605ee4a 
>   src/slave/slave.cpp f42ab60 
>   src/tests/slave_tests.cpp 371a5b8 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review46429
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment81818>

    Also, what about launching the tasks after updating resources in registerExecutor()?


- Vinod Kone


On June 19, 2014, 1:30 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 1:30 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On July 3, 2014, 7:37 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1012
> > <https://reviews.apache.org/r/22313/diff/12/?file=621579#file621579line1012>
> >
> >     Where did "42" hours come from? IIUC, you want to ensure the task wasn't launched because the update is pending?
> >     
> >     One way to do it is to catch the _runTask() dispatch and do a clock::settle(). If the slave were to not wait for update() to finish (which it won't since you fixed it) then scheduler should have got the status update after settling the clock. You can ensure this by ASSERT(status2.isPending()). Does that make sense?

Thanks a lot! Will try this later, but unfortunately, I think I will have to put off this patch for a week...


> On July 3, 2014, 7:37 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1071
> > <https://reviews.apache.org/r/22313/diff/12/?file=621579#file621579line1071>
> >
> >     ditto. move the expectation for offers2 down.

Will do.


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review47311
-----------------------------------------------------------


On July 1, 2014, 8:33 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 1, 2014, 8:33 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 605ee4a 
>   src/slave/slave.cpp f42ab60 
>   src/tests/slave_tests.cpp 371a5b8 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review47311
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment82975>

    include the executor id.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment82979>

    s/cancelling/ignoring/



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment82976>

    Just say "Failed to update resources of the container"



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment82977>

    kill this. sending a "discarded" message to frameworks is weird because they wouldnt know what that means.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82990>

    since you are controlling the clocks, why do you need to do this?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82992>

    Move the expectation for offers2 down to where you expect this to happen.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82982>

    I'm confused. Why are you creating a new offer with fewer resources? Is it because you want to set the resources in the task below? Why not just create a Resources object?
    
    Also, you should use createTask().



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82980>

    s/status0/status1/
    s/status1/status2/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82986>

    Set the expectation for the second task below where you launch it.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82983>

    ditto. see above.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82985>

    s/containerizer->update()/'Containerizer::update()'/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82984>

    Where did "42" hours come from? IIUC, you want to ensure the task wasn't launched because the update is pending?
    
    One way to do it is to catch the _runTask() dispatch and do a clock::settle(). If the slave were to not wait for update() to finish (which it won't since you fixed it) then scheduler should have got the status update after settling the clock. You can ensure this by ASSERT(status2.isPending()). Does that make sense?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82987>

    This is testing the failure case. What about a test for the successful update case? Can you add that one?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82988>

    s/is/are/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82989>

    s/__runTask/'__runTask()'/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82991>

    ditto. why?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82996>

    ditto. why can't advance the clock instead?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82993>

    ditto. move the expectation for offers2 down.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82994>

    ditto. use createTask().



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82997>

    s/so//
    
    s/send out/receive/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82998>

    How about
    
    "Besides, the TASK_LOST update for task1 will not received because the framework is shutdown."



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82995>

    ditto. use createTask().



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment82999>

    Make sure you do a clock::settle() on __runTask() to ensure that it doesn't throw any errors.


Also, please make sure to run the new tests repeatedly for at least a thousand iterations to make sure they are not flaky.

- Vinod Kone


On July 1, 2014, 8:33 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 1, 2014, 8:33 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 605ee4a 
>   src/slave/slave.cpp f42ab60 
>   src/tests/slave_tests.cpp 371a5b8 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review47168
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22313]

All tests passed.

- Mesos ReviewBot


On July 1, 2014, 8:33 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 1, 2014, 8:33 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 605ee4a 
>   src/slave/slave.cpp f42ab60 
>   src/tests/slave_tests.cpp 371a5b8 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On Aug. 5, 2014, 10:22 a.m., Adam B wrote:
> > src/tests/slave_tests.cpp, lines 1170-1172
> > <https://reviews.apache.org/r/22313/diff/17/?file=645030#file645030line1170>
> >
> >     Do you really need to settle before AND after you advance?

Yes, we actually need to settle() before advance(), Otherwise we may advance() before the slave has a chance to call delay() in slave:shutdownExecutor(), which makes the advance() useless.
I have added a comment on this. Thanks for pointing out!


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review49580
-----------------------------------------------------------


On Aug. 6, 2014, 12:36 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 12:36 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp c12cd0a 
>   src/slave/slave.cpp ef921e1 
>   src/tests/slave_tests.cpp 432d8a2 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On Aug. 5, 2014, 10:22 a.m., Adam B wrote:
> > Some minor feedback, but I think it's about ready to commit. I'll let @vinodkone (shepherd) give the final word.
> 
> Vinod Kone wrote:
>     i'll take a look once adam's comments are addressed.

Hi Vinod, I addressed Adam's comments, PTAL! Thanks!


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review49580
-----------------------------------------------------------


On Aug. 6, 2014, 1:11 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 1:11 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp c12cd0a 
>   src/slave/slave.cpp c56cac8 
>   src/tests/slave_tests.cpp 3a7fee6 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.

> On Aug. 5, 2014, 10:22 a.m., Adam B wrote:
> > Some minor feedback, but I think it's about ready to commit. I'll let @vinodkone (shepherd) give the final word.

i'll take a look once adam's comments are addressed.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review49580
-----------------------------------------------------------


On July 29, 2014, 9:05 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 29, 2014, 9:05 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp a896bb6 
>   src/slave/slave.cpp 1d56918 
>   src/tests/slave_tests.cpp e45255a 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On Aug. 5, 2014, 10:22 a.m., Adam B wrote:
> > src/tests/slave_tests.cpp, lines 1238-1239
> > <https://reviews.apache.org/r/22313/diff/17/?file=645030#file645030line1238>
> >
> >     I'm not sure the Times(2) is being enforced if you follow it with WillRepeatedly. Maybe you should do .WillOnce(RUNNING).WillOnce(RUNNING) instead.

Thanks for your reminder Adam,
So I have tested with Times(1) and Times(3), both will fail the test.
And I have checked the documentation, it says the Times(n) will enforce the cardinalities.
https://code.google.com/p/googlemock/wiki/ForDummies#Cardinalities:_How_Many_Times_Will_It_Be_Called?


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review49580
-----------------------------------------------------------


On Aug. 5, 2014, 11:57 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 5, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp c12cd0a 
>   src/slave/slave.cpp ef921e1 
>   src/tests/slave_tests.cpp 432d8a2 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review49580
-----------------------------------------------------------

Ship it!


Some minor feedback, but I think it's about ready to commit. I'll let @vinodkone (shepherd) give the final word.


src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment86807>

    Remove the space before ',' since it should immediately follow the frameworkId.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86808>

    Times(1) is unnecessary/implicit; remove.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86817>

    Why change the resources in the offer itself?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86818>

    Move this down to the createTask/tasks.push_back section. It's confusing just under the "Wait for offers" comment. Ditto for the other tests.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86809>

    Times(1) is unnecessary



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86810>

    Unnecessary



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86811>

    Do you really need to settle before AND after you advance?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86812>

    Unnecessary



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86813>

    Why are you modifying the offer object here?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86814>

    Unnecessary



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86815>

    Comment should be updated to match this test.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86816>

    I'm not sure the Times(2) is being enforced if you follow it with WillRepeatedly. Maybe you should do .WillOnce(RUNNING).WillOnce(RUNNING) instead.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment86819>

    Why change the resources offered?


- Adam B


On July 29, 2014, 2:05 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated July 29, 2014, 2:05 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp a896bb6 
>   src/slave/slave.cpp 1d56918 
>   src/tests/slave_tests.cpp e45255a 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.

> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1254
> > <https://reviews.apache.org/r/22313/diff/24/?file=653349#file653349line1254>
> >
> >     Why are you failing the promise here? Don't you want to set it to Nothing and then make sure that task doesn't launch? Otherwise, what you are testing is "containerizer update failure" path and not "executor/framework removed" path.
> 
> Yifan Gu wrote:
>     Umm, I can return Nothing here, but I think since the "executor/framework removed" path is visited first, so either will test this path.

that is true, but what if someone in the future comes along and changes the order of the if statements in __runTask()?


> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1314
> > <https://reviews.apache.org/r/22313/diff/24/?file=653349#file653349line1314>
> >
> >     Kill this? Why would there be subsequent updates here?
> 
> Yifan Gu wrote:
>     I remember that last time I ignored the subsequent updates, it will cause the tests to fail.
>     https://issues.apache.org/jira/browse/MESOS-1460
>     
>     But I have run these tests for more than 3000 times without .WillRepeatedly(Return()); They haven't failed yet...

maybe it failed earlier because you didn't have the .WillRepeatedly() expectation below where you set the expectation for status2.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50239
-----------------------------------------------------------


On Aug. 12, 2014, 11:07 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 11:07 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 9d4607e 
>   src/slave/slave.cpp 787bd05 
>   src/tests/slave_tests.cpp 69be28f 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <my...@msn.com>.

> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1254
> > <https://reviews.apache.org/r/22313/diff/24/?file=653349#file653349line1254>
> >
> >     Why are you failing the promise here? Don't you want to set it to Nothing and then make sure that task doesn't launch? Otherwise, what you are testing is "containerizer update failure" path and not "executor/framework removed" path.
> 
> Yifan Gu wrote:
>     Umm, I can return Nothing here, but I think since the "executor/framework removed" path is visited first, so either will test this path.
> 
> Vinod Kone wrote:
>     that is true, but what if someone in the future comes along and changes the order of the if statements in __runTask()?

Maybe I am wrong or missing something, but I can't think of any reasons why someone want to change the order of the if statements. Shouldn't we always test those assumptions when we enter a continuation? 
Could you give some examples? Thanks!


> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1314
> > <https://reviews.apache.org/r/22313/diff/24/?file=653349#file653349line1314>
> >
> >     Kill this? Why would there be subsequent updates here?
> 
> Yifan Gu wrote:
>     I remember that last time I ignored the subsequent updates, it will cause the tests to fail.
>     https://issues.apache.org/jira/browse/MESOS-1460
>     
>     But I have run these tests for more than 3000 times without .WillRepeatedly(Return()); They haven't failed yet...
> 
> Vinod Kone wrote:
>     maybe it failed earlier because you didn't have the .WillRepeatedly() expectation below where you set the expectation for status2.

So I removed WillRepeatedly() for the first status update expectation, but still keep them in the later expectations.


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50239
-----------------------------------------------------------


On Aug. 12, 2014, 11:07 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 11:07 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 9d4607e 
>   src/slave/slave.cpp 787bd05 
>   src/tests/slave_tests.cpp 69be28f 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1338
> > <https://reviews.apache.org/r/22313/diff/24/?file=653348#file653348line1338>
> >
> >     I think you also want to return immediately if executor is in TERMINATING/TERMINATED state.
> 
> Yifan Gu wrote:
>     Thanks!
>     I think I need to add tests to test the case of:
>     1, the framework is in terminating/terminated.
>     2, the executor is in terminating/terminated.
>     3, the executor is removed...
>     
>     I am not sure if there is any chance that a framework is not NULL but the executor is NULL...

Looks like when the framework has more than one executors, then it's possible that one of the executors in removed, and the framework is not removed...


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50239
-----------------------------------------------------------


On Aug. 12, 2014, 7:28 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 7:28 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 9d4607e 
>   src/slave/slave.cpp 787bd05 
>   src/tests/slave_tests.cpp 69be28f 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1097
> > <https://reviews.apache.org/r/22313/diff/24/?file=653349#file653349line1097>
> >
> >     I think you can get rid of 'runTask' future and just work with 'updateCall' future. Any reason to need both?

Thanks! Vinod!
Now I changed to

AWAIT_READY(update);
EXPECT_TRUE(status2.isPending());

Do you think

AWAIT_READY(update);
Clock::pause();
Clock::settle();
EXPECT_TRUE(status2.isPending());
Clock::resume();

will be better?


> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1254
> > <https://reviews.apache.org/r/22313/diff/24/?file=653349#file653349line1254>
> >
> >     Why are you failing the promise here? Don't you want to set it to Nothing and then make sure that task doesn't launch? Otherwise, what you are testing is "containerizer update failure" path and not "executor/framework removed" path.

Umm, I can return Nothing here, but I think since the "executor/framework removed" path is visited first, so either will test this path.


> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/tests/slave_tests.cpp, line 1314
> > <https://reviews.apache.org/r/22313/diff/24/?file=653349#file653349line1314>
> >
> >     Kill this? Why would there be subsequent updates here?

I remember that last time I ignored the subsequent updates, it will cause the tests to fail.
https://issues.apache.org/jira/browse/MESOS-1460

But I have run these tests for more than 3000 times without .WillRepeatedly(Return()); They haven't failed yet...


> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1338
> > <https://reviews.apache.org/r/22313/diff/24/?file=653348#file653348line1338>
> >
> >     I think you also want to return immediately if executor is in TERMINATING/TERMINATED state.

Thanks!
I think I need to add tests to test the case of:
1, the framework is in terminating/terminated.
2, the executor is in terminating/terminated.
3, the executor is removed...

I am not sure if there is any chance that a framework is not NULL but the executor is NULL...


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50239
-----------------------------------------------------------


On Aug. 12, 2014, 6:10 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 6:10 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 9d4607e 
>   src/slave/slave.cpp 787bd05 
>   src/tests/slave_tests.cpp 69be28f 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <my...@msn.com>.

> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1338
> > <https://reviews.apache.org/r/22313/diff/24/?file=653348#file653348line1338>
> >
> >     I think you also want to return immediately if executor is in TERMINATING/TERMINATED state.
> 
> Yifan Gu wrote:
>     Thanks!
>     I think I need to add tests to test the case of:
>     1, the framework is in terminating/terminated.
>     2, the executor is in terminating/terminated.
>     3, the executor is removed...
>     
>     I am not sure if there is any chance that a framework is not NULL but the executor is NULL...
> 
> Yifan Gu wrote:
>     Looks like when the framework has more than one executors, then it's possible that one of the executors in removed, and the framework is not removed...
>
> 
> Yifan Gu wrote:
>     I added two more tests to test these pathes.
>     
>     Let me take another time to review that before ping you :)

Hi Vinod, I have updated the diff, I think it's ready to review despite those open issues.
PTAL if you have time, thanks for the patience!


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50239
-----------------------------------------------------------


On Aug. 14, 2014, 6:56 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 14, 2014, 6:56 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 9d4607e 
>   src/slave/slave.cpp 1eaab04 
>   src/tests/slave_tests.cpp 69be28f 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On Aug. 11, 2014, 10:56 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1338
> > <https://reviews.apache.org/r/22313/diff/24/?file=653348#file653348line1338>
> >
> >     I think you also want to return immediately if executor is in TERMINATING/TERMINATED state.
> 
> Yifan Gu wrote:
>     Thanks!
>     I think I need to add tests to test the case of:
>     1, the framework is in terminating/terminated.
>     2, the executor is in terminating/terminated.
>     3, the executor is removed...
>     
>     I am not sure if there is any chance that a framework is not NULL but the executor is NULL...
> 
> Yifan Gu wrote:
>     Looks like when the framework has more than one executors, then it's possible that one of the executors in removed, and the framework is not removed...
>

I added two more tests to test these pathes.

Let me take another time to review that before ping you :)


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50239
-----------------------------------------------------------


On Aug. 12, 2014, 11:07 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 11:07 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 9d4607e 
>   src/slave/slave.cpp 787bd05 
>   src/tests/slave_tests.cpp 69be28f 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50239
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment87891>

    I don't think we need to send an update here because when an executor terminates, we send the update for this task (since it is launched tasks) in executorTerminated(). So just return here.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment87890>

    I think you also want to return immediately if executor is in TERMINATING/TERMINATED state.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87892>

    s/finishing/finishes/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87895>

    Why do you need to advance the clock? reviveOffers() should do an immediate allocation.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87904>

    s/updateCall/update/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87903>

    s/runTask/_runTask/
    
    new line after this.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87906>

    I think you can get rid of 'runTask' future and just work with 'updateCall' future. Any reason to need both?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87907>

    ditto. i don't think you need to advance the clock?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87908>

    s/updateCall/update/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87909>

    Why is the driver.join() here instead of immediately after driver.stop() above?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87920>

    Why are you failing the promise here? Don't you want to set it to Nothing and then make sure that task doesn't launch? Otherwise, what you are testing is "containerizer update failure" path and not "executor/framework removed" path.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87921>

    This should be split up for task1 expectation here and task2 expectation later when that task is launched.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87922>

    declare status2 where its expectation is declared.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87923>

    Kill this? Why would there be subsequent updates here?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment87941>

    no need to advance?


- Vinod Kone


On Aug. 6, 2014, 1:11 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 1:11 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp c12cd0a 
>   src/slave/slave.cpp c56cac8 
>   src/tests/slave_tests.cpp 3a7fee6 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50094
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22313]

All tests passed.

- Mesos ReviewBot


On Aug. 6, 2014, 1:11 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 1:11 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp c12cd0a 
>   src/slave/slave.cpp c56cac8 
>   src/tests/slave_tests.cpp 3a7fee6 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <my...@msn.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 14, 2014, 6:56 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 9d4607e 
  src/slave/slave.cpp 1eaab04 
  src/tests/slave_tests.cpp 69be28f 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <my...@msn.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 14, 2014, 6:54 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 9d4607e 
  src/slave/slave.cpp 1eaab04 
  src/tests/slave_recovery_tests.cpp b53353c 
  src/tests/slave_tests.cpp 69be28f 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review50433
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22313]

All tests passed.

- Mesos ReviewBot


On Aug. 12, 2014, 11:07 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 11:07 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 9d4607e 
>   src/slave/slave.cpp 787bd05 
>   src/tests/slave_tests.cpp 69be28f 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 12, 2014, 11:07 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Added two more tests to test when framework is removed/terminating, executor is removed/terminated.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 9d4607e 
  src/slave/slave.cpp 787bd05 
  src/tests/slave_tests.cpp 69be28f 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 12, 2014, 7:28 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 9d4607e 
  src/slave/slave.cpp 787bd05 
  src/tests/slave_tests.cpp 69be28f 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 12, 2014, 6:10 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 9d4607e 
  src/slave/slave.cpp 787bd05 
  src/tests/slave_tests.cpp 69be28f 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review49886
-----------------------------------------------------------


Ready for another review from @vinodkone (and others)?

- Adam B


On Aug. 5, 2014, 6:11 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated Aug. 5, 2014, 6:11 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp c12cd0a 
>   src/slave/slave.cpp c56cac8 
>   src/tests/slave_tests.cpp 3a7fee6 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
> SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
> SlaveTest.LaunchTaskAfterContainerizerUpdate
> 
> ./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*
> 
> successful times > 2000
> 
> make check
> 
> 
> File Attachments
> ----------------
> 
> framework will exit
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
> log
>   https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 6, 2014, 1:11 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp c12cd0a 
  src/slave/slave.cpp c56cac8 
  src/tests/slave_tests.cpp 3a7fee6 

Diff: https://reviews.apache.org/r/22313/diff/


Testing (updated)
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 6, 2014, 12:37 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp c12cd0a 
  src/slave/slave.cpp ef921e1 
  src/tests/slave_tests.cpp 432d8a2 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 6, 2014, 12:37 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp c12cd0a 
  src/slave/slave.cpp ef921e1 
  src/tests/slave_tests.cpp 432d8a2 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 6, 2014, 12:36 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Addressed Adam's review comments.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp c12cd0a 
  src/slave/slave.cpp ef921e1 
  src/tests/slave_tests.cpp 432d8a2 

Diff: https://reviews.apache.org/r/22313/diff/


Testing (updated)
-------

SlaveTest.WillNotLaunchTaskBeforeContainerizerUpdate
SlaveTest.WillNotLaunchTaskIfFrameworkIsRemoved
SlaveTest.LaunchTaskAfterContainerizerUpdate

./bin/mesos-tests.sh --gtest_repeat=-1 --gtest_break_on_failure --gtest_filter=*LaunchTask*

successful times > 2000


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 6, 2014, 12:17 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp c12cd0a 
  src/slave/slave.cpp ef921e1 
  src/tests/slave_tests.cpp 432d8a2 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 5, 2014, 11:57 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp c12cd0a 
  src/slave/slave.cpp ef921e1 
  src/tests/slave_tests.cpp 432d8a2 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 5, 2014, 11:49 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs
-----

  src/slave/slave.hpp c12cd0a 
  src/slave/slave.cpp ef921e1 
  src/tests/slave_tests.cpp 432d8a2 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated Aug. 5, 2014, 11:49 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp c12cd0a 
  src/slave/slave.cpp ef921e1 
  src/tests/slave_tests.cpp 432d8a2 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated July 29, 2014, 9:05 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

addressed comments.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp a896bb6 
  src/slave/slave.cpp 1d56918 
  src/tests/slave_tests.cpp e45255a 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated July 29, 2014, 8:41 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp a896bb6 
  src/slave/slave.cpp 1d56918 
  src/tests/slave_tests.cpp e45255a 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated July 29, 2014, 8:17 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp a896bb6 
  src/slave/slave.cpp 1d56918 
  src/tests/slave_tests.cpp e45255a 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated July 29, 2014, 1:37 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp a896bb6 
  src/slave/slave.cpp 1d56918 
  src/tests/slave_tests.cpp e45255a 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated July 28, 2014, 8:34 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp a896bb6 
  src/slave/slave.cpp 1d56918 
  src/tests/slave_tests.cpp e45255a 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated July 1, 2014, 8:33 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Reverted.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 605ee4a 
  src/slave/slave.cpp f42ab60 
  src/tests/slave_tests.cpp 371a5b8 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 19, 2014, 1:30 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Fixed expected resources.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 19, 2014, 12:38 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 19, 2014, 12:29 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Added allocation_interval and executor_shutdown_grace_period flags to make the test fast.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 18, 2014, 11:49 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Addressed some comments.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 18, 2014, 5:14 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


File Attachments (updated)
----------------

framework will exit
  https://reviews.apache.org/media/uploaded/files/2014/06/18/fbe73273-7aa9-4faa-b1c5-003ab03042a9__issue-886.diff
log
  https://reviews.apache.org/media/uploaded/files/2014/06/18/84d801a0-5c2a-4bb9-901b-e1962031461c__log


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 11, 2014, 9:32 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Wait future.
Add some EXPECT_CALL to eliminate gtest warnings.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 11, 2014, 9:05 a.m., Adam B wrote:
> > src/slave/slave.cpp, line 1207
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1207>
> >
> >     Am I blind? Where does 'executor' come from? You have executorId to start with, and it's not until later in this function that "Executor* executor = framework->getExecutor(executorId);"
> >     Maybe you should pass the containerId as a parameter as well, in case the executor exited before your containerizer->update() future failed.

diff 5 is a wrong file... Please forget it. I am sorry.


> On June 11, 2014, 9:05 a.m., Adam B wrote:
> > src/slave/slave.cpp, line 1231
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1231>
> >
> >     s/statsu/status/

Will do. Thanks!


> On June 11, 2014, 9:05 a.m., Adam B wrote:
> > src/tests/slave_tests.cpp, line 870
> > <https://reviews.apache.org/r/22313/diff/5/?file=606995#file606995line870>
> >
> >     First, AWAIT_READY(offers2);

Will do.


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45350
-----------------------------------------------------------


On June 11, 2014, 8:50 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 8:50 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Adam B <ad...@mesosphere.io>.

> On June 11, 2014, 2:05 a.m., Adam B wrote:
> > src/slave/slave.cpp, line 1207
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1207>
> >
> >     Am I blind? Where does 'executor' come from? You have executorId to start with, and it's not until later in this function that "Executor* executor = framework->getExecutor(executorId);"
> >     Maybe you should pass the containerId as a parameter as well, in case the executor exited before your containerizer->update() future failed.
> 
> Yifan Gu wrote:
>     diff 5 is a wrong file... Please forget it. I am sorry.
> 
> Adam B wrote:
>     New diff makes more sense, although maybe you should be checking the future before everything else, like in diff 5?
> 
> Yifan Gu wrote:
>     I think if the slave/framework/executor/ isn't alive or is going to die, then the future doesn't matter. 
>     I think in such case the update result is a stale message.

Sounds good to me. You can drop the issue.


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45350
-----------------------------------------------------------


On June 11, 2014, 2:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 2:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 11, 2014, 9:05 a.m., Adam B wrote:
> > src/slave/slave.cpp, line 1207
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1207>
> >
> >     Am I blind? Where does 'executor' come from? You have executorId to start with, and it's not until later in this function that "Executor* executor = framework->getExecutor(executorId);"
> >     Maybe you should pass the containerId as a parameter as well, in case the executor exited before your containerizer->update() future failed.
> 
> Yifan Gu wrote:
>     diff 5 is a wrong file... Please forget it. I am sorry.
> 
> Adam B wrote:
>     New diff makes more sense, although maybe you should be checking the future before everything else, like in diff 5?

I think if the slave/framework/executor/ isn't alive or is going to die, then the future doesn't matter. 
I think in such case the update result is a stale message.


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45350
-----------------------------------------------------------


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 11, 2014, 9:05 a.m., Adam B wrote:
> > src/slave/slave.cpp, line 1231
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1231>
> >
> >     s/statsu/status/
> 
> Yifan Gu wrote:
>     Will do. Thanks!

Well I found this one appears in several other places, I will fix them.


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45350
-----------------------------------------------------------


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Adam B <ad...@mesosphere.io>.

> On June 11, 2014, 2:05 a.m., Adam B wrote:
> > src/slave/slave.cpp, line 1207
> > <https://reviews.apache.org/r/22313/diff/5/?file=606994#file606994line1207>
> >
> >     Am I blind? Where does 'executor' come from? You have executorId to start with, and it's not until later in this function that "Executor* executor = framework->getExecutor(executorId);"
> >     Maybe you should pass the containerId as a parameter as well, in case the executor exited before your containerizer->update() future failed.
> 
> Yifan Gu wrote:
>     diff 5 is a wrong file... Please forget it. I am sorry.

New diff makes more sense, although maybe you should be checking the future before everything else, like in diff 5?


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45350
-----------------------------------------------------------


On June 11, 2014, 1:50 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 1:50 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45350
-----------------------------------------------------------


Looks great. Just a couple of comments from me, and then I'll let Vinod/Ian give the final shepherd seal of approval.


src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80165>

    Am I blind? Where does 'executor' come from? You have executorId to start with, and it's not until later in this function that "Executor* executor = framework->getExecutor(executorId);"
    Maybe you should pass the containerId as a parameter as well, in case the executor exited before your containerizer->update() future failed.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment80166>

    s/statsu/status/



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/22313/#comment80168>

    First, AWAIT_READY(offers2);


- Adam B


On June 11, 2014, 1:50 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 1:50 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 11, 2014, 8:50 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Replace the wrong diff file


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 11, 2014, 8:06 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Added more slave, framework, executor status check in __runTask.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Dominic Hamon <dh...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45251
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22313]

All tests passed.

- Dominic Hamon


On June 6, 2014, 9:31 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 6, 2014, 9:31 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45030
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22313]

All tests passed.

- Mesos ReviewBot


On June 6, 2014, 9:31 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 6, 2014, 9:31 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 6, 2014, 9:31 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

Fix a typo.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 6, 2014, 9:30 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 6, 2014, 7 p.m., Ian Downes wrote:
> > src/slave/slave.cpp, line 1211
> > <https://reviews.apache.org/r/22313/diff/2/?file=604829#file604829line1211>
> >
> >     Could the framework be removed while the update is happening?

Good catch! Thanks!
I think I can simply check the NULL pointer here instead of putting the task into the pending map of the framework. Because even though the framework won't get shutdown when it has a pending task.
The executor will have a chance to be shutdown and cause a NULL pointer.

Please point me out if I am making some wrong assumption.
Thanks


> On June 6, 2014, 7 p.m., Ian Downes wrote:
> > src/slave/slave.cpp, line 1176
> > <https://reviews.apache.org/r/22313/diff/2/?file=604829#file604829line1176>
> >
> >     Please remove these TODOs not that you've addressed them.

Will do.


> On June 6, 2014, 7 p.m., Ian Downes wrote:
> > src/slave/slave.cpp, line 1217
> > <https://reviews.apache.org/r/22313/diff/2/?file=604829#file604829line1217>
> >
> >     So the task is lost but what happens then; should we consider destroying the container?

Good point!
But I think since the executor can be reused later. Maybe we can let it continue to live here for a while?
Cuz otherwise when the executor gets removed later, the container will be destroyed. IMO


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review44952
-----------------------------------------------------------


On June 6, 2014, 6:48 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 6, 2014, 6:48 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Ian Downes <ia...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review44952
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment79535>

    Please remove these TODOs not that you've addressed them.



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment79537>

    Could the framework be removed while the update is happening?



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment79538>

    ditto



src/slave/slave.cpp
<https://reviews.apache.org/r/22313/#comment79536>

    So the task is lost but what happens then; should we consider destroying the container?


- Ian Downes


On June 6, 2014, 11:48 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 6, 2014, 11:48 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 6, 2014, 6:48 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
-------

I and IanD will shepherd this -- vinod.


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


Thanks,

Yifan Gu


Re: Review Request 22313: MESOS-886: Prevented slave from launching tasks before containerize's update completes.

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/
-----------------------------------------------------------

(Updated June 6, 2014, 6:44 p.m.)


Review request for mesos, Adam B, Niklas Nielsen, and Vinod Kone.


Changes
-------

rebase


Bugs: MESOS-886
    https://issues.apache.org/jira/browse/MESOS-886


Repository: mesos-git


Description
-------

Added __runTask() to wait for the completion of containerizer->update() and check the result before sending RunTaskMessage.


Diffs (updated)
-----

  src/slave/slave.hpp 34687e5 
  src/slave/slave.cpp 643c088 
  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22313/diff/


Testing
-------

SlaveTest, CancelTaskIfContainerizerFails

Which tests that if the containerizer->update() return a Failure, the task won't be launched and the scheduler will get TASK_LOST.

make check


Thanks,

Yifan Gu