You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Nishant Suneja <ni...@gmail.com> on 2014/12/26 19:29:01 UTC

Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/
-----------------------------------------------------------

Review request for mesos.


Bugs: MESOS-999
    https://issues.apache.org/jira/browse/MESOS-999


Repository: mesos-git


Description
-------

As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
when the onReady() callback is made for the container's future object, instead of starting the
timer synchronously when the launchExecutor() method of the Framework class is invoked.


Diffs
-----

  src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
  src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 

Diff: https://reviews.apache.org/r/29437/diff/


Testing
-------

make check succeeds.


Thanks,

Nishant Suneja


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.
> 
> Nishant Suneja wrote:
>     Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
>     However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).
>     
>     So, again, any ideas ?
> 
> Timothy Chen wrote:
>     The easiest way is to simply advance the clock (Clock::advance(....)) until the timeout, and verify that the delay call is not yet called. Then you satisfy the future and advance the time again and verify it is called.
> 
> Nishant Suneja wrote:
>     Ok...And whats the best way to verify that the "Slave::registerExecutorTimeout" has been called (after advancing the clock). I could check the state of the executor, and verify that its TERMINATING, which would mean that this method has executed.
>     
>     Any other cleaner way to verify this callback ? I believe that EXPECT_CALL macro only works for classes with MOCK_METHOD set of methods defined. So, I cant use that.
> 
> Timothy Chen wrote:
>     One way to verify this is to use FUTURE_DISPATCH (you can search for examples in existing test). FUTURE_DISPATCH returns a future, which becomes ready when the method provided is dispatched (which means queued to serve in libprocess). Delay shouldn't queue it until the timeout happen.
> 
> Nishant Suneja wrote:
>     Yeah.. FUTURE_DISPATCH should do the job.
>     However, a question about the failure testcase. When we invoke Clock::advance() by a certain duration, any registered timer object which falls within that duration gets
>     called. For 2nd test, I can use the AWAIT_READY() macro on the future, and then advancing the clock should result in our "registerExecutorTimeout" being called, which is good.
>     
>     Although, for the 1st test case, before advancing the clock, I would want to make sure that the launch() function has actually returned, and our current code triggers the timer (otherwise, I would be advancing the clock, when no timer object has actually been set, so obviously nothing will get fired by advancing the clock).
>     EXPECT_CALL gives us a handle to that Future object, but how do I determine that launch() method has returned (still in pending state though), so i can advance the clock ? i.e. on what condition should I wait here?
> 
> Timothy Chen wrote:
>     EXPECT_CALL is called when the method is invoked, and whatever action you chain it with (.WillOnce(Returns(....)) is called immediately, so for example if you want to wait on the call, you can use EXPECT_CALL(containerizer, launch(_, _, _......)).WillOnce(FutureSatisify(&future)) and then AWAIT_READY(future), which the EXPECT_CALL will set the future state to be ready when the launch method is called.
>     
>     I think what I would do though is to create a promise, and return that promise's future instead:
>     
>     Promise<Nothing> promise;
>     EXPECT_CALL(containerizer, launch(_, _, _,....))
>       .WillOnce(Return(promise.future()));
>       
>     And then Clock::pause(), Clock::advance(timeout), and verify that the executor timeout is not ready ASSERT_TRUE(timeoutFuture.isPending())
>     
>     then satisfy the promise so the future is set to ready:
>     
>     promise.set(Nothing())
>     
>     After wards advance the clock and verify that it is ready.
> 
> Nishant Suneja wrote:
>     Hey Timothy
>     
>     So, I went through the documentation of gmock, and got a better understanding of the magic MACROS.
>     
>     That said, I still have 2 questions:
>     
>     1) During a unit test run, where we instantiate a master and a slave, how many processes are we actually dealing with ? 
>        Logic says that we have 1 test case process, and its 2 child processes (slave and master).
>        I ask this because it seems that we are sharing the containerizer object between test case processs and its child (slave) process.
>        Also, the callback in the ACTION part of the EXPECT_CALL, I believe that its been invoked in the context of the slave process.
>        Just wanted to clarify if I am on the correct page.
>        
>     2) So, if the above statement is correct, in the code below, we seem to have a race. We would expect this test case to pass with our
>        new code. However, in my patch, the call to delay() happens in onReady() method of the launch future object, which happens
>        AFTER invocation of containerizer::launch().
>        Now, its very well possible, that we pause and advance the clock in the parent test case process, before the delay() 
>        gets called in the slave process, in which case the timer wont fire at all.
>        
>     3) Which brings me to my third question, as to are we sharing the same clock across these 3 processes ?
>        Clock class (for eg.) is packaged as part of libprocess library, which can be shared among these 3 processes,
>        but there would be separate copies of static/global data of this class for the 3 processes (only the code segment should be shared).
>        So, will the timer set by one process, be visible to another process at all ?
>        
>        
>        CODE:
>     
>        Future<Nothing> launch;
>        EXPECT_CALL(containerizer, launch(_, _, _, _, _, _, _))
>         .WillOnce(DoAll(FutureSatisfy(&launch),
>                         Return(true)));
>     
>       //future object for call to "registerExecutorTimeout"
>       Future<Nothing> executorRegistrationTimeout =
>         FUTURE_DISPATCH(_, &Slave::registerExecutorTimeout);
>     
>       driver.launchTasks(offers.get()[0].id(), tasks);
>     
>       AWAIT_READY(launch);
>      
>       Clock::pause();
>     
>       Clock::advance(flags.executor_registration_timeout);
>     
>       //ensure that the "registerExecutorTimeout" HAS fired
>       ASSERT_TRUE(executorRegistrationTimeout.isReady());
>     
>       Clock::resume();

Ok..I have updated the diff with the testcase.

I was able to answer my questions. As for question-1), For unit tests, we are running all the components in the same process, rather than spawning one process per component. Essentially, we seem to be relying on a single thread event driven kind of mechanism for unit tests.
So, my question 3) becomes irrelevant.

For question-2, the trick was to use two futures (a future and a promise), one to track the call of the launch() method, and the 2nd one to trigger the registration timeout.


Additionally, I have moved the delay() call from onReady() callback to onAny() callback, since we want the timer to be set, even if the future fails or is dicarded.


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 31, 2014, 3:01 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 3:01 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onAny() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::AsynchronousExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.
> 
> Nishant Suneja wrote:
>     Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
>     However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).
>     
>     So, again, any ideas ?
> 
> Timothy Chen wrote:
>     The easiest way is to simply advance the clock (Clock::advance(....)) until the timeout, and verify that the delay call is not yet called. Then you satisfy the future and advance the time again and verify it is called.
> 
> Nishant Suneja wrote:
>     Ok...And whats the best way to verify that the "Slave::registerExecutorTimeout" has been called (after advancing the clock). I could check the state of the executor, and verify that its TERMINATING, which would mean that this method has executed.
>     
>     Any other cleaner way to verify this callback ? I believe that EXPECT_CALL macro only works for classes with MOCK_METHOD set of methods defined. So, I cant use that.

One way to verify this is to use FUTURE_DISPATCH (you can search for examples in existing test). FUTURE_DISPATCH returns a future, which becomes ready when the method provided is dispatched (which means queued to serve in libprocess). Delay shouldn't queue it until the timeout happen.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.
> 
> Nishant Suneja wrote:
>     Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
>     However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).
>     
>     So, again, any ideas ?
> 
> Timothy Chen wrote:
>     The easiest way is to simply advance the clock (Clock::advance(....)) until the timeout, and verify that the delay call is not yet called. Then you satisfy the future and advance the time again and verify it is called.
> 
> Nishant Suneja wrote:
>     Ok...And whats the best way to verify that the "Slave::registerExecutorTimeout" has been called (after advancing the clock). I could check the state of the executor, and verify that its TERMINATING, which would mean that this method has executed.
>     
>     Any other cleaner way to verify this callback ? I believe that EXPECT_CALL macro only works for classes with MOCK_METHOD set of methods defined. So, I cant use that.
> 
> Timothy Chen wrote:
>     One way to verify this is to use FUTURE_DISPATCH (you can search for examples in existing test). FUTURE_DISPATCH returns a future, which becomes ready when the method provided is dispatched (which means queued to serve in libprocess). Delay shouldn't queue it until the timeout happen.

Yeah.. FUTURE_DISPATCH should do the job.
However, a question about the failure testcase. When we invoke Clock::advance() by a certain duration, any registered timer object which falls within that duration gets
called. For 2nd test, I can use the AWAIT_READY() macro on the future, and then advancing the clock should result in our "registerExecutorTimeout" being called, which is good.

Although, for the 1st test case, before advancing the clock, I would want to make sure that the launch() function has actually returned, and our current code triggers the timer (otherwise, I would be advancing the clock, when no timer object has actually been set, so obviously nothing will get fired by advancing the clock).
EXPECT_CALL gives us a handle to that Future object, but how do I determine that launch() method has returned (still in pending state though), so i can advance the clock ? i.e. on what condition should I wait here?


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.
> 
> Nishant Suneja wrote:
>     Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
>     However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).
>     
>     So, again, any ideas ?
> 
> Timothy Chen wrote:
>     The easiest way is to simply advance the clock (Clock::advance(....)) until the timeout, and verify that the delay call is not yet called. Then you satisfy the future and advance the time again and verify it is called.
> 
> Nishant Suneja wrote:
>     Ok...And whats the best way to verify that the "Slave::registerExecutorTimeout" has been called (after advancing the clock). I could check the state of the executor, and verify that its TERMINATING, which would mean that this method has executed.
>     
>     Any other cleaner way to verify this callback ? I believe that EXPECT_CALL macro only works for classes with MOCK_METHOD set of methods defined. So, I cant use that.
> 
> Timothy Chen wrote:
>     One way to verify this is to use FUTURE_DISPATCH (you can search for examples in existing test). FUTURE_DISPATCH returns a future, which becomes ready when the method provided is dispatched (which means queued to serve in libprocess). Delay shouldn't queue it until the timeout happen.
> 
> Nishant Suneja wrote:
>     Yeah.. FUTURE_DISPATCH should do the job.
>     However, a question about the failure testcase. When we invoke Clock::advance() by a certain duration, any registered timer object which falls within that duration gets
>     called. For 2nd test, I can use the AWAIT_READY() macro on the future, and then advancing the clock should result in our "registerExecutorTimeout" being called, which is good.
>     
>     Although, for the 1st test case, before advancing the clock, I would want to make sure that the launch() function has actually returned, and our current code triggers the timer (otherwise, I would be advancing the clock, when no timer object has actually been set, so obviously nothing will get fired by advancing the clock).
>     EXPECT_CALL gives us a handle to that Future object, but how do I determine that launch() method has returned (still in pending state though), so i can advance the clock ? i.e. on what condition should I wait here?

EXPECT_CALL is called when the method is invoked, and whatever action you chain it with (.WillOnce(Returns(....)) is called immediately, so for example if you want to wait on the call, you can use EXPECT_CALL(containerizer, launch(_, _, _......)).WillOnce(FutureSatisify(&future)) and then AWAIT_READY(future), which the EXPECT_CALL will set the future state to be ready when the launch method is called.

I think what I would do though is to create a promise, and return that promise's future instead:

Promise<Nothing> promise;
EXPECT_CALL(containerizer, launch(_, _, _,....))
  .WillOnce(Return(promise.future()));
  
And then Clock::pause(), Clock::advance(timeout), and verify that the executor timeout is not ready ASSERT_TRUE(timeoutFuture.isPending())

then satisfy the promise so the future is set to ready:

promise.set(Nothing())

After wards advance the clock and verify that it is ready.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?

I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
I cannot access, since they are local variables, and NOT the member variables of any class.

Any ideas as to how do I get around it, without introducting any extra member variables ?


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.
> 
> Nishant Suneja wrote:
>     Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
>     However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).
>     
>     So, again, any ideas ?

The easiest way is to simply advance the clock (Clock::advance(....)) until the timeout, and verify that the delay call is not yet called. Then you satisfy the future and advance the time again and verify it is called.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.
> 
> Nishant Suneja wrote:
>     Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
>     However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).
>     
>     So, again, any ideas ?
> 
> Timothy Chen wrote:
>     The easiest way is to simply advance the clock (Clock::advance(....)) until the timeout, and verify that the delay call is not yet called. Then you satisfy the future and advance the time again and verify it is called.
> 
> Nishant Suneja wrote:
>     Ok...And whats the best way to verify that the "Slave::registerExecutorTimeout" has been called (after advancing the clock). I could check the state of the executor, and verify that its TERMINATING, which would mean that this method has executed.
>     
>     Any other cleaner way to verify this callback ? I believe that EXPECT_CALL macro only works for classes with MOCK_METHOD set of methods defined. So, I cant use that.
> 
> Timothy Chen wrote:
>     One way to verify this is to use FUTURE_DISPATCH (you can search for examples in existing test). FUTURE_DISPATCH returns a future, which becomes ready when the method provided is dispatched (which means queued to serve in libprocess). Delay shouldn't queue it until the timeout happen.
> 
> Nishant Suneja wrote:
>     Yeah.. FUTURE_DISPATCH should do the job.
>     However, a question about the failure testcase. When we invoke Clock::advance() by a certain duration, any registered timer object which falls within that duration gets
>     called. For 2nd test, I can use the AWAIT_READY() macro on the future, and then advancing the clock should result in our "registerExecutorTimeout" being called, which is good.
>     
>     Although, for the 1st test case, before advancing the clock, I would want to make sure that the launch() function has actually returned, and our current code triggers the timer (otherwise, I would be advancing the clock, when no timer object has actually been set, so obviously nothing will get fired by advancing the clock).
>     EXPECT_CALL gives us a handle to that Future object, but how do I determine that launch() method has returned (still in pending state though), so i can advance the clock ? i.e. on what condition should I wait here?
> 
> Timothy Chen wrote:
>     EXPECT_CALL is called when the method is invoked, and whatever action you chain it with (.WillOnce(Returns(....)) is called immediately, so for example if you want to wait on the call, you can use EXPECT_CALL(containerizer, launch(_, _, _......)).WillOnce(FutureSatisify(&future)) and then AWAIT_READY(future), which the EXPECT_CALL will set the future state to be ready when the launch method is called.
>     
>     I think what I would do though is to create a promise, and return that promise's future instead:
>     
>     Promise<Nothing> promise;
>     EXPECT_CALL(containerizer, launch(_, _, _,....))
>       .WillOnce(Return(promise.future()));
>       
>     And then Clock::pause(), Clock::advance(timeout), and verify that the executor timeout is not ready ASSERT_TRUE(timeoutFuture.isPending())
>     
>     then satisfy the promise so the future is set to ready:
>     
>     promise.set(Nothing())
>     
>     After wards advance the clock and verify that it is ready.

Hey Timothy

So, I went through the documentation of gmock, and got a better understanding of the magic MACROS.

That said, I still have 2 questions:

1) During a unit test run, where we instantiate a master and a slave, how many processes are we actually dealing with ? 
   Logic says that we have 1 test case process, and its 2 child processes (slave and master).
   I ask this because it seems that we are sharing the containerizer object between test case processs and its child (slave) process.
   Also, the callback in the ACTION part of the EXPECT_CALL, I believe that its been invoked in the context of the slave process.
   Just wanted to clarify if I am on the correct page.
   
2) So, if the above statement is correct, in the code below, we seem to have a race. We would expect this test case to pass with our
   new code. However, in my patch, the call to delay() happens in onReady() method of the launch future object, which happens
   AFTER invocation of containerizer::launch().
   Now, its very well possible, that we pause and advance the clock in the parent test case process, before the delay() 
   gets called in the slave process, in which case the timer wont fire at all.
   
3) Which brings me to my third question, as to are we sharing the same clock across these 3 processes ?
   Clock class (for eg.) is packaged as part of libprocess library, which can be shared among these 3 processes,
   but there would be separate copies of static/global data of this class for the 3 processes (only the code segment should be shared).
   So, will the timer set by one process, be visible to another process at all ?
   
   
   CODE:

   Future<Nothing> launch;
   EXPECT_CALL(containerizer, launch(_, _, _, _, _, _, _))
    .WillOnce(DoAll(FutureSatisfy(&launch),
                    Return(true)));

  //future object for call to "registerExecutorTimeout"
  Future<Nothing> executorRegistrationTimeout =
    FUTURE_DISPATCH(_, &Slave::registerExecutorTimeout);

  driver.launchTasks(offers.get()[0].id(), tasks);

  AWAIT_READY(launch);
 
  Clock::pause();

  Clock::advance(flags.executor_registration_timeout);

  //ensure that the "registerExecutorTimeout" HAS fired
  ASSERT_TRUE(executorRegistrationTimeout.isReady());

  Clock::resume();


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.
> 
> Nishant Suneja wrote:
>     Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
>     However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).
>     
>     So, again, any ideas ?
> 
> Timothy Chen wrote:
>     The easiest way is to simply advance the clock (Clock::advance(....)) until the timeout, and verify that the delay call is not yet called. Then you satisfy the future and advance the time again and verify it is called.

Ok...And whats the best way to verify that the "Slave::registerExecutorTimeout" has been called (after advancing the clock). I could check the state of the executor, and verify that its TERMINATING, which would mean that this method has executed.

Any other cleaner way to verify this callback ? I believe that EXPECT_CALL macro only works for classes with MOCK_METHOD set of methods defined. So, I cant use that.


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.
> 
> Nishant Suneja wrote:
>     Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
>     However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).
>     
>     So, again, any ideas ?
> 
> Timothy Chen wrote:
>     The easiest way is to simply advance the clock (Clock::advance(....)) until the timeout, and verify that the delay call is not yet called. Then you satisfy the future and advance the time again and verify it is called.
> 
> Nishant Suneja wrote:
>     Ok...And whats the best way to verify that the "Slave::registerExecutorTimeout" has been called (after advancing the clock). I could check the state of the executor, and verify that its TERMINATING, which would mean that this method has executed.
>     
>     Any other cleaner way to verify this callback ? I believe that EXPECT_CALL macro only works for classes with MOCK_METHOD set of methods defined. So, I cant use that.
> 
> Timothy Chen wrote:
>     One way to verify this is to use FUTURE_DISPATCH (you can search for examples in existing test). FUTURE_DISPATCH returns a future, which becomes ready when the method provided is dispatched (which means queued to serve in libprocess). Delay shouldn't queue it until the timeout happen.
> 
> Nishant Suneja wrote:
>     Yeah.. FUTURE_DISPATCH should do the job.
>     However, a question about the failure testcase. When we invoke Clock::advance() by a certain duration, any registered timer object which falls within that duration gets
>     called. For 2nd test, I can use the AWAIT_READY() macro on the future, and then advancing the clock should result in our "registerExecutorTimeout" being called, which is good.
>     
>     Although, for the 1st test case, before advancing the clock, I would want to make sure that the launch() function has actually returned, and our current code triggers the timer (otherwise, I would be advancing the clock, when no timer object has actually been set, so obviously nothing will get fired by advancing the clock).
>     EXPECT_CALL gives us a handle to that Future object, but how do I determine that launch() method has returned (still in pending state though), so i can advance the clock ? i.e. on what condition should I wait here?
> 
> Timothy Chen wrote:
>     EXPECT_CALL is called when the method is invoked, and whatever action you chain it with (.WillOnce(Returns(....)) is called immediately, so for example if you want to wait on the call, you can use EXPECT_CALL(containerizer, launch(_, _, _......)).WillOnce(FutureSatisify(&future)) and then AWAIT_READY(future), which the EXPECT_CALL will set the future state to be ready when the launch method is called.
>     
>     I think what I would do though is to create a promise, and return that promise's future instead:
>     
>     Promise<Nothing> promise;
>     EXPECT_CALL(containerizer, launch(_, _, _,....))
>       .WillOnce(Return(promise.future()));
>       
>     And then Clock::pause(), Clock::advance(timeout), and verify that the executor timeout is not ready ASSERT_TRUE(timeoutFuture.isPending())
>     
>     then satisfy the promise so the future is set to ready:
>     
>     promise.set(Nothing())
>     
>     After wards advance the clock and verify that it is ready.
> 
> Nishant Suneja wrote:
>     Hey Timothy
>     
>     So, I went through the documentation of gmock, and got a better understanding of the magic MACROS.
>     
>     That said, I still have 2 questions:
>     
>     1) During a unit test run, where we instantiate a master and a slave, how many processes are we actually dealing with ? 
>        Logic says that we have 1 test case process, and its 2 child processes (slave and master).
>        I ask this because it seems that we are sharing the containerizer object between test case processs and its child (slave) process.
>        Also, the callback in the ACTION part of the EXPECT_CALL, I believe that its been invoked in the context of the slave process.
>        Just wanted to clarify if I am on the correct page.
>        
>     2) So, if the above statement is correct, in the code below, we seem to have a race. We would expect this test case to pass with our
>        new code. However, in my patch, the call to delay() happens in onReady() method of the launch future object, which happens
>        AFTER invocation of containerizer::launch().
>        Now, its very well possible, that we pause and advance the clock in the parent test case process, before the delay() 
>        gets called in the slave process, in which case the timer wont fire at all.
>        
>     3) Which brings me to my third question, as to are we sharing the same clock across these 3 processes ?
>        Clock class (for eg.) is packaged as part of libprocess library, which can be shared among these 3 processes,
>        but there would be separate copies of static/global data of this class for the 3 processes (only the code segment should be shared).
>        So, will the timer set by one process, be visible to another process at all ?
>        
>        
>        CODE:
>     
>        Future<Nothing> launch;
>        EXPECT_CALL(containerizer, launch(_, _, _, _, _, _, _))
>         .WillOnce(DoAll(FutureSatisfy(&launch),
>                         Return(true)));
>     
>       //future object for call to "registerExecutorTimeout"
>       Future<Nothing> executorRegistrationTimeout =
>         FUTURE_DISPATCH(_, &Slave::registerExecutorTimeout);
>     
>       driver.launchTasks(offers.get()[0].id(), tasks);
>     
>       AWAIT_READY(launch);
>      
>       Clock::pause();
>     
>       Clock::advance(flags.executor_registration_timeout);
>     
>       //ensure that the "registerExecutorTimeout" HAS fired
>       ASSERT_TRUE(executorRegistrationTimeout.isReady());
>     
>       Clock::resume();
> 
> Nishant Suneja wrote:
>     Ok..I have updated the diff with the testcase.
>     
>     I was able to answer my questions. As for question-1), For unit tests, we are running all the components in the same process, rather than spawning one process per component. Essentially, we seem to be relying on a single thread event driven kind of mechanism for unit tests.
>     So, my question 3) becomes irrelevant.
>     
>     For question-2, the trick was to use two futures (a future and a promise), one to track the call of the launch() method, and the 2nd one to trigger the registration timeout.
>     
>     
>     Additionally, I have moved the delay() call from onReady() callback to onAny() callback, since we want the timer to be set, even if the future fails or is dicarded.

Hi Nishant, sorry didn't get time to reply you. Glad you figured out your questions.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 31, 2014, 5:11 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 5:11 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onAny() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::AsynchronousExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?
> 
> Timothy Chen wrote:
>     you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.

Thats a neat way to get hold of the future object. So, for testing I have to basically verify that the timer object for Slave's "registerExecutor" is not created, till we reach the READY state of the future object. Now, I could have done something crazy like iterate the map of timer objects in the Clock class, and verify that we dont have this timer object till we are in READY state of the future object.
However, this map is a defined as static in clock.cpp, so I wouldnt be able to access it in the testcase, without making the map global (not a good idea).

So, again, any ideas ?


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Dec. 26, 2014, 9:11 p.m., Timothy Chen wrote:
> > Can you add a unit test?
> 
> Nishant Suneja wrote:
>     I wanted to add one, but I would need access to the "Future" object for the container launch, or "Timer" object for registration timeout trigger, both of which
>     I cannot access, since they are local variables, and NOT the member variables of any class.
>     
>     Any ideas as to how do I get around it, without introducting any extra member variables ?

you can get hold of the future by passing in a mock containerizzer and perform a EXPECT_CALL(comtainerizer, launch(.....)) and return a future that you instantiate in test.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66171
-----------------------------------------------------------


Can you add a unit test?

- Timothy Chen


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66170
-----------------------------------------------------------


Patch looks great!

Reviews applied: [29437]

All tests passed.

- Mesos ReviewBot


On Dec. 26, 2014, 6:29 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2014, 6:29 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onReady() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Dec. 31, 2014, 6:12 a.m., Timothy Chen wrote:
> > src/tests/slave_tests.cpp, line 1649
> > <https://reviews.apache.org/r/29437/diff/3/?file=804508#file804508line1649>
> >
> >     Why settle?

I settle here to ensure that we have invoked and finished "Slave::executorLaunched" method, because this method does the actual registration timeout scheduling.
Note the defer() callback in onAny() method of the launch future, in Framework::launchExecutor. Since, execution of onAny() method just enqueues, the "executorLaunched" method in ProcessManager, we would want to settle here.


On Dec. 31, 2014, 6:12 a.m., Nishant Suneja wrote:
> > Have you looked at all the other tests that refers to executor_registration_timeout, and made sure they all passed as well? 
> > Since timing is a pretty sensitive for mesos test, I suggest you also try to run all the tests on repeat (./bin/mesos-tests.sh --gtest_break_on_failure --gtest_repeat=100 --verbose) and make sure all tests still pass.

Yeah.. I did that.


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66447
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66447
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/29437/#comment109997>

    We usually don't comment what the code logic is doing, as it's self explanatory here. What we usually put in the comments is why, can you put down the reason why we start the timer after the launch future is set?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment109998>

    It's unclear what asynchronous here means. It seems originally the delay(timeout) is happening async already, it's just when the delay is actually scheduled. I think just mentioning that the test is checking the timeout happens after the containerizer launch future completes, and remove the Asynchronous in the command and the test name.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110001>

    We are starting to move away from the space between ending brackets. So please remove the extra space:
    
    Try<PID<Master>>



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110000>

    About commenting style, all comments needs to end with period, within 70 char width and have a space in the beginning.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110002>

    There should be only one wait call as well, let's use WillOnce to verify the right behavior.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment109999>

    Recover should only get called once right? (.WillOnce)



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110003>

    Fix comment style mentioned above.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110004>

    Fix comment style mentioned above.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110005>

    Fix comment style mentioned above, and everywhere else.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110006>

    We typically name the future the same name as the function, AWAIT_READY(launch).



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110007>

    Why settle here? seems unncessary.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110008>

    Why settle?



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110009>

    Seems pretty self-explanatory?


Have you looked at all the other tests that refers to executor_registration_timeout, and made sure they all passed as well? 
Since timing is a pretty sensitive for mesos test, I suggest you also try to run all the tests on repeat (./bin/mesos-tests.sh --gtest_break_on_failure --gtest_repeat=100 --verbose) and make sure all tests still pass.

- Timothy Chen


On Dec. 31, 2014, 5:11 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 5:11 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onAny() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::AsynchronousExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66448
-----------------------------------------------------------


Patch looks great!

Reviews applied: [29437]

All tests passed.

- Mesos ReviewBot


On Dec. 31, 2014, 5:11 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 5:11 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
> when the onAny() callback is made for the container's future object, instead of starting the
> timer synchronously when the launchExecutor() method of the Framework class is invoked.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::AsynchronousExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66494
-----------------------------------------------------------


Patch looks great!

Reviews applied: [29437]

All tests passed.

- Mesos ReviewBot


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review67005
-----------------------------------------------------------


Patch looks great!

Reviews applied: [29437]

All tests passed.

- Mesos ReviewBot


On Jan. 7, 2015, 10:24 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 7, 2015, 10:24 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1ab 
>   src/slave/constants.cpp 2a99b11 
>   src/slave/flags.hpp 670997d 
>   src/slave/slave.hpp 70bd8c1 
>   src/slave/slave.cpp 50b5781 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36 
>   src/tests/containerizer.hpp 24b014f 
>   src/tests/slave_tests.cpp f2896a1 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Jiang Yan Xu <ya...@jxu.me>.

> On July 14, 2015, 10:21 a.m., Timothy Chen wrote:
> > Nishant are you still around to help rebase this? Sorry I think we dropped this review somehow.

He responded on https://issues.apache.org/jira/browse/MESOS-999. I did some work on it but ended up not pushing it. Would you like to comment on the ticket?


- Jiang Yan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review91637
-----------------------------------------------------------


On Jan. 7, 2015, 11:54 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 7, 2015, 11:54 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/constants.cpp 2a99b1105af0e2d62b956fa96988f477937a46bd 
>   src/slave/flags.hpp 670997dc3a702cd5edf33f2e5824c5e4dfe4ecef 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp c50cbc799d4793243f184f9fe628b69a020adc66 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorNoLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review91637
-----------------------------------------------------------


Nishant are you still around to help rebase this? Sorry I think we dropped this review somehow.

- Timothy Chen


On Jan. 8, 2015, 7:54 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 8, 2015, 7:54 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/constants.cpp 2a99b1105af0e2d62b956fa96988f477937a46bd 
>   src/slave/flags.hpp 670997dc3a702cd5edf33f2e5824c5e4dfe4ecef 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp c50cbc799d4793243f184f9fe628b69a020adc66 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorNoLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review67179
-----------------------------------------------------------


Patch looks great!

Reviews applied: [29437]

All tests passed.

- Mesos ReviewBot


On Jan. 8, 2015, 7:54 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 8, 2015, 7:54 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/constants.cpp 2a99b1105af0e2d62b956fa96988f477937a46bd 
>   src/slave/flags.hpp 670997dc3a702cd5edf33f2e5824c5e4dfe4ecef 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp c50cbc799d4793243f184f9fe628b69a020adc66 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorNoLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Niklas Nielsen <ni...@qni.dk>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review87245
-----------------------------------------------------------


@tnachen: What is blocked this? Looks like it got a ship it, but stopped review-cycles somehow?

- Niklas Nielsen


On Jan. 7, 2015, 11:54 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 7, 2015, 11:54 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/constants.cpp 2a99b1105af0e2d62b956fa96988f477937a46bd 
>   src/slave/flags.hpp 670997dc3a702cd5edf33f2e5824c5e4dfe4ecef 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp c50cbc799d4793243f184f9fe628b69a020adc66 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorNoLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/
-----------------------------------------------------------

(Updated Jan. 8, 2015, 7:54 a.m.)


Review request for mesos and Timothy Chen.


Changes
-------

Adding a unit test to check that the launch timer gets cancelled, if the container launches successfully.


Bugs: MESOS-999
    https://issues.apache.org/jira/browse/MESOS-999


Repository: mesos-git


Description
-------

As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.

Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.


Diffs (updated)
-----

  src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
  src/slave/constants.cpp 2a99b1105af0e2d62b956fa96988f477937a46bd 
  src/slave/flags.hpp 670997dc3a702cd5edf33f2e5824c5e4dfe4ecef 
  src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
  src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
  src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
  src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
  src/tests/slave_tests.cpp c50cbc799d4793243f184f9fe628b69a020adc66 

Diff: https://reviews.apache.org/r/29437/diff/


Testing (updated)
-------

Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
Added the unit test : SlaveTest:: ExecutorNoLaunchTimeoutTrigger
make check succeeds.


Thanks,

Nishant Suneja


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/
-----------------------------------------------------------

(Updated Jan. 7, 2015, 10:24 a.m.)


Review request for mesos and Timothy Chen.


Changes
-------

The previous patch update resulted in a mesos test failure, unrelated to this patch. Re-trying the same patch.
All the tests run fine on my local machine.


Bugs: MESOS-999
    https://issues.apache.org/jira/browse/MESOS-999


Repository: mesos-git


Description
-------

As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.

Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.


Diffs (updated)
-----

  src/slave/constants.hpp fd1c1ab 
  src/slave/constants.cpp 2a99b11 
  src/slave/flags.hpp 670997d 
  src/slave/slave.hpp 70bd8c1 
  src/slave/slave.cpp 50b5781 
  src/tests/composing_containerizer_tests.cpp 5ab5a36 
  src/tests/containerizer.hpp 24b014f 
  src/tests/slave_tests.cpp f2896a1 

Diff: https://reviews.apache.org/r/29437/diff/


Testing
-------

Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
make check succeeds.


Thanks,

Nishant Suneja


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/
-----------------------------------------------------------

(Updated Jan. 7, 2015, 7:55 a.m.)


Review request for mesos and Timothy Chen.


Changes
-------

Implementing Timothy's review comments.


Bugs: MESOS-999
    https://issues.apache.org/jira/browse/MESOS-999


Repository: mesos-git


Description
-------

As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.

Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.


Diffs (updated)
-----

  src/slave/constants.hpp fd1c1ab 
  src/slave/constants.cpp 2a99b11 
  src/slave/flags.hpp 670997d 
  src/slave/slave.hpp 70bd8c1 
  src/slave/slave.cpp 50b5781 
  src/tests/composing_containerizer_tests.cpp 5ab5a36 
  src/tests/containerizer.hpp 24b014f 
  src/tests/slave_tests.cpp f2896a1 

Diff: https://reviews.apache.org/r/29437/diff/


Testing
-------

Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
make check succeeds.


Thanks,

Nishant Suneja


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Jan. 7, 2015, 7:15 a.m., Timothy Chen wrote:
> > Hi Nishant, thanks for the updated review. Just FYI, our preferred patches are actually small and logical commits. So in your example, is one patch just for executor registration timeout, and another review for executor launch timeout. And you can depend one review on top of another. The larger the patch the harder it is to accept and commit it.

Thanks for the update Timothy. I will keep that in mind from the next patch onwards.
Also, any thoughs on my earlier comment, in case you missed it:


"Yeah.. I have implemented that.
So, I was writing a testcase to verify the container launch timeout logic, which has 2 conditions to be tested::

a) container launch fails, and timeout action triggers.
b) container launch happens, and timeout action is cancelled.

I wanted to test both of them in the same testcase. However, once the "launchExecutor()" call has been made for the default executor testing the first case, 2nd case doesnt call the "launchExecutor()" because the framework already has an in-memory pointer for it, and skips the launch.

What would be the easiest way to trigger launchExecutor() the 2nd time ? I wanted to get a handle of the Framework structure, and call destoryExecutor(), before launching the tasks again, but couldnt find an easy way to get the handle."


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66990
-----------------------------------------------------------


On Jan. 7, 2015, 6:41 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 7, 2015, 6:41 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1ab 
>   src/slave/constants.cpp 2a99b11 
>   src/slave/flags.hpp 670997d 
>   src/slave/slave.hpp 70bd8c1 
>   src/slave/slave.cpp 50b5781 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36 
>   src/tests/containerizer.hpp 24b014f 
>   src/tests/slave_tests.cpp f2896a1 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Jan. 7, 2015, 7:15 a.m., Timothy Chen wrote:
> > Hi Nishant, thanks for the updated review. Just FYI, our preferred patches are actually small and logical commits. So in your example, is one patch just for executor registration timeout, and another review for executor launch timeout. And you can depend one review on top of another. The larger the patch the harder it is to accept and commit it.
> 
> Nishant Suneja wrote:
>     Thanks for the update Timothy. I will keep that in mind from the next patch onwards.
>     Also, any thoughs on my earlier comment, in case you missed it:
>     
>     
>     "Yeah.. I have implemented that.
>     So, I was writing a testcase to verify the container launch timeout logic, which has 2 conditions to be tested::
>     
>     a) container launch fails, and timeout action triggers.
>     b) container launch happens, and timeout action is cancelled.
>     
>     I wanted to test both of them in the same testcase. However, once the "launchExecutor()" call has been made for the default executor testing the first case, 2nd case doesnt call the "launchExecutor()" because the framework already has an in-memory pointer for it, and skips the launch.
>     
>     What would be the easiest way to trigger launchExecutor() the 2nd time ? I wanted to get a handle of the Framework structure, and call destoryExecutor(), before launching the tasks again, but couldnt find an easy way to get the handle."

Hey Timothy

Since this patch became a little too big, I have broken down this patch into 2 separate patches: 29718 and 29720.


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66990
-----------------------------------------------------------


On Jan. 8, 2015, 7:54 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 8, 2015, 7:54 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/constants.cpp 2a99b1105af0e2d62b956fa96988f477937a46bd 
>   src/slave/flags.hpp 670997dc3a702cd5edf33f2e5824c5e4dfe4ecef 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp c50cbc799d4793243f184f9fe628b69a020adc66 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorNoLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66990
-----------------------------------------------------------


Hi Nishant, thanks for the updated review. Just FYI, our preferred patches are actually small and logical commits. So in your example, is one patch just for executor registration timeout, and another review for executor launch timeout. And you can depend one review on top of another. The larger the patch the harder it is to accept and commit it.

- Timothy Chen


On Jan. 7, 2015, 6:41 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 7, 2015, 6:41 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1ab 
>   src/slave/constants.cpp 2a99b11 
>   src/slave/flags.hpp 670997d 
>   src/slave/slave.hpp 70bd8c1 
>   src/slave/slave.cpp 50b5781 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36 
>   src/tests/containerizer.hpp 24b014f 
>   src/tests/slave_tests.cpp f2896a1 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Jan. 7, 2015, 7:25 a.m., Timothy Chen wrote:
> > src/tests/slave_tests.cpp, line 1694
> > <https://reviews.apache.org/r/29437/diff/5/?file=808529#file808529line1694>
> >
> >     This shouldn't be called at all right? Remove if it's not expected.

The ShutDown() routine gets the list of containers from the containerizer, where this call is invoked.


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66992
-----------------------------------------------------------


On Jan. 7, 2015, 7:55 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 7, 2015, 7:55 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1ab 
>   src/slave/constants.cpp 2a99b11 
>   src/slave/flags.hpp 670997d 
>   src/slave/slave.hpp 70bd8c1 
>   src/slave/slave.cpp 50b5781 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36 
>   src/tests/containerizer.hpp 24b014f 
>   src/tests/slave_tests.cpp f2896a1 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66992
-----------------------------------------------------------



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110796>

    Recover should only happen once.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110797>

    This shouldn't be called at all right? Remove if it's not expected.



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110794>

    doesnt -> doesn't. Let's make the comment style consistent as well.
    Please capitalize the first letter of the sentence.
    
    // Set the.....



src/tests/slave_tests.cpp
<https://reviews.apache.org/r/29437/#comment110795>

    You can ignore the .Times(1) as it's the default


- Timothy Chen


On Jan. 7, 2015, 6:41 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 7, 2015, 6:41 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1ab 
>   src/slave/constants.cpp 2a99b11 
>   src/slave/flags.hpp 670997d 
>   src/slave/slave.hpp 70bd8c1 
>   src/slave/slave.cpp 50b5781 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36 
>   src/tests/containerizer.hpp 24b014f 
>   src/tests/slave_tests.cpp f2896a1 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66993
-----------------------------------------------------------


Patch looks great!

Reviews applied: [29437]

All tests passed.

- Mesos ReviewBot


On Jan. 7, 2015, 6:41 a.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Jan. 7, 2015, 6:41 a.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.
> 
> Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp fd1c1ab 
>   src/slave/constants.cpp 2a99b11 
>   src/slave/flags.hpp 670997d 
>   src/slave/slave.hpp 70bd8c1 
>   src/slave/slave.cpp 50b5781 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36 
>   src/tests/containerizer.hpp 24b014f 
>   src/tests/slave_tests.cpp f2896a1 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/
-----------------------------------------------------------

(Updated Jan. 7, 2015, 6:41 a.m.)


Review request for mesos and Timothy Chen.


Changes
-------

As per discussion with Timothy, adding the "executor container launch timeout" timer.


Bugs: MESOS-999
    https://issues.apache.org/jira/browse/MESOS-999


Repository: mesos-git


Description (updated)
-------

As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending.

Also, a new executor launch timer has been added. This timer gates the time in which a successful executor container launch should happen. The executor registration timer starts after the successful container launch.


Diffs (updated)
-----

  src/slave/constants.hpp fd1c1ab 
  src/slave/constants.cpp 2a99b11 
  src/slave/flags.hpp 670997d 
  src/slave/slave.hpp 70bd8c1 
  src/slave/slave.cpp 50b5781 
  src/tests/composing_containerizer_tests.cpp 5ab5a36 
  src/tests/containerizer.hpp 24b014f 
  src/tests/slave_tests.cpp f2896a1 

Diff: https://reviews.apache.org/r/29437/diff/


Testing (updated)
-------

Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
Added the unit test : SlaveTest:: ExecutorLaunchTimeoutTrigger
make check succeeds.


Thanks,

Nishant Suneja


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Jan. 2, 2015, 10:23 p.m., Ben Mahler wrote:
> > What happened previously if a launch takes forever?
> > 
> > What happens now if a launch takes forever?
> 
> Timothy Chen wrote:
>     Ben that's a good point, previously the registration timeout gates the launch not to take forever, however it is too coarse of a timeout as it included the time to launch and also the time it take for the executor to register after launch.
>     
>     Now it will only time out on the latter case, I think we do need a seperate timeout that checks for the launch itself. Ben what do you think?
> 
> Nishant Suneja wrote:
>     I concur with Timothy. I think having a separate timeout to track the container launch itself makes sense.
> 
> Timothy Chen wrote:
>     Hi Nishant, can you also add another timer for gating the launch? Once we have both PRs then I think we are in a better state to merge both.
> 
> Tom Arnfeld wrote:
>     Just caught wind of this change and it's good to see! Going to be very useful over here. Can I request that there's also a command line flag for the slave to change the launch timeout? :-)

@Timothy: For launch timeout, if the launch doesnt happen within the specified timeout duration, what action would we want to take ? We could possibly use the same callback "registerExecutorTimeout", which effectively moves the executor to TERMINATING state (for gc later on) and calls the containerizer destroy for that container.

Also, what default value would we want for this timeout ?


@Tom: yeah..I think its a good idea. I can add that.


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Jan. 2, 2015, 10:23 p.m., Ben Mahler wrote:
> > What happened previously if a launch takes forever?
> > 
> > What happens now if a launch takes forever?
> 
> Timothy Chen wrote:
>     Ben that's a good point, previously the registration timeout gates the launch not to take forever, however it is too coarse of a timeout as it included the time to launch and also the time it take for the executor to register after launch.
>     
>     Now it will only time out on the latter case, I think we do need a seperate timeout that checks for the launch itself. Ben what do you think?

I concur with Timothy. I think having a separate timeout to track the container launch itself makes sense.


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Tom Arnfeld <ta...@me.com>.

> On Jan. 2, 2015, 10:23 p.m., Ben Mahler wrote:
> > What happened previously if a launch takes forever?
> > 
> > What happens now if a launch takes forever?
> 
> Timothy Chen wrote:
>     Ben that's a good point, previously the registration timeout gates the launch not to take forever, however it is too coarse of a timeout as it included the time to launch and also the time it take for the executor to register after launch.
>     
>     Now it will only time out on the latter case, I think we do need a seperate timeout that checks for the launch itself. Ben what do you think?
> 
> Nishant Suneja wrote:
>     I concur with Timothy. I think having a separate timeout to track the container launch itself makes sense.
> 
> Timothy Chen wrote:
>     Hi Nishant, can you also add another timer for gating the launch? Once we have both PRs then I think we are in a better state to merge both.

Just caught wind of this change and it's good to see! Going to be very useful over here. Can I request that there's also a command line flag for the slave to change the launch timeout? :-)


- Tom


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Jan. 2, 2015, 10:23 p.m., Ben Mahler wrote:
> > What happened previously if a launch takes forever?
> > 
> > What happens now if a launch takes forever?
> 
> Timothy Chen wrote:
>     Ben that's a good point, previously the registration timeout gates the launch not to take forever, however it is too coarse of a timeout as it included the time to launch and also the time it take for the executor to register after launch.
>     
>     Now it will only time out on the latter case, I think we do need a seperate timeout that checks for the launch itself. Ben what do you think?
> 
> Nishant Suneja wrote:
>     I concur with Timothy. I think having a separate timeout to track the container launch itself makes sense.

Hi Nishant, can you also add another timer for gating the launch? Once we have both PRs then I think we are in a better state to merge both.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Jan. 2, 2015, 10:23 p.m., Ben Mahler wrote:
> > What happened previously if a launch takes forever?
> > 
> > What happens now if a launch takes forever?
> 
> Timothy Chen wrote:
>     Ben that's a good point, previously the registration timeout gates the launch not to take forever, however it is too coarse of a timeout as it included the time to launch and also the time it take for the executor to register after launch.
>     
>     Now it will only time out on the latter case, I think we do need a seperate timeout that checks for the launch itself. Ben what do you think?
> 
> Nishant Suneja wrote:
>     I concur with Timothy. I think having a separate timeout to track the container launch itself makes sense.
> 
> Timothy Chen wrote:
>     Hi Nishant, can you also add another timer for gating the launch? Once we have both PRs then I think we are in a better state to merge both.
> 
> Tom Arnfeld wrote:
>     Just caught wind of this change and it's good to see! Going to be very useful over here. Can I request that there's also a command line flag for the slave to change the launch timeout? :-)
> 
> Nishant Suneja wrote:
>     @Timothy: For launch timeout, if the launch doesnt happen within the specified timeout duration, what action would we want to take ? We could possibly use the same callback "registerExecutorTimeout", which effectively moves the executor to TERMINATING state (for gc later on) and calls the containerizer destroy for that container.
>     
>     Also, what default value would we want for this timeout ?
>     
>     
>     @Tom: yeah..I think its a good idea. I can add that.

I think we want to perform the same action (registerExecutorTimeout) actually. If it's going to be shared, I think we should rename so it's not executor registration specific.
The other part is to make sure the timeout is cancelled when launch is finished before issuing the next delay call.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.

> On Jan. 2, 2015, 10:23 p.m., Ben Mahler wrote:
> > What happened previously if a launch takes forever?
> > 
> > What happens now if a launch takes forever?
> 
> Timothy Chen wrote:
>     Ben that's a good point, previously the registration timeout gates the launch not to take forever, however it is too coarse of a timeout as it included the time to launch and also the time it take for the executor to register after launch.
>     
>     Now it will only time out on the latter case, I think we do need a seperate timeout that checks for the launch itself. Ben what do you think?
> 
> Nishant Suneja wrote:
>     I concur with Timothy. I think having a separate timeout to track the container launch itself makes sense.
> 
> Timothy Chen wrote:
>     Hi Nishant, can you also add another timer for gating the launch? Once we have both PRs then I think we are in a better state to merge both.
> 
> Tom Arnfeld wrote:
>     Just caught wind of this change and it's good to see! Going to be very useful over here. Can I request that there's also a command line flag for the slave to change the launch timeout? :-)
> 
> Nishant Suneja wrote:
>     @Timothy: For launch timeout, if the launch doesnt happen within the specified timeout duration, what action would we want to take ? We could possibly use the same callback "registerExecutorTimeout", which effectively moves the executor to TERMINATING state (for gc later on) and calls the containerizer destroy for that container.
>     
>     Also, what default value would we want for this timeout ?
>     
>     
>     @Tom: yeah..I think its a good idea. I can add that.
> 
> Timothy Chen wrote:
>     I think we want to perform the same action (registerExecutorTimeout) actually. If it's going to be shared, I think we should rename so it's not executor registration specific.
>     The other part is to make sure the timeout is cancelled when launch is finished before issuing the next delay call.

Yeah.. I have implemented that.
So, I was writing a testcase to verify the container launch timeout logic, which has 2 conditions to be tested::

a) container launch fails, and timeout action triggers.
b) container launch happens, and timeout action is cancelled.

I wanted to test both of them in the same testcase. However, once the "launchExecutor()" call has been made for the default executor testing the first case, 2nd case doesnt call the "launchExecutor()" because the framework already has an in-memory pointer for it, and skips the launch.

What would be the easiest way to trigger launchExecutor() the 2nd time ? I wanted to get a handle of the Framework structure, and call destoryExecutor(), before launching the tasks again, but couldnt find an easy way to get the handle.


- Nishant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Jan. 2, 2015, 10:23 p.m., Ben Mahler wrote:
> > What happened previously if a launch takes forever?
> > 
> > What happens now if a launch takes forever?

Ben that's a good point, previously the registration timeout gates the launch not to take forever, however it is too coarse of a timeout as it included the time to launch and also the time it take for the executor to register after launch.

Now it will only time out on the latter case, I think we do need a seperate timeout that checks for the launch itself. Ben what do you think?


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


What happened previously if a launch takes forever?

What happens now if a launch takes forever?

- Ben Mahler


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.

> On Jan. 2, 2015, 7:28 p.m., Timothy Chen wrote:
> > Ship It!

Thanks, the patch LGTM, I'll probably modify some of the comments but overall looks ok to me. I'll let this sit for a day and commit this.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66542
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66542
-----------------------------------------------------------

Ship it!


Ship It!

- Timothy Chen


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/
-----------------------------------------------------------

(Updated Dec. 31, 2014, 11:57 p.m.)


Review request for mesos and Timothy Chen.


Changes
-------

Implemeting Timothy's review comments.


Bugs: MESOS-999
    https://issues.apache.org/jira/browse/MESOS-999


Repository: mesos-git


Description (updated)
-------

As part of this bug fix, I have trigerred the executor registration timeout timer after the container's future object is set, instead of starting the timer when the container launch is still pending


Diffs (updated)
-----

  src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
  src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
  src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
  src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 

Diff: https://reviews.apache.org/r/29437/diff/


Testing (updated)
-------

Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
make check succeeds.


Thanks,

Nishant Suneja


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/
-----------------------------------------------------------

(Updated Dec. 31, 2014, 5:11 a.m.)


Review request for mesos and Timothy Chen.


Changes
-------

Fixing unit test execution error.


Bugs: MESOS-999
    https://issues.apache.org/jira/browse/MESOS-999


Repository: mesos-git


Description
-------

As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
when the onAny() callback is made for the container's future object, instead of starting the
timer synchronously when the launchExecutor() method of the Framework class is invoked.


Diffs (updated)
-----

  src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
  src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
  src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
  src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 

Diff: https://reviews.apache.org/r/29437/diff/


Testing
-------

Added the unit test : SlaveTest::AsynchronousExecutorRegistrationTimeoutTrigger
make check succeeds.


Thanks,

Nishant Suneja


Re: Review Request 29437: Bug fix: Start the executor registration timer, only when the container has launched successfully

Posted by Nishant Suneja <ni...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/
-----------------------------------------------------------

(Updated Dec. 31, 2014, 3:01 a.m.)


Review request for mesos and Timothy Chen.


Changes
-------

Adding a unit test case.


Bugs: MESOS-999
    https://issues.apache.org/jira/browse/MESOS-999


Repository: mesos-git


Description (updated)
-------

As part of this bug fix, I have trigerred the executor registration timeout timer asychronously,
when the onAny() callback is made for the container's future object, instead of starting the
timer synchronously when the launchExecutor() method of the Framework class is invoked.


Diffs (updated)
-----

  src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
  src/tests/composing_containerizer_tests.cpp 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
  src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
  src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 

Diff: https://reviews.apache.org/r/29437/diff/


Testing (updated)
-------

Added the unit test : SlaveTest::AsynchronousExecutorRegistrationTimeoutTrigger
make check succeeds.


Thanks,

Nishant Suneja