Posted to reviews@mesos.apache.org by Vinod Kone <vi...@gmail.com> on 2015/12/09 21:02:16 UTC

Re: Review Request 40429: Report executor exit to framework schedulers.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40429/#review109593
-----------------------------------------------------------


A few things you need to do before this can get committed:

1) Send an email to the mailing list announcing this change. I hope none of the existing schedulers crash when they receive this callback, but we want to make sure.
2) Update the upgrade and changelog docs.
3) Add a NOTE to the executorLost() (and even slaveLost()) method in the C++/Java/Python interfaces that this is not reliably delivered.

Another, easier option is of course not to make this change in the scheduler driver, and to live with the fact that this event is delivered only to HTTP schedulers and not to driver-based schedulers.
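
To make point 3 concrete, here is a minimal sketch of how such a NOTE could read and of a scheduler that tolerates the callback. The `Scheduler` class below is a simplified, hypothetical stand-in (plain strings, no driver argument), not the actual interface in include/mesos/scheduler.hpp, and `TolerantScheduler` is an illustrative name only.

```cpp
#include <set>
#include <string>

// Hypothetical, simplified stand-in for the scheduler callback interface.
// The real interface lives in include/mesos/scheduler.hpp and differs
// in its argument types.
class Scheduler
{
public:
  virtual ~Scheduler() {}

  // Invoked when an executor has exited or terminated.
  //
  // NOTE: This callback is not reliably delivered. Schedulers must not
  // depend on it being invoked for every executor exit.
  virtual void executorLost(
      const std::string& executorId,
      const std::string& slaveId,
      int status) = 0;
};

// A scheduler that neither crashes on an unknown executor nor relies on
// the callback for correctness.
class TolerantScheduler : public Scheduler
{
public:
  // Record an executor that this scheduler believes is running.
  void launched(const std::string& executorId)
  {
    running.insert(executorId);
  }

  void executorLost(
      const std::string& executorId,
      const std::string& /* slaveId */,
      int /* status */) override
  {
    // erase() is a no-op for unknown ids, so duplicate or unexpected
    // notifications are harmless.
    running.erase(executorId);
    ++lostCallbacks;
  }

  std::set<std::string> running;
  int lostCallbacks = 0;
};
```

Because delivery is best effort, a scheduler written this way would still need some other means of noticing executors whose exit was never reported; the sketch only shows that receiving the callback, expected or not, does no harm.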


src/sched/sched.cpp (lines 219 - 223)
<https://reviews.apache.org/r/40429/#comment169169>

    pull this up to #209.



src/sched/sched.cpp (lines 221 - 222)
<https://reviews.apache.org/r/40429/#comment169170>

    flip the order of these two.



src/sched/sched.cpp (line 1053)
<https://reviews.apache.org/r/40429/#comment169171>

    move this up to #1009.



src/sched/sched.cpp (line 1075)
<https://reviews.apache.org/r/40429/#comment169172>

    remove "!"



src/tests/scheduler_event_call_tests.cpp (line 615)
<https://reviews.apache.org/r/40429/#comment169175>

    Can you also add/update a test that uses the scheduler driver to ensure that this callback is called? Perhaps MasterSlaveReconciliationTest.SlaveReregisterTerminatedExecutor?
    
    I would imagine you would need to update a lot more tests that don't expect this callback but now get this. It's likely that GMOCK only throws a warning but doesn't error out. You can see the warnings if you run the tests *without* verbose mode.


- Vinod Kone


On Nov. 18, 2015, 9:50 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40429/
> -----------------------------------------------------------
> 
> (Updated Nov. 18, 2015, 9:50 p.m.)
> 
> 
> Review request for mesos, Adam B and Vinod Kone.
> 
> 
> Bugs: MESOS-313
>     https://issues.apache.org/jira/browse/MESOS-313
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Report executor exit to framework schedulers. This is an MVP to start the work of notifying scheduler on scheduler refresh.
> 
> Next step would be sending this message reliably, and/or splitting Event::FAILURE for slave failure and executor termination.
> 
> 
> Diffs
> -----
> 
>   src/sched/sched.cpp a6faf92ff99cd79c3817684581862fecd1608048 
>   src/tests/scheduler_event_call_tests.cpp 39f67a8243db8073d1c9c92c7aeb71854143131d 
> 
> Diff: https://reviews.apache.org/r/40429/diff/
> 
> 
> Testing
> -------
> 
> Modified test for SchedulerDriverEventTest.Failure, which verifies that MockScheduler::executorLost is invoked.
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>


Re: Review Request 40429: Report executor exit to framework schedulers.

Posted by Zhitao Li <zh...@gmail.com>.

> On Dec. 9, 2015, 8:02 p.m., Vinod Kone wrote:
> > A few things you need to do before this can get committed:
> > 
> > 1) Send an email to the mailing list announcing this change. I hope none of the existing schedulers crash when they receive this callback, but we want to make sure.
> > 2) Update the upgrade and changelog docs.
> > 3) Add a NOTE to the executorLost() (and even slaveLost()) method in the C++/Java/Python interfaces that this is not reliably delivered.
> > 
> > Another, easier option is of course not to make this change in the scheduler driver, and to live with the fact that this event is delivered only to HTTP schedulers and not to driver-based schedulers.

Please review the documentation change. I'll post a draft of the email to the dev mailing list here later.


> On Dec. 9, 2015, 8:02 p.m., Vinod Kone wrote:
> > src/tests/scheduler_event_call_tests.cpp, line 616
> > <https://reviews.apache.org/r/40429/diff/3/?file=1131539#file1131539line616>
> >
> >     Can you also add/update a test that uses the scheduler driver to ensure that this callback is called? Perhaps MasterSlaveReconciliationTest.SlaveReregisterTerminatedExecutor?
> >     
> >     
> >     I would imagine you would need to update a lot more tests that don't expect this callback but now get this. It's likely that GMOCK only throws a warning but doesn't error out. You can see the warnings if you run the tests *without* verbose mode.

I changed MockScheduler to a StrictMock in another branch and fixed all tests in that branch, so I think the fix here should be complete.

Vinod, please let me know if you want me to send that out too.
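
For readers unfamiliar with the distinction discussed above: a default mock only warns about calls the test did not expect, while a strict mock turns them into failures. The sketch below is a hand-rolled illustration of that difference, not gmock itself; `CallChecker` and its methods are hypothetical names. In gmock the strict behaviour comes from wrapping the mock class in `testing::StrictMock<>`.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hand-rolled illustration only; this is not gmock.
class CallChecker
{
public:
  explicit CallChecker(bool failOnUnexpected)
    : strict(failOnUnexpected) {}

  // Declare that the test expects a call to `method`.
  void expect(const std::string& method)
  {
    expected.insert(method);
  }

  // Record a call made by the code under test.
  void called(const std::string& method)
  {
    if (expected.count(method) > 0) {
      return;
    }

    if (strict) {
      // Strict behaviour: an unexpected call fails the test.
      throw std::logic_error("unexpected call: " + method);
    }

    // Default behaviour: remember a warning; the test still passes.
    warnings.push_back("uninteresting call: " + method);
  }

  std::vector<std::string> warnings;

private:
  bool strict;
  std::set<std::string> expected;
};
```

With the lenient checker, a newly delivered `executorLost` call that a test never declared leaves only a warning behind, which is easy to miss; with the strict checker the same call throws, so every affected test has to be updated explicitly.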


- Zhitao




On Dec. 14, 2015, 8:09 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40429/
> -----------------------------------------------------------
> 
> (Updated Dec. 14, 2015, 8:09 p.m.)
> 
> 
> Review request for mesos, Adam B and Vinod Kone.
> 
> 
> Bugs: MESOS-313
>     https://issues.apache.org/jira/browse/MESOS-313
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Report executor exit to framework schedulers. This is an MVP to start the work of notifying scheduler on scheduler refresh.
> 
> Next step would be sending this message reliably, and/or splitting Event::FAILURE for slave failure and executor termination.
> 
> 
> Diffs
> -----
> 
>   CHANGELOG dbefa5df9e9183155bee532193148988dfc1fb84 
>   docs/app-framework-development-guide.md 4a43a93d080bdac37b8aee91748fea7552a1cc67 
>   docs/upgrades.md 7c1f1814680078380ca33bbc27421675ffe61d60 
>   include/mesos/scheduler.hpp 049c041286f3167e79cc5ea8a9e0bf8d42569832 
>   src/java/src/org/apache/mesos/Scheduler.java 4f048830a2c47f747033c60730cc770cb2578815 
>   src/python/interface/src/mesos/interface/__init__.py 4be502fd83029ad5fc798696caf9e27fd95f7482 
>   src/sched/sched.cpp 44eb4f50e8ed84297268d94a3a0320c843ff6d8c 
>   src/tests/fault_tolerance_tests.cpp ba657d0e1d8515cffd1b37925bf91a84b2feaef1 
>   src/tests/gc_tests.cpp f939d27c58177fba052fbcd9d6c9a572d052df52 
>   src/tests/master_slave_reconciliation_tests.cpp 9afa826006fa7129da1a9c1ac8c389c0e051f717 
>   src/tests/master_tests.cpp 865fa4a71f4bae2a218cd2c4e10873222d1ea3c4 
>   src/tests/scheduler_event_call_tests.cpp 03f0332ef75bbe7c4947bd6daf55d40384570f18 
>   src/tests/slave_tests.cpp 4975bea8a7a701e0414426760692720f73dea7f5 
> 
> Diff: https://reviews.apache.org/r/40429/diff/
> 
> 
> Testing
> -------
> 
> Modified test for SchedulerDriverEventTest.Failure, which verifies that MockScheduler::executorLost is invoked.
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>