Posted to dev@mesos.apache.org by Vinod Kone <vi...@gmail.com> on 2013/08/03 03:11:20 UTC
Review Request 13253: Fixed slave to not recover completed executors.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13253/
-----------------------------------------------------------
Review request for mesos, Benjamin Hindman and Ben Mahler.
Bugs: MESOS-612
https://issues.apache.org/jira/browse/MESOS-612
Repository: mesos-git
Description
-------
Added a sentinel file to executor checkpoint data. This allows the slave, isolator, and status update manager (SUM) to skip recovery of executors that were completed (terminated and all their updates acked).
Also, cleaned up some code.
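
The sentinel-file idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: the file name `completed.sentinel` and the helpers `markCompleted` and `shouldRecover` are hypothetical, and it uses C++17 `std::filesystem` rather than the Mesos/stout utilities.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical sentinel file name; the name used by the actual patch
// is not shown in this thread.
const std::string kSentinel = "completed.sentinel";

// Marks an executor run directory as completed by creating an empty
// sentinel file inside it. Returns true if the file could be opened.
bool markCompleted(const fs::path& runDir) {
  std::ofstream out(runDir / kSentinel);
  return out.good();
}

// During recovery, a run needs to be recovered only if its sentinel
// file is absent.
bool shouldRecover(const fs::path& runDir) {
  return !fs::exists(runDir / kSentinel);
}
```

A recovery loop would call `shouldRecover` on each checkpointed run directory and skip those for which it returns false.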
Diffs
-----
src/slave/cgroups_isolator.cpp 0faf7d50d76887cad02267ab11827664a4b63476
src/slave/paths.hpp 9d2a2a40048bbe594723ba3f19aa10eaf1935926
src/slave/process_isolator.cpp cd794f6cb301a8c00a4c0ef906f95e53959ed905
src/slave/slave.cpp 7f6e6b456890db438092f19a22e4dd816bb33d04
src/slave/state.hpp 08e36174a1d88c342ba7a189ed413163bfd22fd8
src/slave/state.cpp e910ab71b8b667a076c0fdf31e3322e52fef1b17
src/slave/status_update_manager.cpp 9e9e4e2a47a609d65ed69a57de595852144a86c8
src/tests/slave_recovery_tests.cpp 1871e3ba41e65dcbd4b95779dda068f6a1a2ecb3
Diff: https://reviews.apache.org/r/13253/diff/
Testing
-------
make check
Thanks,
Vinod Kone
Re: Review Request 13253: Fixed slave to not recover completed executors.
Posted by Vinod Kone <vi...@gmail.com>.
> On Aug. 5, 2013, 6:33 p.m., Ben Mahler wrote:
> > src/slave/paths.hpp, lines 58-79
> > <https://reviews.apache.org/r/13253/diff/2/?file=336034#file336034line58>
> >
> > path::join here would avoid mistakes with double forward slashes or missing forward slashes. Just a note.
your wish is my command.
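
For context, the kind of mistake a join helper guards against can be shown with a small stand-in. `joinPath` below is a hypothetical illustration written for this note; it is not the `path::join` implementation referred to in the review, whose exact behavior is not shown in this thread.

```cpp
#include <string>

// Joins two path components with exactly one '/' between them,
// regardless of whether the inputs carry trailing or leading slashes.
std::string joinPath(const std::string& left, const std::string& right) {
  if (left.empty()) return right;
  if (right.empty()) return left;

  // Drop trailing slashes from the left component.
  std::string l = left;
  while (!l.empty() && l.back() == '/') l.pop_back();

  // Drop leading slashes from the right component.
  std::string::size_type start = right.find_first_not_of('/');
  std::string r = (start == std::string::npos) ? "" : right.substr(start);

  return l + "/" + r;
}
```

With plain string concatenation, "meta/" + "/slaves" yields a double slash and "meta" + "slaves" yields none; the helper produces "meta/slaves" in both cases.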
> On Aug. 5, 2013, 6:33 p.m., Ben Mahler wrote:
> > src/slave/state.cpp, lines 380-382
> > <https://reviews.apache.org/r/13253/diff/2/?file=336038#file336038line380>
> >
> > state.completed = os::exists(path);
doh..thanks
> On Aug. 5, 2013, 6:33 p.m., Ben Mahler wrote:
> > src/slave/state.cpp, line 296
> > <https://reviews.apache.org/r/13253/diff/2/?file=336038#file336038line296>
> >
> > Consider making a default constructor that sets this false, or a constructor that takes all of the arguments.
i'll punt on this for consistency.
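
For reference, the two options suggested in the comment could look roughly like this. `RunStateSketch` is a hypothetical stand-in: only the `completed` flag comes from this thread, and the `id` field is an assumption added so that the second constructor has something else to take.

```cpp
#include <string>

// Hypothetical stand-in for the struct under review.
struct RunStateSketch {
  // Option 1: a default constructor that sets the flag to false, so it
  // is never read uninitialized.
  RunStateSketch() : completed(false) {}

  // Option 2: a constructor that takes all of the arguments.
  RunStateSketch(const std::string& id_, bool completed_)
    : id(id_), completed(completed_) {}

  std::string id;   // Assumed field, for illustration only.
  bool completed;   // True once the executor run has completed.
};
```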
- Vinod
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13253/#review24652
-----------------------------------------------------------
Re: Review Request 13253: Fixed slave to not recover completed executors.
Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13253/#review24652
-----------------------------------------------------------
Ship it!
src/slave/paths.hpp
<https://reviews.apache.org/r/13253/#comment48753>
path::join here would avoid mistakes with double forward slashes or missing forward slashes. Just a note.
src/slave/slave.cpp
<https://reviews.apache.org/r/13253/#comment48770>
s/is being cleaned up/is completed/ ?
src/slave/slave.cpp
<https://reviews.apache.org/r/13253/#comment48767>
This explanation of 'completed' would be nice in the RunState struct.
s/don't bother recovering it/we do not need to recover it/ ?
src/slave/slave.cpp
<https://reviews.apache.org/r/13253/#comment48768>
newline
src/slave/state.hpp
<https://reviews.apache.org/r/13253/#comment48765>
Can you add a comment as to what 'completed' means?
src/slave/state.cpp
<https://reviews.apache.org/r/13253/#comment48757>
Consider making a default constructor that sets this false, or a constructor that takes all of the arguments.
src/slave/state.cpp
<https://reviews.apache.org/r/13253/#comment48761>
state.completed = os::exists(path);
- Ben Mahler
Re: Review Request 13253: Fixed slave to not recover completed executors.
Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13253/
-----------------------------------------------------------
(Updated Aug. 6, 2013, 5:53 p.m.)
Review request for mesos, Benjamin Hindman and Ben Mahler.
Changes
-------
benm's. nnfr.
Bugs: MESOS-612
https://issues.apache.org/jira/browse/MESOS-612
Repository: mesos-git
Description
-------
Added a sentinel file to executor checkpoint data. This allows the slave, isolator, and status update manager (SUM) to skip recovery of executors that were completed (terminated and all their updates acked).
Also, cleaned up some code.
Diffs (updated)
-----
src/slave/cgroups_isolator.cpp 7f6d13ede40c913899cb7a4f6ebea3056d3fa491
src/slave/paths.hpp 9d2a2a40048bbe594723ba3f19aa10eaf1935926
src/slave/process_isolator.cpp cb074485af9af1ea7c659dcd6fa50c035c5442f2
src/slave/slave.cpp 9cd7754b647dde21267f1990edb7d4e1425beacd
src/slave/state.hpp 08e36174a1d88c342ba7a189ed413163bfd22fd8
src/slave/state.cpp e910ab71b8b667a076c0fdf31e3322e52fef1b17
src/slave/status_update_manager.cpp e17ecf4b10423d3239ba0752ea0953e21a61483a
src/tests/slave_recovery_tests.cpp c451e0f4c571a646d375aa89e806e1a4058d39e7
Diff: https://reviews.apache.org/r/13253/diff/
Testing
-------
make check
Thanks,
Vinod Kone
Re: Review Request 13253: Fixed slave to not recover completed executors.
Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13253/
-----------------------------------------------------------
(Updated Aug. 3, 2013, 8:33 p.m.)
Review request for mesos, Benjamin Hindman and Ben Mahler.
Changes
-------
rebased.
Bugs: MESOS-612
https://issues.apache.org/jira/browse/MESOS-612
Repository: mesos-git
Description
-------
Added a sentinel file to executor checkpoint data. This allows the slave, isolator, and status update manager (SUM) to skip recovery of executors that were completed (terminated and all their updates acked).
Also, cleaned up some code.
Diffs (updated)
-----
src/slave/cgroups_isolator.cpp 0faf7d50d76887cad02267ab11827664a4b63476
src/slave/paths.hpp 9d2a2a40048bbe594723ba3f19aa10eaf1935926
src/slave/process_isolator.cpp cd794f6cb301a8c00a4c0ef906f95e53959ed905
src/slave/slave.cpp 7f6e6b456890db438092f19a22e4dd816bb33d04
src/slave/state.hpp 08e36174a1d88c342ba7a189ed413163bfd22fd8
src/slave/state.cpp e910ab71b8b667a076c0fdf31e3322e52fef1b17
src/slave/status_update_manager.cpp 9e9e4e2a47a609d65ed69a57de595852144a86c8
src/tests/slave_recovery_tests.cpp 1871e3ba41e65dcbd4b95779dda068f6a1a2ecb3
Diff: https://reviews.apache.org/r/13253/diff/
Testing
-------
make check
Thanks,
Vinod Kone