Posted to reviews@mesos.apache.org by Benjamin Bannier <bb...@apache.org> on 2019/08/28 09:12:20 UTC

Review Request 71385: Added restart logic for failing resource providers.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71385/
-----------------------------------------------------------

Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.


Bugs: MESOS-8400
    https://issues.apache.org/jira/browse/MESOS-8400


Repository: mesos


Description
-------

This patch adds restart logic to the resource provider daemon. We now
watch launched providers and restart them should they become terminal.


Diffs
-----

  src/resource_provider/daemon.cpp 2fd82ad5749e3948c590ce5e9816566a3627b885 
  src/resource_provider/local.hpp 75ce0f2e4a744685f2b701ecce269995f5ddaafb 
  src/resource_provider/storage/provider.hpp ccd09dfe826d89c2775939bf132697956429c289 
  src/resource_provider/storage/provider.cpp f180af8c17f735acb18029b6e4cf2942b5536bf4 
  src/tests/storage_local_resource_provider_tests.cpp 05daf2a19145d0da2672bbaa5ae061369b2504f5 


Diff: https://reviews.apache.org/r/71385/diff/1/


Testing
-------

`make check`


Thanks,

Benjamin Bannier


Re: Review Request 71385: Added restart logic for failing resource providers.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71385/#review217481
-----------------------------------------------------------



Patch looks great!

Reviews applied: [71382, 71383, 71384, 71385]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot




Re: Review Request 71385: Added restart logic for failing resource providers.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71385/#review217565
-----------------------------------------------------------



Patch looks great!

Reviews applied: [71382, 71383, 71384, 71385]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Sept. 4, 2019, 12:42 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71385/
> -----------------------------------------------------------
> 
> (Updated Sept. 4, 2019, 12:42 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.
> 
> 
> Bugs: MESOS-8400
>     https://issues.apache.org/jira/browse/MESOS-8400
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch adds restart logic to the resource provider daemon. We now
> watch launched providers and restart them should they become terminal.
> 
> 
> Diffs
> -----
> 
>   src/resource_provider/daemon.cpp 2fd82ad5749e3948c590ce5e9816566a3627b885 
>   src/resource_provider/local.hpp 75ce0f2e4a744685f2b701ecce269995f5ddaafb 
>   src/resource_provider/storage/provider.hpp ccd09dfe826d89c2775939bf132697956429c289 
>   src/resource_provider/storage/provider.cpp 0a8dc26e66db0242474bcbbd0b2ff9cec81c58f5 
>   src/tests/storage_local_resource_provider_tests.cpp 089aa9787a66d737267179ad461be0c0a99d5c63 
> 
> 
> Diff: https://reviews.apache.org/r/71385/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 71385: Added restart logic for failing resource providers.

Posted by Benjamin Bannier <bb...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71385/
-----------------------------------------------------------

(Updated Sept. 4, 2019, 2:42 p.m.)


Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.


Changes
-------

Addressed the issue raised by Jan.


Bugs: MESOS-8400
    https://issues.apache.org/jira/browse/MESOS-8400


Repository: mesos


Description
-------

This patch adds restart logic to the resource provider daemon. We now
watch launched providers and restart them should they become terminal.


Diffs (updated)
-----

  src/resource_provider/daemon.cpp 2fd82ad5749e3948c590ce5e9816566a3627b885 
  src/resource_provider/local.hpp 75ce0f2e4a744685f2b701ecce269995f5ddaafb 
  src/resource_provider/storage/provider.hpp ccd09dfe826d89c2775939bf132697956429c289 
  src/resource_provider/storage/provider.cpp 0a8dc26e66db0242474bcbbd0b2ff9cec81c58f5 
  src/tests/storage_local_resource_provider_tests.cpp 089aa9787a66d737267179ad461be0c0a99d5c63 


Diff: https://reviews.apache.org/r/71385/diff/2/

Changes: https://reviews.apache.org/r/71385/diff/1-2/


Testing
-------

`make check`


Thanks,

Benjamin Bannier


Re: Review Request 71385: Added restart logic for failing resource providers.

Posted by Benjamin Bannier <bb...@apache.org>.

> On Sept. 3, 2019, 3:28 p.m., Jan Schlicht wrote:
> > src/resource_provider/daemon.cpp
> > Lines 536 (patched)
> > <https://reviews.apache.org/r/71385/diff/1/?file=2163045#file2163045line536>
> >
> >     Should we restart here? Although the future is currently never set to ready, that might change later. If a resource provider exits normally, is it supposed to be restarted?

Very good point. I fixed the handling to return early if the container exited normally (this will currently not be triggered as nothing makes the `Future` ready).
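
For readers without the diff at hand, here is a minimal, self-contained sketch of the early-return behavior described above. It is not the code from the patch: `Terminal`, `shouldRestart`, `superviseProvider`, and the `maxRestarts` bound are hypothetical names introduced only for illustration, the launch is modeled as a plain synchronous callback rather than a libprocess `Future`, and treating a discarded future like a failure is an assumption.

```cpp
#include <cassert>
#include <functional>

// Hypothetical terminal states of a watched provider. They mirror the
// states a future can end in: ready (normal exit), failed, or discarded.
enum class Terminal { READY, FAILED, DISCARDED };

// Assumption for this sketch: only a normal exit suppresses the restart;
// both failure and discard lead to a restart.
inline bool shouldRestart(Terminal state)
{
  return state != Terminal::READY;
}

// Launches the provider via `launch`, which blocks until the provider
// becomes terminal and reports how it ended. Returns the number of
// restarts performed. `maxRestarts` is a hypothetical bound added so the
// sketch always terminates; it is not taken from the patch.
inline int superviseProvider(
    const std::function<Terminal()>& launch,
    int maxRestarts)
{
  assert(maxRestarts >= 0);

  int restarts = 0;
  while (true) {
    const Terminal state = launch();

    // Return early if the provider exited normally.
    if (!shouldRestart(state)) {
      return restarts;
    }

    // Give up once the restart budget is exhausted.
    if (restarts == maxRestarts) {
      return restarts;
    }

    ++restarts;
  }
}
```

A caller would pass a callback that launches the provider and reports its terminal state, for example `superviseProvider(launchOnce, 5)`, where `launchOnce` is likewise a hypothetical name.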


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71385/#review217546
-----------------------------------------------------------




Re: Review Request 71385: Added restart logic for failing resource providers.

Posted by Jan Schlicht <ja...@d2iq.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71385/#review217546
-----------------------------------------------------------


Fix it, then Ship it!





src/resource_provider/daemon.cpp
Lines 536 (patched)
<https://reviews.apache.org/r/71385/#comment304805>

    Should we restart here? Although the future is currently never set to ready, that might change later. If a resource provider exits normally, is it supposed to be restarted?


- Jan Schlicht

