Posted to dev@mesos.apache.org by Isabel Jimenez <co...@isabeljimenez.com> on 2014/06/02 06:14:07 UTC

Review Request 22123: Failover boolean to prevent using large timeout values

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22123/
-----------------------------------------------------------

Review request for mesos, Benjamin Hindman, Dominic Hamon, and Till Toenshoff.


Bugs: MESOS-1118
    https://issues.apache.org/jira/browse/MESOS-1118


Repository: mesos-git


Description
-------

I think the name of the boolean is a bit confusing. I could rename it to 'nofailover', which I think would be clearer.


Diffs
-----

  include/mesos/mesos.proto 82388e1 
  src/master/master.cpp 766a0e3 

Diff: https://reviews.apache.org/r/22123/diff/


Testing
-------

make check


Thanks,

Isabel Jimenez


Re: Review Request 22123: Failover boolean to prevent using large timeout values

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22123/#review44483
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22123]

All tests passed.

- Mesos ReviewBot


Re: Review Request 22123: Failover boolean to prevent using large timeout values

Posted by Isabel Jimenez <co...@isabeljimenez.com>.

> On June 26, 2014, 8:39 a.m., Adam B wrote:
> > include/mesos/mesos.proto, line 128
> > <https://reviews.apache.org/r/22123/diff/1/?file=601126#file601126line128>
> >
> >     Please add some documentation to the FrameworkInfo comment that explains what a value of failover=true means and when it should be used.

There is a comment about this on https://issues.apache.org/jira/browse/MESOS-1118. Do you have something specific in mind?


- Isabel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22123/#review46725
-----------------------------------------------------------


Re: Review Request 22123: Failover boolean to prevent using large timeout values

Posted by Adam B <ad...@mesosphere.io>.

> On June 26, 2014, 1:39 a.m., Adam B wrote:
> > include/mesos/mesos.proto, line 128
> > <https://reviews.apache.org/r/22123/diff/1/?file=601126#file601126line128>
> >
> >     Please add some documentation to the FrameworkInfo comment that explains what a value of failover=true means and when it should be used.
> 
> Isabel Jimenez wrote:
>     There is a comment about this on https://issues.apache.org/jira/browse/MESOS-1118, do you have something in mind ?

I'm just imagining that I'm a framework author looking at mesos.proto. I can see that it defaults to false, but I'd like to see a sentence or two in the comment above FrameworkInfo that tells me when I should set failover=true.

I still don't entirely understand what problem we're trying to solve. Please correct me if I'm wrong, but from what I can read in the code, when I first register my framework I would set:
- failover=false and failover_timeout=X (original default) to deactivate the framework on exit/disconnect and wait X before removing it completely;
- failover=false and failover_timeout=0 to remove the framework immediately upon exit/disconnect;
- failover=true (timeout irrelevant) to deactivate on exit/disconnect and never call removeFramework.
According to MESOS-703, this timeout/flag cannot yet be changed during re-registration.

This two-flag state is really confusing. I think I prefer the suggestion that negative values (since '0' is apparently valid) imply a "failover forever" state. But you would also have to enable negative Durations (see https://github.com/apache/mesos/blob/master/3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp#L33).
Maybe Duration::max() is enough to imply "forever"? Or, if you do use two flags, I think "failover_forever=true" would read better.
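
The three cases above, and the single-field alternative, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Mesos code: FrameworkSettings, OnDisconnect, onDisconnect, and onDisconnectSingleField are hypothetical names, and std::chrono::seconds stands in for stout's Duration.

```cpp
#include <chrono>

// Hypothetical stand-in for the two FrameworkInfo fields under discussion.
struct FrameworkSettings {
  bool failover;                          // the boolean proposed in this review
  std::chrono::seconds failover_timeout;  // the existing timeout field
};

// What the master would do when the framework exits or disconnects.
enum class OnDisconnect { RemoveImmediately, RemoveAfterTimeout, NeverRemove };

// Two-field reading: the flag overrides the timeout entirely.
inline OnDisconnect onDisconnect(const FrameworkSettings& s) {
  if (s.failover) {
    return OnDisconnect::NeverRemove;  // timeout is irrelevant here
  }
  if (s.failover_timeout == std::chrono::seconds::zero()) {
    return OnDisconnect::RemoveImmediately;
  }
  return OnDisconnect::RemoveAfterTimeout;
}

// Single-field alternative: the maximum representable value means "forever".
inline OnDisconnect onDisconnectSingleField(std::chrono::seconds timeout) {
  if (timeout == std::chrono::seconds::max()) {
    return OnDisconnect::NeverRemove;
  }
  if (timeout == std::chrono::seconds::zero()) {
    return OnDisconnect::RemoveImmediately;
  }
  return OnDisconnect::RemoveAfterTimeout;
}
```

With a single field there is no combination such as "failover=true with a finite timeout" whose meaning has to be documented separately, which is the main appeal of the sentinel approach.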


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22123/#review46725
-----------------------------------------------------------


Re: Review Request 22123: Failover boolean to prevent using large timeout values

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22123/#review46725
-----------------------------------------------------------


Not sure if this review is still going anywhere, but I thought I'd put in my comments.


include/mesos/mesos.proto
<https://reviews.apache.org/r/22123/#comment82280>

    Please add some documentation to the FrameworkInfo comment that explains what a value of failover=true means and when it should be used.



src/master/master.cpp
<https://reviews.apache.org/r/22123/#comment82282>

    Moving this out of the while loop means that a framework without a failover_timeout will use the last valid failover_timeout, not the original default value. You should reinitialize failoverTimeout to the default value at the beginning of each iteration.
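
    The carry-over described above can be reproduced in a small, self-contained sketch. It is not the code from master.cpp: FrameworkEntry, timeoutsHoisted, and timeoutsReset are hypothetical names, a for loop stands in for the while loop, and std::chrono::seconds stands in for Duration.

```cpp
#include <chrono>
#include <vector>

using Seconds = std::chrono::seconds;

// Hypothetical stand-in for a framework being processed. The has_ flag
// mirrors how an optional protobuf field reports whether it was set.
struct FrameworkEntry {
  bool has_failover_timeout;
  Seconds failover_timeout;
};

// Problematic variant: the variable lives outside the loop, so a framework
// without a failover_timeout inherits the previous framework's value.
inline std::vector<Seconds> timeoutsHoisted(
    const std::vector<FrameworkEntry>& frameworks, Seconds defaultTimeout) {
  std::vector<Seconds> result;
  Seconds failoverTimeout = defaultTimeout;
  for (const FrameworkEntry& f : frameworks) {
    if (f.has_failover_timeout) {
      failoverTimeout = f.failover_timeout;
    }
    result.push_back(failoverTimeout);
  }
  return result;
}

// Suggested fix: reinitialize to the default at the start of each iteration.
inline std::vector<Seconds> timeoutsReset(
    const std::vector<FrameworkEntry>& frameworks, Seconds defaultTimeout) {
  std::vector<Seconds> result;
  for (const FrameworkEntry& f : frameworks) {
    Seconds failoverTimeout = defaultTimeout;
    if (f.has_failover_timeout) {
      failoverTimeout = f.failover_timeout;
    }
    result.push_back(failoverTimeout);
  }
  return result;
}
```

    For a framework that sets 60 seconds followed by one that sets nothing, with a default of 10 seconds, the first variant yields 60 and 60, while the second yields 60 and 10.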



src/master/master.cpp
<https://reviews.apache.org/r/22123/#comment82281>

    Tabbing?


- Adam B


Re: Review Request 22123: Failover boolean to prevent using large timeout values

Posted by Dominic Hamon <dh...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22123/#review46833
-----------------------------------------------------------



src/master/master.cpp
<https://reviews.apache.org/r/22123/#comment82404>

    Please log (at info level) that we are setting the failover timeout, and in the else case log that we are not setting it.


- Dominic Hamon

