Posted to dev@mesos.apache.org by Vinod Kone <vi...@gmail.com> on 2014/10/29 00:08:20 UTC

Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/
-----------------------------------------------------------

Review request for mesos and Ben Mahler.


Bugs: MESOS-1903
    https://issues.apache.org/jira/browse/MESOS-1903


Repository: mesos-git


Description
-------

Uses the same backoff (except no initial backoff) strategy used by the slave during registration.


Diffs
-----

  include/mesos/scheduler.hpp 42e4e279d059801cd85955fd04995b60051a2b5e 
  src/Makefile.am 374f284e1ac839fbcd8a28171b1ff4fbe8a17bd4 
  src/local/constants.hpp PRE-CREATION 
  src/local/constants.cpp PRE-CREATION 
  src/local/flags.hpp 54e88319afc68007ff5d7c0d0179b685ef845c87 
  src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 
  src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd 

Diff: https://reviews.apache.org/r/27315/diff/


Testing
-------

make check


Thanks,

Vinod Kone


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review59106
-----------------------------------------------------------


Patch looks great!

Reviews applied: [27315]

All tests passed.

- Mesos ReviewBot


On Oct. 29, 2014, 11:16 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27315/
> -----------------------------------------------------------
> 
> (Updated Oct. 29, 2014, 11:16 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1903
>     https://issues.apache.org/jira/browse/MESOS-1903
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Uses the same backoff (except no initial backoff) strategy used by the slave during registration.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 374f284e1ac839fbcd8a28171b1ff4fbe8a17bd4 
>   src/sched/constants.hpp PRE-CREATION 
>   src/sched/constants.cpp PRE-CREATION 
>   src/sched/flags.hpp PRE-CREATION 
>   src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 
>   src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd 
> 
> Diff: https://reviews.apache.org/r/27315/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review60284
-----------------------------------------------------------


Patch looks great!

Reviews applied: [27315]

All tests passed.

- Mesos ReviewBot


On Nov. 6, 2014, 10:20 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27315/
> -----------------------------------------------------------
> 
> (Updated Nov. 6, 2014, 10:20 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1903
>     https://issues.apache.org/jira/browse/MESOS-1903
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Uses the same backoff (except no initial backoff) strategy used by the slave during registration.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 9ab3b9c05d435d18ed1c2966f695857fa205e9fd 
>   src/sched/constants.hpp PRE-CREATION 
>   src/sched/constants.cpp PRE-CREATION 
>   src/sched/flags.hpp PRE-CREATION 
>   src/sched/sched.cpp e5f828d0bf9dd03a01920634abfae685a7861b44 
>   src/slave/slave.hpp 5b082fc6956238d27f5d3352a3b40ce778a1a3a4 
>   src/slave/slave.cpp dbfd1a8101d78dee8ea3ac19d990a6a7892e59be 
>   src/tests/fault_tolerance_tests.cpp 372c4fdec1ef70991d4d550d8d73ac363436fdb9 
> 
> Diff: https://reviews.apache.org/r/27315/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/
-----------------------------------------------------------

(Updated Nov. 6, 2014, 10:20 p.m.)


Review request for mesos and Ben Mahler.


Changes
-------

benm's comments. NNFR.


Bugs: MESOS-1903
    https://issues.apache.org/jira/browse/MESOS-1903


Repository: mesos-git


Description
-------

Uses the same backoff (except no initial backoff) strategy used by the slave during registration.


Diffs (updated)
-----

  src/Makefile.am 9ab3b9c05d435d18ed1c2966f695857fa205e9fd 
  src/sched/constants.hpp PRE-CREATION 
  src/sched/constants.cpp PRE-CREATION 
  src/sched/flags.hpp PRE-CREATION 
  src/sched/sched.cpp e5f828d0bf9dd03a01920634abfae685a7861b44 
  src/slave/slave.hpp 5b082fc6956238d27f5d3352a3b40ce778a1a3a4 
  src/slave/slave.cpp dbfd1a8101d78dee8ea3ac19d990a6a7892e59be 
  src/tests/fault_tolerance_tests.cpp 372c4fdec1ef70991d4d550d8d73ac363436fdb9 

Diff: https://reviews.apache.org/r/27315/diff/


Testing
-------

make check


Thanks,

Vinod Kone


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Vinod Kone <vi...@gmail.com>.

> On Oct. 31, 2014, 10:23 p.m., Ben Mahler wrote:
> > src/sched/constants.hpp, line 18
> > <https://reviews.apache.org/r/27315/diff/2/?file=742036#file742036line18>
> >
> >     newline here?

done.


> On Oct. 31, 2014, 10:23 p.m., Ben Mahler wrote:
> > src/sched/sched.cpp, line 252
> > <https://reviews.apache.org/r/27315/diff/2/?file=742039#file742039line252>
> >
> >     What is the (s) conveying here?
> >     
> >     Could you just say, "similar to the slave"?

done.


> On Oct. 31, 2014, 10:23 p.m., Ben Mahler wrote:
> > src/sched/sched.cpp, lines 503-525
> > <https://reviews.apache.org/r/27315/diff/2/?file=742039#file742039line503>
> >
> >     How about the following? We can do this for the slave too to make it a bit easier (it's a bit hard to understand currently with 'duration' as a name):
> >     
> >     ```
> >     void doReliableRegistration(Duration maxBackoff)
> >     {
> >       ...
> >       
> >       maxBackoff = std::min(maxBackoff, scheduler:: REGISTRATION_RETRY_INTERVAL_DEFAULT_MAX);
> >       
> >       if (framework.has_failover_timeout()) {
> >         ...
> >         
> >         // Don't approach the failover timeout too closely.
> >         maxBackoff = std::min(maxBackoff, timeout.get() / 10);
> >       }
> >       
> >       Duration delay = std::min(
> >           duration * ((double) ::random() / RAND_MAX),
> >           maxBackoff);
> >           
> >       VLOG(1) << "will retry registration in " << delay << " if necessary";
> >       
> >       process::delay(delay, self(), &Self::doReliableRegistration, maxBackoff * 2);
> >     }
> >     ```
> >     
> >     Notice that we don't need to store duration_ at all since it's the responsibility of doReliableRegistration to bound the maximum anyway!
> >     
> >     Could we do this cleanup in the slave too?

done. thanks!

one change though,

Duration delay = maxBackoff * ((double) ::random() / RAND_MAX);
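
For readers following the thread, the scheme agreed on above (cap the bound, pick a random delay within it, double the bound for the next attempt) can be sketched outside of Mesos roughly as follows. This is only an illustration, not the committed patch: std::chrono stands in for the Duration type, and the names Seconds, kRetryIntervalMax (including its 60 second value), Backoff and nextBackoff are assumptions made for the example. The random fraction is passed in as a parameter so the arithmetic is easy to check.

```cpp
#include <algorithm>
#include <chrono>

using Seconds = std::chrono::duration<double>;

// Hypothetical stand-in for the scheduler's maximum registration retry
// interval; the 60 second value is an assumption made for this sketch.
const Seconds kRetryIntervalMax{60.0};

struct Backoff
{
  Seconds delay;          // How long to wait before the next retry.
  Seconds nextMaxBackoff; // Bound to hand to the following attempt.
};

// `fraction` is a random number in [0, 1], for example
// (double) ::random() / RAND_MAX as in the review comment.
Backoff nextBackoff(Seconds maxBackoff, double fraction)
{
  // Never let the bound exceed the global maximum.
  maxBackoff = std::min(maxBackoff, kRetryIntervalMax);

  // Pick the delay within [0, maxBackoff].
  Seconds delay = maxBackoff * fraction;

  // Double the bound for the following attempt; it is clamped again
  // at the top of the next call.
  return Backoff{delay, maxBackoff * 2};
}
```

With a bound of 2 seconds and a fraction of 0.5, the sketch waits 1 second and passes a bound of 4 seconds to the next attempt. Because the bound is clamped at the start of every call, the caller does not need to store a separate maximum.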


> On Oct. 31, 2014, 10:23 p.m., Ben Mahler wrote:
> > src/sched/sched.cpp, line 506
> > <https://reviews.apache.org/r/27315/diff/2/?file=742039#file742039line506>
> >
> >     How about s/duration/timeout/ ?

done.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review59102
-----------------------------------------------------------


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review59102
-----------------------------------------------------------

Ship it!


Looks good, just one concern about bounding for large failover timeouts and one note about cleanup.


src/sched/constants.hpp
<https://reviews.apache.org/r/27315/#comment100418>

    newline here?



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100428>

    What is the (s) conveying here?
    
    Could you just say, "similar to the slave"?



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100698>

    How about the following? We can do this for the slave too to make it a bit easier (it's a bit hard to understand currently with 'duration' as a name):
    
    ```
    void doReliableRegistration(Duration maxBackoff)
    {
      ...
      
      maxBackoff = std::min(maxBackoff, scheduler:: REGISTRATION_RETRY_INTERVAL_DEFAULT_MAX);
      
      if (framework.has_failover_timeout()) {
        ...
        
        // Don't approach the failover timeout too closely.
        maxBackoff = std::min(maxBackoff, timeout.get() / 10);
      }
      
      Duration delay = std::min(
          duration * ((double) ::random() / RAND_MAX),
          maxBackoff);
          
      VLOG(1) << "will retry registration in " << delay << " if necessary";
      
      process::delay(delay, self(), &Self::doReliableRegistration, maxBackoff * 2);
    }
    ```
    
    Notice that we don't need to store duration_ at all since it's the responsibility of doReliableRegistration to bound the maximum anyway!
    
    Could we do this cleanup in the slave too?



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100699>

    How about s/duration/timeout/ ?



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100700>

    We should bound this as well, to ensure that we don't backoff *too* much!


- Ben Mahler


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Cody Maloney <co...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review59075
-----------------------------------------------------------


LGTM

- Cody Maloney


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/
-----------------------------------------------------------

(Updated Oct. 29, 2014, 11:16 p.m.)


Review request for mesos and Ben Mahler.


Changes
-------

After discussing w/ BenM, decided to move the constants and flags to "src/sched" directory instead of "src/local" to be more precise. Note that the new (event/call) driver in "src/scheduler" doesn't do reliable registration, so there are no updates to it.


Bugs: MESOS-1903
    https://issues.apache.org/jira/browse/MESOS-1903


Repository: mesos-git


Description
-------

Uses the same backoff (except no initial backoff) strategy used by the slave during registration.


Diffs (updated)
-----

  src/Makefile.am 374f284e1ac839fbcd8a28171b1ff4fbe8a17bd4 
  src/sched/constants.hpp PRE-CREATION 
  src/sched/constants.cpp PRE-CREATION 
  src/sched/flags.hpp PRE-CREATION 
  src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 
  src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd 

Diff: https://reviews.apache.org/r/27315/diff/


Testing
-------

make check


Thanks,

Vinod Kone


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Vinod Kone <vi...@gmail.com>.

> On Oct. 29, 2014, 7:36 p.m., Dominic Hamon wrote:
> > src/sched/sched.cpp, line 1089
> > <https://reviews.apache.org/r/27315/diff/1/?file=736571#file736571line1089>
> >
> >     it looks like you dereference this everywhere, so why did it need to become a pointer? Why not a concrete instance member variable?

n/a after the restructure.


> On Oct. 29, 2014, 7:36 p.m., Dominic Hamon wrote:
> > src/sched/sched.cpp, line 255
> > <https://reviews.apache.org/r/27315/diff/1/?file=736571#file736571line255>
> >
> >     in the two places this is referenced, it is doubled. Please add a comment explaining why, or just store the doubled factor somewhere to make the code easier to read.

actually, no need for the doubling. i just copy-pasted from the slave code and didn't realize that the semantics are slightly different for the driver, i.e., the first retry is tried between [0, b] instead of [0, 2b]. fixed.
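
To make the [0, b] versus [0, 2b] distinction concrete, the upper bounds of the retry windows can be listed with a small helper. This is a sketch under assumptions: retryBounds is a hypothetical function, and the values used for b (2 seconds) and the cap (60 seconds) are picked only for illustration, not taken from the patch.

```cpp
#include <algorithm>
#include <vector>

// Upper bounds (in seconds) of the random retry window for each attempt,
// starting at `first` and doubling until capped at `cap`.
std::vector<double> retryBounds(double first, double cap, int attempts)
{
  std::vector<double> bounds;
  double bound = first;
  for (int i = 0; i < attempts; ++i) {
    bounds.push_back(std::min(bound, cap));
    bound *= 2;
  }
  return bounds;
}
```

With b = 2, starting at b gives windows of [0, 2], [0, 4], [0, 8], [0, 16] for the first four retries, while starting at 2b gives [0, 4], [0, 8], [0, 16], [0, 32]. The doubling per retry is the same; only the starting point differs.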


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review59034
-----------------------------------------------------------


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Dominic Hamon <dh...@twopensource.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review59034
-----------------------------------------------------------



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100315>

    in the two places this is referenced, it is doubled. Please add a comment explaining why, or just store the doubled factor somewhere to make the code easier to read.



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100313>

    maybe set this on line 503 and then override if necessary. This ensures it is set and is clearer to read.



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100314>

    it looks like you dereference this everywhere, so why did it need to become a pointer? Why not a concrete instance member variable?


- Dominic Hamon


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Vinod Kone <vi...@gmail.com>.

> On Oct. 29, 2014, 2:43 a.m., Cody Maloney wrote:
> > src/sched/sched.cpp, line 506
> > <https://reviews.apache.org/r/27315/diff/1/?file=736571#file736571line506>
> >
> >     maxBackoff is never set if duration.isNone.

good catch! thanks. fixed.


> On Oct. 29, 2014, 2:43 a.m., Cody Maloney wrote:
> > include/mesos/scheduler.hpp, line 443
> > <https://reviews.apache.org/r/27315/diff/1/?file=736566#file736566line443>
> >
> >     It would be more precise to use an Optional here rather than write the rules/semantics (if I'm reading the rest of the code correctly) into a pointer.
> >     
> >     Also means we don't have to manually clean up in the destructor.

i restructured the code and this is n/a.
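
As background on the suggestion itself: the member in question is not shown in this thread, so the following is only a generic sketch of the idea of holding a "maybe present" value by value instead of through a raw pointer. It uses std::optional from C++17 as a stand-in for an Option-style type, and RetrySettings, maxBackoff and maxBackoffOr are hypothetical names invented for the example.

```cpp
#include <chrono>
#include <optional>

using Seconds = std::chrono::duration<double>;

// Hypothetical holder for an optional setting; not the actual driver class.
struct RetrySettings
{
  // Empty means "not configured". The value is stored inline, so nothing
  // has to be deleted manually in a destructor.
  std::optional<Seconds> maxBackoff;

  // Falls back to `fallback` when no value was configured.
  Seconds maxBackoffOr(Seconds fallback) const
  {
    return maxBackoff.value_or(fallback);
  }
};
```

The type then states the "may be absent" rule directly, rather than leaving readers to infer it from null checks on a pointer.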


> On Oct. 29, 2014, 2:43 a.m., Cody Maloney wrote:
> > src/sched/sched.cpp, line 519
> > <https://reviews.apache.org/r/27315/diff/1/?file=736571#file736571line519>
> >
> >     It would be nice to use <random> here (The headers at least are around in gcc 4.4). 
> >     
> >     Doing this sort of math creates significantly biased random numbers.

added a TODO for now. i can fix it here and in slave in a subsequent review.
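
For reference, the <random> alternative mentioned in the comment could look something like the sketch below. It is not the follow-up change referred to here, only an illustration: std::chrono stands in for the Duration type, and Seconds and randomDelay are hypothetical names. The generator is passed in by the caller so that it can be seeded once and reused.

```cpp
#include <chrono>
#include <random>

using Seconds = std::chrono::duration<double>;

// Draws a delay from [0, maxBackoff) using the C++11 <random> facilities
// instead of computing ::random() / RAND_MAX by hand.
template <typename Generator>
Seconds randomDelay(Seconds maxBackoff, Generator& generator)
{
  std::uniform_real_distribution<double> fraction(0.0, 1.0);
  return maxBackoff * fraction(generator);
}
```

A caller might construct a std::mt19937 seeded from std::random_device once and pass it to randomDelay on every retry.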


> On Oct. 29, 2014, 2:43 a.m., Cody Maloney wrote:
> > src/sched/sched.cpp, line 522
> > <https://reviews.apache.org/r/27315/diff/1/?file=736571#file736571line522>
> >
> >     nextDuration/durationNext would be a more descriptive variable name.

i'll leave it as is for consistency with the naming in slave.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review58951
-----------------------------------------------------------


Re: Review Request 27315: Updated scheduler driver to exponentially backoff during registration retries.

Posted by Cody Maloney <co...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27315/#review58951
-----------------------------------------------------------



include/mesos/scheduler.hpp
<https://reviews.apache.org/r/27315/#comment100173>

    It would be more precise to use an Optional here rather than write the rules/semantics (if I'm reading the rest of the code correctly) into a pointer.
    
    Also means we don't have to manually clean up in the destructor.



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100167>

    maxBackoff is never set if duration.isNone.



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100169>

    It would be nice to use <random> here (The headers at least are around in gcc 4.4). 
    
    Doing this sort of math creates significantly biased random numbers.



src/sched/sched.cpp
<https://reviews.apache.org/r/27315/#comment100170>

    nextDuration/durationNext would be a more descriptive variable name.


- Cody Maloney

