You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@mesos.apache.org by Jie Yu <yu...@gmail.com> on 2014/03/06 23:24:41 UTC

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/
-----------------------------------------------------------

(Updated March 6, 2014, 10:24 p.m.)


Review request for mesos, Benjamin Hindman and Ben Mahler.


Changes
-------

Rebased.


Bugs: MESOS-984
    https://issues.apache.org/jira/browse/MESOS-984


Repository: mesos-git


Description
-------

Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:

void checked(const Future<Metadata::Status>& future)	
{
  if (!future.isReady()) {
    promise.fail(
        future.isFailed() ?
        "Failed to get replica status: " + future.failure() :
        "Not expecting discarded future");
    terminate(self());
    return;
  }
  ...
}

Another drawback is: discarding becomes difficult and hard to reason.

Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!

Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.

This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.


Diffs (updated)
-----

  src/log/recover.cpp e611a4e 

Diff: https://reviews.apache.org/r/18584/diff/


Testing
-------

make check


Thanks,

Jie Yu

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Posted by Jie Yu <yu...@gmail.com>.


> On March 6, 2014, 11:58 p.m., Ben Mahler wrote:
> > src/log/recover.cpp, lines 132-134
> > <https://reviews.apache.org/r/18584/diff/2/?file=512999#file512999line132>
> >
> >     How do you feel about pulling the watch and broadcast apart:
> >     
> >     chain = watch()
> >       .then(defer(self(), &Self::broadcast))
> >       .then(defer(self(), &Self::receive, lambda::_1))
> >       .onAny(defer(self(), &Self::finished, lambda::_1));
> >     
> >     
> >     Then we could make '_broadcast' only do the setting
> >     
> >     Future<Nothing> broadcasted()
> >     {
> >         responses = _responses;
> >     
> >         // Reset the counters.
> >         responsesReceived.clear();
> >         lowestBeginPosition = None();
> >         highestEndPosition = None();
> >     
> >         return Nothing();
> >     }
> >     
> >     This allows 'handle' to become 'receive' and '_handle' to become 'received':
> >     
> >       Future<Option<RecoverResponse> > receive()
> >       {
> >         // Instead of using a for loop here, we use select to process
> >         // responses one after another so that we can ignore the rest if
> >         // we have collected enough responses.
> >         return select(responses)
> >           .then(defer(self(), &Self::received, lambda::_1));
> >       }
> >     
> >

Liked it. Done.


> On March 6, 2014, 11:58 p.m., Ben Mahler wrote:
> > src/log/recover.cpp, line 179
> > <https://reviews.apache.org/r/18584/diff/2/?file=512999#file512999line179>
> >
> >     Is there a race between watch() and broadcast() that can cause us to select() on an empty set and get stuck?
> >     
> >     watch() -> returns size == quorum
> >     // Now ZK blip, 0 network members.
> >     broadcast() -> returns empty set
> >     select(empty_set) -> stuck pending?
> >     
> >     Is this safe? Is there a timeout built-in at a higher layer to bail if we get stuck?

Solved.


> On March 6, 2014, 11:58 p.m., Ben Mahler wrote:
> > src/log/recover.cpp, lines 415-418
> > <https://reviews.apache.org/r/18584/diff/2/?file=512999#file512999line415>
> >
> >     Do we still need this now that finalize() is a no-op?

I prefer to add a LOG line indicating the termination of the recover process.


> On March 6, 2014, 11:58 p.m., Ben Mahler wrote:
> > src/log/recover.cpp, lines 496-511
> > <https://reviews.apache.org/r/18584/diff/2/?file=512999#file512999line496>
> >
> >     Do we need to have these helpers that implicitly set 'status' instead of just returning the Metadata::Status?
> >     
> >     It looks like we can remove the optimization in updateReplicaStatus and avoid the need to store a 'status' member.
> >     
> >     If the optimization is important, Replica::update(Metadata::Status) could avoid the write if there's no change, right?

Liked it. Done.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/#review36447
-----------------------------------------------------------


On March 6, 2014, 10:24 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18584/
> -----------------------------------------------------------
> 
> (Updated March 6, 2014, 10:24 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-984
>     https://issues.apache.org/jira/browse/MESOS-984
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:
> 
> void checked(const Future<Metadata::Status>& future)	
> {
>   if (!future.isReady()) {
>     promise.fail(
>         future.isFailed() ?
>         "Failed to get replica status: " + future.failure() :
>         "Not expecting discarded future");
>     terminate(self());
>     return;
>   }
>   ...
> }
> 
> Another drawback is: discarding becomes difficult and hard to reason.
> 
> Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!
> 
> Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.
> 
> This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.
> 
> 
> Diffs
> -----
> 
>   src/log/recover.cpp e611a4e 
> 
> Diff: https://reviews.apache.org/r/18584/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Posted by Ben Mahler <be...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/#review36447
-----------------------------------------------------------



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67422>

    How do you feel about pulling the watch and broadcast apart:
    
    chain = watch()
      .then(defer(self(), &Self::broadcast))
      .then(defer(self(), &Self::receive, lambda::_1))
      .onAny(defer(self(), &Self::finished, lambda::_1));
    
    
    Then we could make '_broadcast' only do the setting
    
    Future<Nothing> broadcasted()
    {
        responses = _responses;
    
        // Reset the counters.
        responsesReceived.clear();
        lowestBeginPosition = None();
        highestEndPosition = None();
    
        return Nothing();
    }
    
    This allows 'handle' to become 'receive' and '_handle' to become 'received':
    
      Future<Option<RecoverResponse> > receive()
      {
        // Instead of using a for loop here, we use select to process
        // responses one after another so that we can ignore the rest if
        // we have collected enough responses.
        return select(responses)
          .then(defer(self(), &Self::received, lambda::_1));
      }
    
    



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67427>

    Is there a race between watch() and broadcast() that can cause us to select() on an empty set and get stuck?
    
    watch() -> returns size == quorum
    // Now ZK blip, 0 network members.
    broadcast() -> returns empty set
    select(empty_set) -> stuck pending?
    
    Is this safe? Is there a timeout built-in at a higher layer to bail if we get stuck?



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67393>

    How about: "Starting replica recovery"



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67394>

    Do we still need this now that finalize() is a no-op?



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67406>

    Do we need to have these helpers that implicitly set 'status' instead of just returning the Metadata::Status?
    
    It looks like we can remove the optimization in updateReplicaStatus and avoid the need to store a 'status' member.
    
    If the optimization is important, Replica::update(Metadata::Status) could avoid the write if there's no change, right?



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67408>

    Can you move discard up to the beginning of the private block?


- Ben Mahler


On March 6, 2014, 10:24 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18584/
> -----------------------------------------------------------
> 
> (Updated March 6, 2014, 10:24 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-984
>     https://issues.apache.org/jira/browse/MESOS-984
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:
> 
> void checked(const Future<Metadata::Status>& future)	
> {
>   if (!future.isReady()) {
>     promise.fail(
>         future.isFailed() ?
>         "Failed to get replica status: " + future.failure() :
>         "Not expecting discarded future");
>     terminate(self());
>     return;
>   }
>   ...
> }
> 
> Another drawback is: discarding becomes difficult and hard to reason.
> 
> Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!
> 
> Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.
> 
> This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.
> 
> 
> Diffs
> -----
> 
>   src/log/recover.cpp e611a4e 
> 
> Diff: https://reviews.apache.org/r/18584/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Posted by Ben Mahler <be...@gmail.com>.


> On March 7, 2014, 7:45 p.m., Ben Mahler wrote:
> > src/log/recover.cpp, lines 387-389
> > <https://reviews.apache.org/r/18584/diff/4/?file=513745#file513745line387>
> >
> >     I'm wondering if we can flatten the continuation chain here:
> >     
> >     Current:
> >     recover
> >         runRecoverProtocol
> >         checkRecoverProtocol
> >             updateReplicaStatus(RECOVERING)
> >             catchup
> >                 getOwnership
> >                 updateReplicaStatus(VOTING)
> >     
> >     It's a bit tricky because "checking the recover protocol" implies "updating the replica state" and doing "catchup" when necessary.
> >     
> >     A flatter approach could be:
> >     recover
> >         runRecoverProtocol
> >         checkRecoverProtocol
> >         updateReplicaStatus(RECOVERING)
> >         catchup
> >             getOwnership
> >         updateReplicaStatus(VOTING)
> >     
> >     The unfortunate part of this alternative approach is that we can no longer pass in 'begin' and 'end' to catchup. We could store an Option<Interval> to capture the catchup interval in checkRecoverProtocol() and CHECK_SOME(interval) in catchup().
> >     
> >     
> >     The chain would then look like this:
> >     
> >     recover()
> >     {
> >         
> >         if (status == Metadata::VOTING) {
> >           // No need to do recovery.
> >           return Nothing();
> >         } else {
> >           return runRecoverProtocol(quorum, network)
> >             .then(defer(self(), &Self::checkRecoverProtocol, lambda::_1))
> >             .then(defer(self(), &Self::updateReplicaStatus, Metadata::RECOVERING))
> >             .then(defer(self(), &Self::catchup))
> >             .then(defer(self(), &Self::updateReplicaStatus, Metadata::VOTING));
> >         }
> >     }
> >     
> >     And catchup takes no arguments:
> >     
> >       Future<Nothing> catchup()
> >       {
> >         CHECK_SOME(interval);
> >     
> >         CHECK_LT(interval.get().lower(), interval.get().upper());
> >     
> >         LOG(INFO) << "Starting catch-up from position ["
> >                   << interval.get().lower() << " to "
> >                   << interval.get().upper() << ")";
> >     
> >         IntervalSet<uint64_t> positions;
> >         positions += interval.get();
> >     
> >         ...
> >       }
> >     
> >     The flatter style might be a bit clearer since we can see the replica status changes in the outer change, and "checking" the recover protocol result is now a pure "check" instead of a "check" and a replica status change. Now, catchup() also doesn't perform an implicit status update change. On the other hand, having to store the 'interval' is unfortunate.
> >     
> >     I'll leave this up to you, since both approaches have their tradeoffs.

Okay, after chatting with Jie, this may not work so well with r/18600. So changing s/checkRecoverProtocol/_recover/ is more clear about the fact that recovery is taking place inside this method.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/#review36553
-----------------------------------------------------------


On March 7, 2014, 5:20 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18584/
> -----------------------------------------------------------
> 
> (Updated March 7, 2014, 5:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-984
>     https://issues.apache.org/jira/browse/MESOS-984
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:
> 
> void checked(const Future<Metadata::Status>& future)	
> {
>   if (!future.isReady()) {
>     promise.fail(
>         future.isFailed() ?
>         "Failed to get replica status: " + future.failure() :
>         "Not expecting discarded future");
>     terminate(self());
>     return;
>   }
>   ...
> }
> 
> Another drawback is: discarding becomes difficult and hard to reason.
> 
> Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!
> 
> Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.
> 
> This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.
> 
> 
> Diffs
> -----
> 
>   src/log/recover.cpp e611a4e 
> 
> Diff: https://reviews.apache.org/r/18584/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Posted by Ben Mahler <be...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/#review36553
-----------------------------------------------------------


Looks good, if you end up changing the chain in RecoverProcess we should do another pass, let me know which approach you think is nicer.


src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67556>

    CHECK_SOME



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67569>

    I'm wondering if we can flatten the continuation chain here:
    
    Current:
    recover
        runRecoverProtocol
        checkRecoverProtocol
            updateReplicaStatus(RECOVERING)
            catchup
                getOwnership
                updateReplicaStatus(VOTING)
    
    It's a bit tricky because "checking the recover protocol" implies "updating the replica state" and doing "catchup" when necessary.
    
    A flatter approach could be:
    recover
        runRecoverProtocol
        checkRecoverProtocol
        updateReplicaStatus(RECOVERING)
        catchup
            getOwnership
        updateReplicaStatus(VOTING)
    
    The unfortunate part of this alternative approach is that we can no longer pass in 'begin' and 'end' to catchup. We could store an Option<Interval> to capture the catchup interval in checkRecoverProtocol() and CHECK_SOME(interval) in catchup().
    
    
    The chain would then look like this:
    
    recover()
    {
        
        if (status == Metadata::VOTING) {
          // No need to do recovery.
          return Nothing();
        } else {
          return runRecoverProtocol(quorum, network)
            .then(defer(self(), &Self::checkRecoverProtocol, lambda::_1))
            .then(defer(self(), &Self::updateReplicaStatus, Metadata::RECOVERING))
            .then(defer(self(), &Self::catchup))
            .then(defer(self(), &Self::updateReplicaStatus, Metadata::VOTING));
        }
    }
    
    And catchup takes no arguments:
    
      Future<Nothing> catchup()
      {
        CHECK_SOME(interval);
    
        CHECK_LT(interval.get().lower(), interval.get().upper());
    
        LOG(INFO) << "Starting catch-up from position ["
                  << interval.get().lower() << " to "
                  << interval.get().upper() << ")";
    
        IntervalSet<uint64_t> positions;
        positions += interval.get();
    
        ...
      }
    
    The flatter style might be a bit clearer since we can see the replica status changes in the outer change, and "checking" the recover protocol result is now a pure "check" instead of a "check" and a replica status change. Now, catchup() also doesn't perform an implicit status update change. On the other hand, having to store the 'interval' is unfortunate.
    
    I'll leave this up to you, since both approaches have their tradeoffs.



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67560>

    Looks like this will fit on one line now?


- Ben Mahler


On March 7, 2014, 5:20 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18584/
> -----------------------------------------------------------
> 
> (Updated March 7, 2014, 5:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-984
>     https://issues.apache.org/jira/browse/MESOS-984
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:
> 
> void checked(const Future<Metadata::Status>& future)	
> {
>   if (!future.isReady()) {
>     promise.fail(
>         future.isFailed() ?
>         "Failed to get replica status: " + future.failure() :
>         "Not expecting discarded future");
>     terminate(self());
>     return;
>   }
>   ...
> }
> 
> Another drawback is: discarding becomes difficult and hard to reason.
> 
> Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!
> 
> Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.
> 
> This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.
> 
> 
> Diffs
> -----
> 
>   src/log/recover.cpp e611a4e 
> 
> Diff: https://reviews.apache.org/r/18584/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Posted by Ben Mahler <be...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/#review36571
-----------------------------------------------------------

Ship it!


Ship It!

- Ben Mahler


On March 7, 2014, 5:20 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18584/
> -----------------------------------------------------------
> 
> (Updated March 7, 2014, 5:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-984
>     https://issues.apache.org/jira/browse/MESOS-984
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:
> 
> void checked(const Future<Metadata::Status>& future)	
> {
>   if (!future.isReady()) {
>     promise.fail(
>         future.isFailed() ?
>         "Failed to get replica status: " + future.failure() :
>         "Not expecting discarded future");
>     terminate(self());
>     return;
>   }
>   ...
> }
> 
> Another drawback is: discarding becomes difficult and hard to reason.
> 
> Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!
> 
> Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.
> 
> This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.
> 
> 
> Diffs
> -----
> 
>   src/log/recover.cpp e611a4e 
> 
> Diff: https://reviews.apache.org/r/18584/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Posted by Jie Yu <yu...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/
-----------------------------------------------------------

(Updated March 7, 2014, 8:35 p.m.)


Review request for mesos, Benjamin Hindman and Ben Mahler.


Changes
-------

BenM's.


Bugs: MESOS-984
    https://issues.apache.org/jira/browse/MESOS-984


Repository: mesos-git


Description
-------

Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:

void checked(const Future<Metadata::Status>& future)	
{
  if (!future.isReady()) {
    promise.fail(
        future.isFailed() ?
        "Failed to get replica status: " + future.failure() :
        "Not expecting discarded future");
    terminate(self());
    return;
  }
  ...
}

Another drawback is: discarding becomes difficult and hard to reason.

Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!

Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.

This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.


Diffs (updated)
-----

  src/log/recover.cpp e611a4e 

Diff: https://reviews.apache.org/r/18584/diff/


Testing
-------

make check


Thanks,

Jie Yu

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Posted by Jie Yu <yu...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/
-----------------------------------------------------------

(Updated March 7, 2014, 5:20 p.m.)


Review request for mesos, Benjamin Hindman and Ben Mahler.


Changes
-------

Removed unneeded member field.


Bugs: MESOS-984
    https://issues.apache.org/jira/browse/MESOS-984


Repository: mesos-git


Description
-------

Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:

void checked(const Future<Metadata::Status>& future)	
{
  if (!future.isReady()) {
    promise.fail(
        future.isFailed() ?
        "Failed to get replica status: " + future.failure() :
        "Not expecting discarded future");
    terminate(self());
    return;
  }
  ...
}

Another drawback is: discarding becomes difficult and hard to reason.

Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!

Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.

This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.


Diffs (updated)
-----

  src/log/recover.cpp e611a4e 

Diff: https://reviews.apache.org/r/18584/diff/


Testing
-------

make check


Thanks,

Jie Yu

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Posted by Jie Yu <yu...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/
-----------------------------------------------------------

(Updated March 7, 2014, 5:39 a.m.)


Review request for mesos, Benjamin Hindman and Ben Mahler.


Changes
-------

BenM's comments.


Bugs: MESOS-984
    https://issues.apache.org/jira/browse/MESOS-984


Repository: mesos-git


Description
-------

Previously, we use onAny to connect most of the steps. As a result, we see a lot of the error checking code like the following:

void checked(const Future<Metadata::Status>& future)	
{
  if (!future.isReady()) {
    promise.fail(
        future.isFailed() ?
        "Failed to get replica status: " + future.failure() :
        "Not expecting discarded future");
    terminate(self());
    return;
  }
  ...
}

Another drawback is: discarding becomes difficult and hard to reason.

Now, I cleaned up the code to use the continuation style (i.e., chaining using Future.then). The code becomes much more clear now!

Also, I factored out the recover protocol part from the log recover process to make the log recover process less cumbersome.

This patch does not change any semantics, but is for MESOS-984 (log auto initialization for registrar). Will follow up with a patch to integrate the log auto initialization based on this patch.


Diffs (updated)
-----

  src/log/recover.cpp e611a4e 

Diff: https://reviews.apache.org/r/18584/diff/


Testing
-------

make check


Thanks,

Jie Yu