You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Jie Yu <yu...@gmail.com> on 2013/10/25 23:51:44 UTC

Review Request 14946: Catch-up Replicated Log 3: added recover process.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/
-----------------------------------------------------------

Review request for mesos and Benjamin Hindman.


Bugs: MESOS-736
    https://issues.apache.org/jira/browse/MESOS-736


Repository: mesos-git


Description
-------

This is the last patch of a series of patches that implement catch-up replicated log.

Here is summary of this patch:
1) Introduced RecoverRequest/RecoverResponse in log.proto.
2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
3) A 2-phase empty log start-up algorithm.
4) Added a test to test two competing recover processes.

This is a joint work with Yan Xu.


Diffs
-----

  src/Makefile.am a2d8242 
  src/log/log.hpp 77edc7a 
  src/log/recover.hpp PRE-CREATION 
  src/log/recover.cpp PRE-CREATION 
  src/log/replica.hpp d1f5ead 
  src/log/replica.cpp 59a6ff3 
  src/messages/log.proto 3d5859f 
  src/tests/log_tests.cpp ff5f86c 

Diff: https://reviews.apache.org/r/14946/diff/


Testing
-------

bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100


Thanks,

Jie Yu


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.
Sounds good. I'll do it today.


On Wed, Nov 6, 2013 at 11:21 AM, Benjamin Hindman <be...@berkeley.edu> wrote:

> How about we do:
>
> Shared<T> share(); -- in Owned<T>
>
> and maybe even rename 'upgrade()' to 'own()' in Shared<T>!
>
> Ben.
>
>
> On Wed, Nov 6, 2013 at 9:17 AM, Jie Yu <yu...@gmail.com> wrote:
>
>>
>>
>> > On Nov. 6, 2013, 6:48 a.m., Benjamin Hindman wrote:
>> > > src/messages/log.proto, line 186
>> > > <
>> https://reviews.apache.org/r/14946/diff/6/?file=377561#file377561line186>
>> > >
>> > >     I think reusing the RecoverRequest for both the network broadcast
>> and a RecoverProcess adds unnecessarily complexity. Why can't the
>> RecoverProcess just use it's Shared<Replica> to explicitly ask the Replica
>> to set update it's metadata?
>> >
>> > Jie Yu wrote:
>> >     That was my initial solution. The problem is that you need to mark
>> setState (or update) as a const function, which is OK because Replica is a
>> wrapper anyway, but is not intuitive. What do you think?
>> >
>> >     Or we may want to introduce a WritableShared<T>(const Shared<T>&
>> shared) to allow writing?
>> >
>> > Benjamin Hindman wrote:
>> >     I guess I'm still unclear about the recovering semantics in the
>> first place. What does it mean to have both a RecoverProcess and a
>> CoordinatorProcess sharing a replica simultaneously? Can we not give the
>> RecoverProcess an Owned<Replica> and then after it completes make it shared?
>>
>> Oh! Haven't thought about in this way. Yeah, that should work.
>>
>> We either need to add
>> Shared(const Owned<T>& owned);  -- in Shared<T>
>>
>> or
>> Shared<T> downgrade(); -- in Owned<T>
>>
>> Which one do you prefer?
>>
>>
>> - Jie
>>
>>
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/14946/#review28259
>> -----------------------------------------------------------
>>
>>
>> On Nov. 5, 2013, 12:52 a.m., Jie Yu wrote:
>> >
>> > -----------------------------------------------------------
>> > This is an automatically generated e-mail. To reply, visit:
>> > https://reviews.apache.org/r/14946/
>> > -----------------------------------------------------------
>> >
>> > (Updated Nov. 5, 2013, 12:52 a.m.)
>> >
>> >
>> > Review request for mesos and Benjamin Hindman.
>> >
>> >
>> > Bugs: MESOS-736
>> >     https://issues.apache.org/jira/browse/MESOS-736
>> >
>> >
>> > Repository: mesos-git
>> >
>> >
>> > Description
>> > -------
>> >
>> > This is the last patch of a series of patches that implement catch-up
>> replicated log.
>> >
>> > Here is summary of this patch:
>> > 1) Introduced RecoverRequest/RecoverResponse in log.proto.
>> > 2) Replaced Promise with ReplicaInfo in log.proto as we need persist
>> recovery information (maintain backwards compatibility).
>> > 3) A 2-phase empty log start-up algorithm.
>> > 4) Added a test to test two competing recover processes.
>> >
>> > This is a joint work with Yan Xu.
>> >
>> >
>> > Diffs
>> > -----
>> >
>> >   src/Makefile.am 9780d07
>> >   src/log/log.hpp 77edc7a
>> >   src/log/recover.hpp PRE-CREATION
>> >   src/log/recover.cpp PRE-CREATION
>> >   src/log/replica.hpp d1f5ead
>> >   src/log/replica.cpp 59a6ff3
>> >   src/messages/log.proto 3d5859f
>> >   src/tests/log_tests.cpp ff5f86c
>> >
>> > Diff: https://reviews.apache.org/r/14946/diff/
>> >
>> >
>> > Testing
>> > -------
>> >
>> > bin/mesos-tests.sh
>> --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100
>> >
>> >
>> > Thanks,
>> >
>> > Jie Yu
>> >
>> >
>>
>>
>

Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Benjamin Hindman <be...@berkeley.edu>.
How about we do:

Shared<T> share(); -- in Owned<T>

and maybe even rename 'upgrade()' to 'own()' in Shared<T>!

Ben.


On Wed, Nov 6, 2013 at 9:17 AM, Jie Yu <yu...@gmail.com> wrote:

>
>
> > On Nov. 6, 2013, 6:48 a.m., Benjamin Hindman wrote:
> > > src/messages/log.proto, line 186
> > > <
> https://reviews.apache.org/r/14946/diff/6/?file=377561#file377561line186>
> > >
> > >     I think reusing the RecoverRequest for both the network broadcast
> and a RecoverProcess adds unnecessarily complexity. Why can't the
> RecoverProcess just use it's Shared<Replica> to explicitly ask the Replica
> to set update it's metadata?
> >
> > Jie Yu wrote:
> >     That was my initial solution. The problem is that you need to mark
> setState (or update) as a const function, which is OK because Replica is a
> wrapper anyway, but is not intuitive. What do you think?
> >
> >     Or we may want to introduce a WritableShared<T>(const Shared<T>&
> shared) to allow writing?
> >
> > Benjamin Hindman wrote:
> >     I guess I'm still unclear about the recovering semantics in the
> first place. What does it mean to have both a RecoverProcess and a
> CoordinatorProcess sharing a replica simultaneously? Can we not give the
> RecoverProcess an Owned<Replica> and then after it completes make it shared?
>
> Oh! Haven't thought about in this way. Yeah, that should work.
>
> We either need to add
> Shared(const Owned<T>& owned);  -- in Shared<T>
>
> or
> Shared<T> downgrade(); -- in Owned<T>
>
> Which one do you prefer?
>
>
> - Jie
>
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14946/#review28259
> -----------------------------------------------------------
>
>
> On Nov. 5, 2013, 12:52 a.m., Jie Yu wrote:
> >
> > -----------------------------------------------------------
> > This is an automatically generated e-mail. To reply, visit:
> > https://reviews.apache.org/r/14946/
> > -----------------------------------------------------------
> >
> > (Updated Nov. 5, 2013, 12:52 a.m.)
> >
> >
> > Review request for mesos and Benjamin Hindman.
> >
> >
> > Bugs: MESOS-736
> >     https://issues.apache.org/jira/browse/MESOS-736
> >
> >
> > Repository: mesos-git
> >
> >
> > Description
> > -------
> >
> > This is the last patch of a series of patches that implement catch-up
> replicated log.
> >
> > Here is summary of this patch:
> > 1) Introduced RecoverRequest/RecoverResponse in log.proto.
> > 2) Replaced Promise with ReplicaInfo in log.proto as we need persist
> recovery information (maintain backwards compatibility).
> > 3) A 2-phase empty log start-up algorithm.
> > 4) Added a test to test two competing recover processes.
> >
> > This is a joint work with Yan Xu.
> >
> >
> > Diffs
> > -----
> >
> >   src/Makefile.am 9780d07
> >   src/log/log.hpp 77edc7a
> >   src/log/recover.hpp PRE-CREATION
> >   src/log/recover.cpp PRE-CREATION
> >   src/log/replica.hpp d1f5ead
> >   src/log/replica.cpp 59a6ff3
> >   src/messages/log.proto 3d5859f
> >   src/tests/log_tests.cpp ff5f86c
> >
> > Diff: https://reviews.apache.org/r/14946/diff/
> >
> >
> > Testing
> > -------
> >
> > bin/mesos-tests.sh
> --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100
> >
> >
> > Thanks,
> >
> > Jie Yu
> >
> >
>
>

Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.

> On Nov. 6, 2013, 6:48 a.m., Benjamin Hindman wrote:
> > src/messages/log.proto, line 186
> > <https://reviews.apache.org/r/14946/diff/6/?file=377561#file377561line186>
> >
> >     I think reusing the RecoverRequest for both the network broadcast and a RecoverProcess adds unnecessarily complexity. Why can't the RecoverProcess just use it's Shared<Replica> to explicitly ask the Replica to set update it's metadata?
> 
> Jie Yu wrote:
>     That was my initial solution. The problem is that you need to mark setState (or update) as a const function, which is OK because Replica is a wrapper anyway, but is not intuitive. What do you think?
>     
>     Or we may want to introduce a WritableShared<T>(const Shared<T>& shared) to allow writing?
> 
> Benjamin Hindman wrote:
>     I guess I'm still unclear about the recovering semantics in the first place. What does it mean to have both a RecoverProcess and a CoordinatorProcess sharing a replica simultaneously? Can we not give the RecoverProcess an Owned<Replica> and then after it completes make it shared?

Oh! Haven't thought about in this way. Yeah, that should work.

We either need to add
Shared(const Owned<T>& owned);  -- in Shared<T>

or
Shared<T> downgrade(); -- in Owned<T>

Which one do you prefer?


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/#review28259
-----------------------------------------------------------


On Nov. 5, 2013, 12:52 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14946/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 12:52 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-736
>     https://issues.apache.org/jira/browse/MESOS-736
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This is the last patch of a series of patches that implement catch-up replicated log.
> 
> Here is summary of this patch:
> 1) Introduced RecoverRequest/RecoverResponse in log.proto.
> 2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
> 3) A 2-phase empty log start-up algorithm.
> 4) Added a test to test two competing recover processes.
> 
> This is a joint work with Yan Xu.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 9780d07 
>   src/log/log.hpp 77edc7a 
>   src/log/recover.hpp PRE-CREATION 
>   src/log/recover.cpp PRE-CREATION 
>   src/log/replica.hpp d1f5ead 
>   src/log/replica.cpp 59a6ff3 
>   src/messages/log.proto 3d5859f 
>   src/tests/log_tests.cpp ff5f86c 
> 
> Diff: https://reviews.apache.org/r/14946/diff/
> 
> 
> Testing
> -------
> 
> bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.

> On Nov. 6, 2013, 6:48 a.m., Benjamin Hindman wrote:
> > src/log/replica.hpp, line 58
> > <https://reviews.apache.org/r/14946/diff/6/?file=377559#file377559line58>
> >
> >     Hmm, what do you mean by backwards compatibility here? Most old users will want to turn strict on, and most new users will want to have it on by default and they'll likely miss it since it's a defaulted parameter. What's the issue in making strict = true by default?

Hah, I have another separate patch which will enable the 'strict' mode by default. I will merge that with this patch.


> On Nov. 6, 2013, 6:48 a.m., Benjamin Hindman wrote:
> > src/log/replica.cpp, lines 1209-1222
> > <https://reviews.apache.org/r/14946/diff/6/?file=377560#file377560line1209>
> >
> >     Why can't the default metadata status just be EMPTY? Then no need to set it explicitly right?

This is for backward compatibility concern. Suppose a replica has the old format (no meta data, only the promise is stored), I think we should to set its state to VOTING by default. Thoughts?


> On Nov. 6, 2013, 6:48 a.m., Benjamin Hindman wrote:
> > src/messages/log.proto, line 186
> > <https://reviews.apache.org/r/14946/diff/6/?file=377561#file377561line186>
> >
> >     I think reusing the RecoverRequest for both the network broadcast and a RecoverProcess adds unnecessarily complexity. Why can't the RecoverProcess just use it's Shared<Replica> to explicitly ask the Replica to set update it's metadata?

That was my initial solution. The problem is that you need to mark setState (or update) as a const function, which is OK because Replica is a wrapper anyway, but is not intuitive. What do you think?

Or we may want to introduce a WritableShared<T>(const Shared<T>& shared) to allow writing?


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/#review28259
-----------------------------------------------------------


On Nov. 5, 2013, 12:52 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14946/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 12:52 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-736
>     https://issues.apache.org/jira/browse/MESOS-736
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This is the last patch of a series of patches that implement catch-up replicated log.
> 
> Here is summary of this patch:
> 1) Introduced RecoverRequest/RecoverResponse in log.proto.
> 2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
> 3) A 2-phase empty log start-up algorithm.
> 4) Added a test to test two competing recover processes.
> 
> This is a joint work with Yan Xu.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 9780d07 
>   src/log/log.hpp 77edc7a 
>   src/log/recover.hpp PRE-CREATION 
>   src/log/recover.cpp PRE-CREATION 
>   src/log/replica.hpp d1f5ead 
>   src/log/replica.cpp 59a6ff3 
>   src/messages/log.proto 3d5859f 
>   src/tests/log_tests.cpp ff5f86c 
> 
> Diff: https://reviews.apache.org/r/14946/diff/
> 
> 
> Testing
> -------
> 
> bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Benjamin Hindman <be...@berkeley.edu>.

> On Nov. 6, 2013, 6:48 a.m., Benjamin Hindman wrote:
> > src/messages/log.proto, line 186
> > <https://reviews.apache.org/r/14946/diff/6/?file=377561#file377561line186>
> >
> >     I think reusing the RecoverRequest for both the network broadcast and a RecoverProcess adds unnecessarily complexity. Why can't the RecoverProcess just use it's Shared<Replica> to explicitly ask the Replica to set update it's metadata?
> 
> Jie Yu wrote:
>     That was my initial solution. The problem is that you need to mark setState (or update) as a const function, which is OK because Replica is a wrapper anyway, but is not intuitive. What do you think?
>     
>     Or we may want to introduce a WritableShared<T>(const Shared<T>& shared) to allow writing?

I guess I'm still unclear about the recovering semantics in the first place. What does it mean to have both a RecoverProcess and a CoordinatorProcess sharing a replica simultaneously? Can we not give the RecoverProcess an Owned<Replica> and then after it completes make it shared?


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/#review28259
-----------------------------------------------------------


On Nov. 5, 2013, 12:52 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14946/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 12:52 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-736
>     https://issues.apache.org/jira/browse/MESOS-736
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This is the last patch of a series of patches that implement catch-up replicated log.
> 
> Here is summary of this patch:
> 1) Introduced RecoverRequest/RecoverResponse in log.proto.
> 2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
> 3) A 2-phase empty log start-up algorithm.
> 4) Added a test to test two competing recover processes.
> 
> This is a joint work with Yan Xu.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 9780d07 
>   src/log/log.hpp 77edc7a 
>   src/log/recover.hpp PRE-CREATION 
>   src/log/recover.cpp PRE-CREATION 
>   src/log/replica.hpp d1f5ead 
>   src/log/replica.cpp 59a6ff3 
>   src/messages/log.proto 3d5859f 
>   src/tests/log_tests.cpp ff5f86c 
> 
> Diff: https://reviews.apache.org/r/14946/diff/
> 
> 
> Testing
> -------
> 
> bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Benjamin Hindman <be...@berkeley.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/#review28259
-----------------------------------------------------------


Great stuff here but I think there are still come design decisions to address. I didn't spend too much time looking into the RecoverProcess just yet, I'll do that in a second pass after we address these comments first. Awesome!


src/log/log.hpp
<https://reviews.apache.org/r/14946/#comment55017>

    We need some more comments when a log shouldn't be in strict mode. And also why it's not in strict mode by default.



src/log/log.hpp
<https://reviews.apache.org/r/14946/#comment55018>

    So I don't fully understand the semantics here. What if I try and write to the log while the local replica is recovering? Do we need to gate other log operations on recovery? See the Registrar for examples of this to understand what I mean by "gate".



src/log/recover.cpp
<https://reviews.apache.org/r/14946/#comment55024>

    Okay, this is cheating and you know it! ;)
    
    Given the unclear semantics around recovery in Log anyway, I'm not sure that we have to do this. I think we should clean up those semantics first.



src/log/replica.hpp
<https://reviews.apache.org/r/14946/#comment55004>

    Hmm, what do you mean by backwards compatibility here? Most old users will want to turn strict on, and most new users will want to have it on by default and they'll likely miss it since it's a defaulted parameter. What's the issue in making strict = true by default?



src/log/replica.cpp
<https://reviews.apache.org/r/14946/#comment55012>

    s/meta data/metadata/



src/log/replica.cpp
<https://reviews.apache.org/r/14946/#comment55020>

    If I'm reading this correctly both the 'persist(const ReplicaState&)' and 'persist(uint64_t)' functions are only used in 'setState' and 'setPromised' respectively. Let's just inline them now (less functions). In addition, just like the "persist" naming, how about renaming setState and setPromised to 'update' overloaded by argument type? Then in the code we'd see things like 'update(proposal)' and 'update(state)'. Basically the 'update' family of functions will correspond to updating the metadata.



src/log/replica.cpp
<https://reviews.apache.org/r/14946/#comment55013>

    I'd probably still use the assignment operator here just because it's less abrupt with the surrounding code.



src/log/replica.cpp
<https://reviews.apache.org/r/14946/#comment55014>

    s/mater/matter/



src/log/replica.cpp
<https://reviews.apache.org/r/14946/#comment55015>

    s/log/replica/



src/log/replica.cpp
<https://reviews.apache.org/r/14946/#comment55016>

    Why can't the default metadata status just be EMPTY? Then no need to set it explicitly right?



src/messages/log.proto
<https://reviews.apache.org/r/14946/#comment55007>

    How about s/NORMAL/VOTING/. Also, see my comments below but I think we should embed this enum and then kill the 'REPLICA_' prefixes.



src/messages/log.proto
<https://reviews.apache.org/r/14946/#comment55003>

    s/meta data/metadata/



src/messages/log.proto
<https://reviews.apache.org/r/14946/#comment55005>

    How about s/ReplicaInfo/Metadata/? This is more inline with things like 'Promise' too.



src/messages/log.proto
<https://reviews.apache.org/r/14946/#comment55006>

    Let's move the ReplicaState enum as an embedded message. See things like Value::Type for other examples of this. Also, I'd like to call it 'Status' instead since state is really overloaded throughout all of this code. ;)
    
    message Metadata {
      enum Status {
        VOTING = 1;
        ..
      }
      required Status status = 1 [default = EMPTY];
    }
    
    This will let you address things cleanly, for example: Metadata::Status::VOTING.
    
    Also, I added a default, not sure if that is correct or not.



src/messages/log.proto
<https://reviews.apache.org/r/14946/#comment55009>

    // DEPRECATED!



src/messages/log.proto
<https://reviews.apache.org/r/14946/#comment55008>

    s/REPLICA_INFO/METADATA/



src/messages/log.proto
<https://reviews.apache.org/r/14946/#comment55010>

    // DEPRECATED!



src/messages/log.proto
<https://reviews.apache.org/r/14946/#comment55011>

    I think reusing the RecoverRequest for both the network broadcast and a RecoverProcess adds unnecessarily complexity. Why can't the RecoverProcess just use it's Shared<Replica> to explicitly ask the Replica to set update it's metadata?


- Benjamin Hindman


On Nov. 5, 2013, 12:52 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14946/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 12:52 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-736
>     https://issues.apache.org/jira/browse/MESOS-736
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This is the last patch of a series of patches that implement catch-up replicated log.
> 
> Here is summary of this patch:
> 1) Introduced RecoverRequest/RecoverResponse in log.proto.
> 2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
> 3) A 2-phase empty log start-up algorithm.
> 4) Added a test to test two competing recover processes.
> 
> This is a joint work with Yan Xu.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 9780d07 
>   src/log/log.hpp 77edc7a 
>   src/log/recover.hpp PRE-CREATION 
>   src/log/recover.cpp PRE-CREATION 
>   src/log/replica.hpp d1f5ead 
>   src/log/replica.cpp 59a6ff3 
>   src/messages/log.proto 3d5859f 
>   src/tests/log_tests.cpp ff5f86c 
> 
> Diff: https://reviews.apache.org/r/14946/diff/
> 
> 
> Testing
> -------
> 
> bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/#review29315
-----------------------------------------------------------


Split into two patches:
1) https://reviews.apache.org/r/15799/ (3.1)
2) https://reviews.apache.org/r/15802/ (3.2)

- Jie Yu


On Nov. 13, 2013, 8:40 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14946/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2013, 8:40 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-736
>     https://issues.apache.org/jira/browse/MESOS-736
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This is the last patch of a series of patches that implement catch-up replicated log.
> 
> Here is summary of this patch:
> 1) Introduced RecoverRequest/RecoverResponse in log.proto.
> 2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
> 3) A 2-phase empty log start-up algorithm.
> 4) Added a test to test two competing recover processes.
> 
> This is a joint work with Yan Xu.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 52244dd 
>   src/log/log.hpp 77edc7a 
>   src/log/recover.hpp PRE-CREATION 
>   src/log/recover.cpp PRE-CREATION 
>   src/log/replica.hpp d1f5ead 
>   src/log/replica.cpp 59a6ff3 
>   src/messages/log.proto 3d5859f 
>   src/tests/log_tests.cpp ff5f86c 
> 
> Diff: https://reviews.apache.org/r/14946/diff/
> 
> 
> Testing
> -------
> 
> bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/
-----------------------------------------------------------

(Updated Nov. 13, 2013, 8:40 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated "depends on".


Bugs: MESOS-736
    https://issues.apache.org/jira/browse/MESOS-736


Repository: mesos-git


Description
-------

This is the last patch of a series of patches that implement catch-up replicated log.

Here is summary of this patch:
1) Introduced RecoverRequest/RecoverResponse in log.proto.
2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
3) A 2-phase empty log start-up algorithm.
4) Added a test to test two competing recover processes.

This is a joint work with Yan Xu.


Diffs
-----

  src/Makefile.am 52244dd 
  src/log/log.hpp 77edc7a 
  src/log/recover.hpp PRE-CREATION 
  src/log/recover.cpp PRE-CREATION 
  src/log/replica.hpp d1f5ead 
  src/log/replica.cpp 59a6ff3 
  src/messages/log.proto 3d5859f 
  src/tests/log_tests.cpp ff5f86c 

Diff: https://reviews.apache.org/r/14946/diff/


Testing
-------

bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100


Thanks,

Jie Yu


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/
-----------------------------------------------------------

(Updated Nov. 13, 2013, 8:36 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Addressed BenH's comments.


Bugs: MESOS-736
    https://issues.apache.org/jira/browse/MESOS-736


Repository: mesos-git


Description
-------

This is the last patch of a series of patches that implement catch-up replicated log.

Here is summary of this patch:
1) Introduced RecoverRequest/RecoverResponse in log.proto.
2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
3) A 2-phase empty log start-up algorithm.
4) Added a test to test two competing recover processes.

This is a joint work with Yan Xu.


Diffs (updated)
-----

  src/Makefile.am 52244dd 
  src/log/log.hpp 77edc7a 
  src/log/recover.hpp PRE-CREATION 
  src/log/recover.cpp PRE-CREATION 
  src/log/replica.hpp d1f5ead 
  src/log/replica.cpp 59a6ff3 
  src/messages/log.proto 3d5859f 
  src/tests/log_tests.cpp ff5f86c 

Diff: https://reviews.apache.org/r/14946/diff/


Testing
-------

bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100


Thanks,

Jie Yu


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/
-----------------------------------------------------------

(Updated Nov. 5, 2013, 12:52 a.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Rebase.


Bugs: MESOS-736
    https://issues.apache.org/jira/browse/MESOS-736


Repository: mesos-git


Description
-------

This is the last patch of a series of patches that implement catch-up replicated log.

Here is summary of this patch:
1) Introduced RecoverRequest/RecoverResponse in log.proto.
2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
3) A 2-phase empty log start-up algorithm.
4) Added a test to test two competing recover processes.

This is a joint work with Yan Xu.


Diffs (updated)
-----

  src/Makefile.am 9780d07 
  src/log/log.hpp 77edc7a 
  src/log/recover.hpp PRE-CREATION 
  src/log/recover.cpp PRE-CREATION 
  src/log/replica.hpp d1f5ead 
  src/log/replica.cpp 59a6ff3 
  src/messages/log.proto 3d5859f 
  src/tests/log_tests.cpp ff5f86c 

Diff: https://reviews.apache.org/r/14946/diff/


Testing
-------

bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100


Thanks,

Jie Yu


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/
-----------------------------------------------------------

(Updated Nov. 1, 2013, 10:04 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Rebase. Added some tests.

Tests:
bin/mesos-tests.sh --gtest_filter=LogTest.*:ReplicaTest.*:CoordinatorTest.*:LogZooKeeperTest.*:RecovoverTest.* --gtest_repeat=100


Bugs: MESOS-736
    https://issues.apache.org/jira/browse/MESOS-736


Repository: mesos-git


Description
-------

This is the last patch of a series of patches that implement catch-up replicated log.

Here is summary of this patch:
1) Introduced RecoverRequest/RecoverResponse in log.proto.
2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
3) A 2-phase empty log start-up algorithm.
4) Added a test to test two competing recover processes.

This is a joint work with Yan Xu.


Diffs (updated)
-----

  src/Makefile.am a11c76b 
  src/log/log.hpp 77edc7a 
  src/log/recover.hpp PRE-CREATION 
  src/log/recover.cpp PRE-CREATION 
  src/log/replica.hpp d1f5ead 
  src/log/replica.cpp 59a6ff3 
  src/messages/log.proto 3d5859f 
  src/tests/log_tests.cpp ff5f86c 

Diff: https://reviews.apache.org/r/14946/diff/


Testing
-------

bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100


Thanks,

Jie Yu


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/
-----------------------------------------------------------

(Updated Oct. 29, 2013, 5:38 a.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Added a name for the recover process.


Bugs: MESOS-736
    https://issues.apache.org/jira/browse/MESOS-736


Repository: mesos-git


Description
-------

This is the last patch of a series of patches that implement catch-up replicated log.

Here is summary of this patch:
1) Introduced RecoverRequest/RecoverResponse in log.proto.
2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
3) A 2-phase empty log start-up algorithm.
4) Added a test to test two competing recover processes.

This is a joint work with Yan Xu.


Diffs (updated)
-----

  src/Makefile.am 0b32d74 
  src/log/log.hpp 77edc7a 
  src/log/recover.hpp PRE-CREATION 
  src/log/recover.cpp PRE-CREATION 
  src/log/replica.hpp d1f5ead 
  src/log/replica.cpp 59a6ff3 
  src/messages/log.proto 3d5859f 
  src/tests/log_tests.cpp ff5f86c 

Diff: https://reviews.apache.org/r/14946/diff/


Testing
-------

bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100


Thanks,

Jie Yu


Re: Review Request 14946: Catch-up Replicated Log 3: added recover process.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14946/
-----------------------------------------------------------

(Updated Oct. 26, 2013, 12:18 a.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Fixed a typo on test name.


Bugs: MESOS-736
    https://issues.apache.org/jira/browse/MESOS-736


Repository: mesos-git


Description
-------

This is the last patch of a series of patches that implement catch-up replicated log.

Here is summary of this patch:
1) Introduced RecoverRequest/RecoverResponse in log.proto.
2) Replaced Promise with ReplicaInfo in log.proto as we need persist recovery information (maintain backwards compatibility).
3) A 2-phase empty log start-up algorithm.
4) Added a test to test two competing recover processes.

This is a joint work with Yan Xu.


Diffs (updated)
-----

  src/Makefile.am 0b32d74 
  src/log/log.hpp 77edc7a 
  src/log/recover.hpp PRE-CREATION 
  src/log/recover.cpp PRE-CREATION 
  src/log/replica.hpp d1f5ead 
  src/log/replica.cpp 59a6ff3 
  src/messages/log.proto 3d5859f 
  src/tests/log_tests.cpp ff5f86c 

Diff: https://reviews.apache.org/r/14946/diff/


Testing
-------

bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:ReplicaTest.*:LogTest.* --gtest_repeat=100


Thanks,

Jie Yu