You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Mike Percy (Code Review)" <ge...@cloudera.org> on 2017/11/21 22:57:17 UTC

[kudu-CR] consensus: Add gflag to enable 3-4-3 re-replication

Hello Alexey Serbin,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/8626

to review the following change.


Change subject: consensus: Add gflag to enable 3-4-3 re-replication
......................................................................

consensus: Add gflag to enable 3-4-3 re-replication

This gflag will be used to control whether 3-4-3 re-replication is used
on a cluster.

Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
---
M src/kudu/consensus/raft_consensus.cc
1 file changed, 7 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/26/8626/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>

[kudu-CR] consensus: Add gflag to enable improved re-replication

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable improved re-replication
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_prepare_replacement_before_eviction, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been prepared for them.");
> A little bit more of bike-shedding from my side: I think it's worth noting 
Check out my updated description. Do you mean that we should specify that if it's only supported to be enabled or disabled on the entire cluster?



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 2
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 08:35:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] consensus: Add gflag to enable improved re-replication

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable improved re-replication
......................................................................


Patch Set 3:

Great, thanks!


-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 20:22:56 +0000
Gerrit-HasComments: No

[kudu-CR] consensus: Add gflag to enable improved re-replication

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable improved re-replication
......................................................................

consensus: Add gflag to enable improved re-replication

This gflag will be used to control whether improved re-replication is
enabled on a cluster.

Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Reviewed-on: http://gerrit.cloudera.org:8080/8626
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/consensus/raft_consensus.cc
1 file changed, 7 insertions(+), 0 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] consensus: Add gflag to enable 3-4-3 re-replication

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable 3-4-3 re-replication
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_rereplication_add_before_evict, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been added to the config.");
Does this mean the replicas are to be evicted by the leader replica when this is set to 'true'?  Or this flag actually switches to the new scheme where master, but not the leader replica, evicts failed replicas?



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 21 Nov 2017 23:41:25 +0000
Gerrit-HasComments: Yes

[kudu-CR] consensus: Add gflag to enable 3-4-3 re-replication

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable 3-4-3 re-replication
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_rereplication_add_before_evict, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been added to the config.");
> I don't like putting version numbers in for config flags because they often
I think 'raft_prepare_replacement_before_eviction' sounds much better.



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 01:43:43 +0000
Gerrit-HasComments: Yes

[kudu-CR] consensus: Add gflag to enable improved re-replication

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable improved re-replication
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_prepare_replacement_before_eviction, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been prepared for them.");
> Check out my updated description. Do you mean that we should specify that i
I meant to mention that this flag in case of the kudu-master binary means that catalog manager would behave differently.  Is that what this flag is meant to be for kudu-master, right?



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 2
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 17:55:16 +0000
Gerrit-HasComments: Yes

[kudu-CR] consensus: Add gflag to enable 3-4-3 re-replication

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable 3-4-3 re-replication
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_rereplication_add_before_evict, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been added to the config.");
> The idea is that this enables 3-4-3 re-replication, and must be set to "tru
OK, I see.

I remember there were some musings about preference of 3-2-3 path for the case when we definitely know that one out of 3 replicas is failed, no?  Overall, do you think '3-4-3' theme is good to describe the new way of auto-recovery in case of replica failure?

Maybe, something like 'raft_rereplication_v1' is the best way to name it?



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 00:00:58 +0000
Gerrit-HasComments: Yes

[kudu-CR] consensus: Add gflag to enable improved re-replication

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8626

to look at the new patch set (#2).

Change subject: consensus: Add gflag to enable improved re-replication
......................................................................

consensus: Add gflag to enable improved re-replication

This gflag will be used to control whether improved re-replication is
enabled on a cluster.

Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
---
M src/kudu/consensus/raft_consensus.cc
1 file changed, 7 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/26/8626/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 2
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] consensus: Add gflag to enable improved re-replication

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable improved re-replication
......................................................................


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_prepare_replacement_before_eviction, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been prepared for them.");
> I meant to mention that this flag in case of the kudu-master binary means t
After some consideration, I think we can leave this as it is.

Perhaps, my confusion comes from that 'raft_' prefix and my perception that for master 'raft_' should be related to the Raft cluster of the system tablet.  If we never want to control the latter by this flag, probably it's fine to keep this flag in here and give it a special treatment in case of catalog_manager.



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 2
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 19:35:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] consensus: Add gflag to enable 3-4-3 re-replication

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable 3-4-3 re-replication
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_rereplication_add_before_evict, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been added to the config.");
> I think 'raft_prepare_replacement_before_eviction' sounds much better.
A little bit more of bike-shedding from my side: I think it's worth noting in the description that this flag means different things in case of master/catalog_manager and tserver.

Instead, maybe, introduce a flag to both tserver and master, named like 'ts_tablet_prepare_replacement_before_eviction' ?

Naming is hard.



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 02:21:30 +0000
Gerrit-HasComments: Yes

[kudu-CR] consensus: Add gflag to enable 3-4-3 re-replication

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable 3-4-3 re-replication
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_rereplication_add_before_evict, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been added to the config.");
> Does this mean the replicas are to be evicted by the leader replica when th
The idea is that this enables 3-4-3 re-replication, and must be set to "true" on all masters and tservers to work properly.



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 23:42:52 +0000
Gerrit-HasComments: Yes

[kudu-CR] consensus: Add gflag to enable 3-4-3 re-replication

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8626 )

Change subject: consensus: Add gflag to enable 3-4-3 re-replication
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8626/1/src/kudu/consensus/raft_consensus.cc@124
PS1, Line 124: DEFINE_bool(raft_rereplication_add_before_evict, false,
             :             "When enabled, failed replicas will only be evicted after a "
             :             "replacement has been added to the config.");
> OK, I see.
I don't like putting version numbers in for config flags because they often last long after the versions have meaning, but you make a good point.

How about: raft_prepare_replacement_before_eviction

This is something that is actually new, and that would apply to many future designs, and represents something we're not doing that causes availability problems today.



-- 
To view, visit http://gerrit.cloudera.org:8080/8626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I247842af7d0172d6e6b167a29613f6b261990a01
Gerrit-Change-Number: 8626
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 00:44:41 +0000
Gerrit-HasComments: Yes