Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2016/06/16 18:43:10 UTC

[kudu-CR] docs: informal design for handling permanent master failures

Hello David Ribeiro Alves, Mike Percy, Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/3393

to review the following change.

Change subject: docs: informal design for handling permanent master failures
......................................................................

docs: informal design for handling permanent master failures

Here's an informal design doc that describes how we might address permanent
master failures. It presents a hacky way to handle some permanent failures
without implementing full master config change, then describes how config
change might be implemented.

The most important question I'd like to discuss first is: should master
config change support be implemented? As the doc describes (though not in
great detail), config change would be expensive to implement, and we're
running out of time for 1.0. We might decide that permanent failure is out
of scope for 1.0 (or at least permanent failures that also take out the
disk), and push config change out to a different release. If we do that, we
could handle migration of one node to three nodes with a script, or not at
all (i.e. this is beta software).

Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
---
M docs/design-docs/README.md
A docs/design-docs/master-perm-failure-1.0.md
2 files changed, 170 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/3393/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] docs: design for handling permanent master failures

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 5: Code-Review+2

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3393/3/docs/design-docs/master-perm-failure-1.0.md
File docs/design-docs/master-perm-failure-1.0.md:

Line 52: 3. Copy the master's entire data/WAL directory from **X** to **Y**.
> hrm, this is odd -- I thought in step 2, X died. how are we going to copy i
To be honest I didn't delve into the various ways in which this condition (that X is "dead" but the data is salvageable) could be satisfied. Here are some possibilities:
1. X is super old and we'd like to decommission it. It'll be considered "dead" after the copy.
2. X has a bad DIMM that rarely causes faults. Maybe we'll rip out the bad DIMM, boot, do the copy, then decommission it.
3. Some other piece of X's hardware is gone, in which case yes, we may move the disk.

Do you think these are too contrived? Should I just rewrite this to dispel any notion that today's Kudu can recover from some kinds of permanent failure?


Line 114: 2. Find new master machines, creating DNS cnames for all of them. Create a DNS
> how will this work in the context of a management tool like CM? wouldn't th
I haven't given much thought to CM since it's out of scope for the Kudu _project_, but yeah, we may need that. Is there a similar concept in HDFS?


Line 136: 2. Implement new command line tool to rewrite cmeta files.
> can we combine these two? something that leads you through the process?
I'd rather have both: a command line tool that can perform each (specific) task on its own, and a script that ties them together.

Now that I've implemented this, though, it's proving difficult to combine since different pieces of work happen on different machines:
1. On each new master, run new "format" command to create FS.
2. On each new master, run kudu-fs_dump "list_uuid" to get the FS's UUID.
3. On the old master, run new "cmeta rewrite" command with the new UUIDs, hostports, and existing UUID/hostport.
4. On each new master, run new "tablet copy" command to fetch the master tablet.

I guess it can be done with a shell script that uses ssh to get to each machine. That won't work in every environment, though.
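
A minimal sketch of what such a driver could look like, written in Python rather than shell so the sequencing is easy to test. The four command strings are placeholders: the steps above name the commands but not their syntax, so `kudu-tool ...` and the `uuid:host` argument format are assumptions made only for illustration, as are the function names.

```python
import shlex
import subprocess
from typing import Callable, Dict, List

# Placeholder command lines. The thread names these steps but not their
# syntax, so every string below is hypothetical.
FORMAT_CMD = "kudu-tool format"                # step 1
LIST_UUID_CMD = "kudu-fs_dump list_uuid"       # step 2 (invocation assumed)
CMETA_REWRITE_CMD = "kudu-tool cmeta rewrite"  # step 3
TABLET_COPY_CMD = "kudu-tool tablet copy"      # step 4

Runner = Callable[[str, str], str]


def ssh_run(host: str, command: str) -> str:
    """Run a command on a remote host over ssh and return its stdout."""
    result = subprocess.run(
        ["ssh", host, command], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def migrate(
    old_master: str,
    old_uuid: str,
    new_masters: List[str],
    run: Runner = ssh_run,
) -> Dict[str, str]:
    """Drive the four steps in order; returns the new masters' UUIDs."""
    # Step 1: create a filesystem on each new master.
    for host in new_masters:
        run(host, FORMAT_CMD)

    # Step 2: collect each new master's filesystem UUID.
    uuids = {host: run(host, LIST_UUID_CMD) for host in new_masters}

    # Step 3: rewrite cmeta on the old master with the existing peer
    # followed by the new peers.
    peers = [f"{old_uuid}:{old_master}"]
    peers += [f"{uuid}:{host}" for host, uuid in uuids.items()]
    args = " ".join(shlex.quote(p) for p in peers)
    run(old_master, f"{CMETA_REWRITE_CMD} {args}")

    # Step 4: each new master fetches the master tablet from the old one.
    for host in new_masters:
        run(host, f"{TABLET_COPY_CMD} {shlex.quote(old_master)}")

    return uuids
```

Passing the runner in as a parameter keeps the ordering logic separate from the transport, so an environment without ssh could substitute its own way of reaching each machine.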


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] docs: informal design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: docs: informal design for handling permanent master failures
......................................................................


Patch Set 1:

Rendered content available here: https://github.com/adembo/kudu/blob/config_change/docs/design-docs/master-perm-failure-1.0.md.

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] docs: informal design for handling permanent master failures

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: docs: informal design for handling permanent master failures
......................................................................


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/1852/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3393

to look at the new patch set (#2).

Change subject: docs: design for handling permanent master failures
......................................................................

docs: design for handling permanent master failures

Here's a design doc that describes how we might address permanent master
failures. It presents a hacky way to handle some permanent failures without
implementing full master config change, then describes how config change
might be implemented.

The most important question I'd like to discuss first is: should master
config change support be implemented? As the doc describes (though not in
great detail), config change would be expensive to implement, and we're
running out of time for 1.0. We might decide that permanent failure is out
of scope for 1.0 (or at least permanent failures that also take out the
disk), and push config change out to a different release. If we do that, we
could handle migration of one node to three nodes with a script, or not at
all (i.e. this is beta software).

Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
---
M docs/design-docs/README.md
A docs/design-docs/master-perm-failure-1.0.md
2 files changed, 194 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/3393/2
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3393

to look at the new patch set (#5).

Change subject: docs: design for handling permanent master failures
......................................................................

docs: design for handling permanent master failures

Here's a design doc that describes how we might address permanent master
failures. The downside of the proposed solution is that it requires DNS
manipulation, but the upside is that it can be adapted to migrate single
node deployments to multiple masters.

Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
---
M docs/design-docs/README.md
A docs/design-docs/master-perm-failure-1.0.md
2 files changed, 112 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/3393/5
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3393

to look at the new patch set (#4).

Change subject: docs: design for handling permanent master failures
......................................................................

docs: design for handling permanent master failures

Here's a design doc that describes how we might address permanent master
failures. The downside of the proposed solution is that it requires DNS
manipulation, but the upside is that it can be adapted to migrate single
node deployments to multiple masters.

Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
---
M docs/design-docs/README.md
A docs/design-docs/master-perm-failure-1.0.md
2 files changed, 136 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/3393/4
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] docs: design for handling permanent master failures

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3393/2/docs/design-docs/master-perm-failure-1.0.md
File docs/design-docs/master-perm-failure-1.0.md:

Line 62: something into Kudu itself.
what about a slight modification to the above:

- replace 'X' with a new machine 'Y', and give it the same CNAME (eg kudu-master-3)
- add a new flag to the kudu-master which is 'kudu-master --join-existing':
-- this causes it to start up with no system tablet
-- run the RemoteBootstrapService on the masters (should be a small change)
-- existing consensus client code in theory will kick in here - the existing leader will ask the ConsensusService on the "empty" master to bootstrap a copy of the system tablet

It seems like this would reuse most of the machinery we already have, without requiring a config change, no?


Line 78: 2. Operators must also modify gflags on tservers when a master is replaced so
alternatively, could use cnames, right?


Line 125: 4. Make changes to kudu-admin to support the relevant new commands.
oh, this is more or less what I suggested above, but with the restriction that we make users use cnames (or /etc/hosts files or whatever) so that you don't need to rolling-restart the tservers or clients. I think this is a smart idea anyway to simplify client configuration.
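
To illustrate why the cname restriction avoids restarts, here is a toy model in Python. The alias `kudu-master-3` comes from the comment above; the other aliases, the host names, and the dictionary standing in for DNS are all made up for the example.

```python
from typing import Dict, List


def resolve(configured_masters: List[str], cnames: Dict[str, str]) -> List[str]:
    """Map each configured alias to the machine it currently points at."""
    return [cnames[alias] for alias in configured_masters]


# What tservers and clients are configured with. This never changes.
configured = ["kudu-master-1", "kudu-master-2", "kudu-master-3"]

# Stand-in for the DNS zone (host names are hypothetical).
cnames = {
    "kudu-master-1": "host-a",
    "kudu-master-2": "host-b",
    "kudu-master-3": "host-x",
}

before = resolve(configured, cnames)

# Machine X dies permanently; repoint its alias at replacement Y.
cnames["kudu-master-3"] = "host-y"

after = resolve(configured, cnames)
```

Only the DNS record moves. The configured list is identical before and after, which is why neither tservers nor clients would need new gflag values.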


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3393

to look at the new patch set (#3).

Change subject: docs: design for handling permanent master failures
......................................................................

docs: design for handling permanent master failures

Here's a design doc that describes how we might address permanent master
failures. The downside of the proposed solution is that it requires DNS
manipulation, but the upside is that it can be adapted to migrate single
node deployments to multiple masters.

Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
---
M docs/design-docs/README.md
A docs/design-docs/master-perm-failure-1.0.md
2 files changed, 137 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/3393/3
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] docs: design for handling permanent master failures

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 5:

Build Started http://104.196.14.100/job/kudu-gerrit/2957/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] docs: design for handling permanent master failures

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: docs: design for handling permanent master failures
......................................................................


docs: design for handling permanent master failures

Here's a design doc that describes how we might address permanent master
failures. The downside of the proposed solution is that it requires DNS
manipulation, but the upside is that it can be adapted to migrate single
node deployments to multiple masters.

Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Reviewed-on: http://gerrit.cloudera.org:8080/3393
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M docs/design-docs/README.md
A docs/design-docs/master-perm-failure-1.0.md
2 files changed, 112 insertions(+), 0 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 

Gerrit-MessageType: merged
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] docs: design for handling permanent master failures

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 3:

Build Started http://104.196.14.100/job/kudu-gerrit/2665/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] docs: design for handling permanent master failures

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 4:

Build Started http://104.196.14.100/job/kudu-gerrit/2809/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3393/4/docs/design-docs/master-perm-failure-1.0.md
File docs/design-docs/master-perm-failure-1.0.md:

PS4, Line 45: class of permanent failures
> I still think this is such a non-practical case that it's hardly worth ment
Fair critique. I think I'll remove this section altogether to keep the doc more focused.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] docs: informal design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: docs: informal design for handling permanent master failures
......................................................................


Patch Set 1:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/3393/1/docs/design-docs/master-perm-failure-1.0.md
File docs/design-docs/master-perm-failure-1.0.md:

Line 19: As part of Kudu's upcoming 1.0 release, we've been working towards improving
> Too wordy. How about replacing this paragraph with:
I think you're approaching this document in a different way than what I intended.

My original intent was for the doc to be feature-driven. "Surviving transient failures" is a feature. "Surviving permanent failures with an intact disk" is a feature. "Surviving all permanent failures, period" is a feature. I'm trying to draw this distinction to help us decide what kind of permanent failure handling (if any) we should implement in time for Kudu 1.0, given the feature set we'd derive and the time it'd take to implement.

Does that make sense? It's why I'm not drawing attention to adding/removing masters until later on, and why I led with this paragraph to frame the discussion. I'll reword it to be more terse, but I would like to keep it largely intact for this reason.


Line 26: Let's assume we have a healthy Raft configuration consisting of three
> I think we should remove this paragraph entirely. It has nothing to do with
I'd like to keep it so I added a section header.


Line 38: The most important question is: should we build permanent failure handling into
> Maybe preface this with a section header "Manual repair when data from a do
Done


Line 64: ## Proposal
> How about "Design proposal for dynamically adding / removing masters"
I've moved the above paragraph into this section, so I'd like this section's label to convey "a proposal for handling any kind of permanent failure" (the feature) rather than "a proposal for adding/removing masters" (the design and implementation).

I've reworded it, but due to the above rationale, not exactly in the way that you suggested.


Line 65: 
> Before we get to the algorithm, this needs an intro to summarize the propos
Good idea. I've incorporated your text nearly verbatim.


Line 66: Here is the sequence of events:
> I think "algorithm" would be a better name for this
Done


Line 95: While we believe this approach is correct, we would like more eyes on it to
> Since this is now basically a formal design doc, we can assume more eyes on
Done


PS1, Line 119: mitigate the above confusion
> how about "To avoid the above inconsistency in the semantics of the --maste
Given the proximity to the previous paragraph, I think it's implicit that we're talking about the --master_addresses gflag. But I'll reword a bit to link back to "inconsistent semantics"


Line 121: disk. Then, we can remove **--master_addresses** and use a new kudu-admin
> "remove --master_addresses gflag from the kudu-master processes" ... maybe 
I'll clarify, though I'm trying to be careful and not confuse people into thinking --master_addresses is a tserver gflag too.

I also added a blurb to the first reference to --master_addresses and --tserver_master_addrs to explain what each is.


Line 126: started in normal mode, it creates a Raft config of one (just itself) and
> This mode stuff is quite confusing. I'm leaning toward implementing a total
I specified it this way so that just running the kudu-master binary with minimal gflag configuration from the command line yields an operational single-node master. Requiring a "format" for that case (which is exercised often by manual and automated tests alike) would be a frustrating tax.

That said, perhaps the "format" approach would be ideal for the multi-master case. For a single node, just run kudu-master as-is. For multi-node, do a "format" first, then run kudu-master (no special gflags). The master should have enough information at that point to come up in "listening" mode and await remote bootstrap, right? Effectively this would mean inserting the format command just before step 5 in the algorithm.

What do you think?


Line 133: 1. There's a certain desire to completely remove **last_known_addr** from
> s/There's a certain desire/It's useful on tablet servers/
Done


Line 134:    **RaftPeerPB** (as only UUIDs should be necessary for identifying members of
> only UUIDs should be necessary, and this makes it easier for tablet servers
Done


Line 153: the semantics of **--tserver_master_addrs** become confusing just like
> s/become/becomes/
Nope, the noun here is semantics, which is plural. "The semantics become confusing", not "the semantics becomes confusing".


Line 156: removed. To alleviate the confusion we could do the same thing we did for
> I think this process is overly onerous. It also makes it possible to "orpha
I understand that the union of --tserver_master_addrs and a dynamic discovery mechanism is more flexible, but I think it's also tougher to reason about ("why is this tserver heartbeating to that master? Is it because of the gflag value? Is it because of what the tserver found at runtime?"). I generally prefer solutions that are more predictable in nature, since that means an easier time troubleshooting (for us) and operating (for administrators).

Onerousness aside, how is orphaning possible? I assumed that, just as with the master, the on-disk RaftConfigPB can be modified by a command line tool (probably the same tool as is used by the master). That way, if an old tserver starts and can't find any of its (now long gone) masters, the operator can use the tool to rewrite the master configuration.

I don't think rogue masters (in the "tserver is talking to a stale master that has been partitioned from its Raft config" sense of the word) are an issue, for the same reason that they aren't an issue in the rest of the multi-master world: any destructive operation that the rogue master could carry out on the tserver must be replicated first, at which point the rogue master would notice that it's no longer the leader.

But maybe you meant "rogue master" as in "wrong master"? Like, a tserver that was configured to heartbeat to the wrong Kudu deployment? In that case, the tserver could check the UUIDs of incoming requests and heartbeat responses against the UUIDs in the on-disk RaftConfigPB, and ignore anything coming from masters that aren't listed there.
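
Roughly, the check would look like the following Python sketch. `MasterConfig` is a stand-in for whatever the tserver would load from its on-disk RaftConfigPB, and `should_accept` is a hypothetical name; neither is existing Kudu code.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class MasterConfig:
    """Stand-in for the master UUIDs recorded in the on-disk RaftConfigPB."""

    peer_uuids: FrozenSet[str]


def should_accept(config: MasterConfig, sender_uuid: str) -> bool:
    """Accept a request or heartbeat response only from a known master."""
    return sender_uuid in config.peer_uuids
```

A tserver pointed at the wrong deployment would see only UUIDs that are absent from its config, so every message from those masters would be dropped.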


Line 169: the configuration recover from this incomplete state?
> Yeah, we may need to look at implementing KUDU-1194: consensus: Allow abort
I'll add references to the bugs you mentioned, but I'm going to leave this section largely unspecified for now, since I'd rather not sink a ton of time into a design if we're feeling we may not implement it yet.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] docs: informal design for handling permanent master failures

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: docs: informal design for handling permanent master failures
......................................................................


Patch Set 1: -Verified

Build Started http://104.196.14.100/job/kudu-gerrit/1853/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f05c319c89cf37e2d71fdc4b7ec951b2932a2b2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3393/2/docs/design-docs/master-perm-failure-1.0.md
File docs/design-docs/master-perm-failure-1.0.md:

Line 62: something into Kudu itself.
> what about a slight modification to the above:
I fear that "start up with no system tablet" is complicated to implement. But, I've reworked the algorithm significantly. Now it uses DNS cnames unabashedly, which means it doesn't need config changes at all.


Line 78: 2. Operators must also modify gflags on tservers when a master is replaced so
> alternatively, could use cnames, right?
Done


Line 125: 4. Make changes to kudu-admin to support the relevant new commands.
> oh, this is more or less what I suggested above, but with the restriction t
Done


[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 2:

The doc has been changed significantly. The "real" approach now requires DNS cnames and thus doesn't need config change support at all. I've also removed the various "we could do this" paragraphs to keep the focus squarely on what's in scope for 1.0.

As before, https://github.com/adembo/kudu/blob/config_change/docs/design-docs/master-perm-failure-1.0.md contains the rendered output.
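
To illustrate why CNAMEs remove the need for config change, here is a minimal sketch. All names in it (the alias and host names, the `dns` dict, `resolve`) are hypothetical and stand in for real DNS records; it is not taken from the design doc itself. The idea is that the Raft config only ever stores stable aliases, so replacing a machine means repointing one alias.

```python
# Hypothetical stand-in for DNS: alias (CNAME) -> real hostname.
dns = {
    "master-1.example.com": "host-a.example.com",
    "master-2.example.com": "host-b.example.com",
    "master-3.example.com": "host-c.example.com",
}

# The Raft config (and tserver flags) would reference only the aliases.
raft_config = sorted(dns)
config_before = list(raft_config)


def resolve(alias):
    """Look up the machine currently behind an alias."""
    return dns[alias]


# host-b dies permanently; the operator repoints its alias at a new machine.
dns["master-2.example.com"] = "host-d.example.com"

# The stored config is untouched, yet it now reaches the replacement machine.
config_after = list(raft_config)
new_target = resolve("master-2.example.com")
```

Under these assumptions, nothing that stores master addresses has to be rewritten when a machine is replaced.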

[kudu-CR] docs: design for handling permanent master failures

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3393/3/docs/design-docs/master-perm-failure-1.0.md
File docs/design-docs/master-perm-failure-1.0.md:

Line 52: 3. Copy the master's entire data/WAL directory from **X** to **Y**.
hrm, this is odd -- I thought in step 2, X died. how are we going to copy it? are you supposing someone would rip the drives out and move them?


Line 114: 2. Find new master machines, creating DNS cnames for all of them. Create a DNS
how will this work in the context of a management tool like CM? wouldn't the tool try to specifically configure all the tservers to point to the real hostnames and not the cnames? do we need some config overrides for this to work?


Line 136: 2. Implement new command line tool to rewrite cmeta files.
can we combine these two? something that leads you through the process?


[kudu-CR] docs: design for handling permanent master failures

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3393/4/docs/design-docs/master-perm-failure-1.0.md
File docs/design-docs/master-perm-failure-1.0.md:

PS4, Line 45: class of permanent failures
I still think this is such a non-practical case that it's hardly worth mentioning.

Even if your failure was bad RAM or something, are you really going to send someone to the datacenter to pull the disk out of the machine and attach it to a different machine so you can copy the data? Kind of unrealistic in an operational environment.

I suppose the one case in which this would make sense is if your masters had mounted their data directory via a SAN or somesuch, so you could feasibly lose the machine and then mount the same LUN on a new box, but again it's rare enough in a typical Kudu environment that maybe this should be relegated to an appendix?


[kudu-CR] docs: design for handling permanent master failures

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 2:

> Mind updating the rendered HTML version?

Sorry for the delay; it's been updated, same link as before.

[kudu-CR] docs: informal design for handling permanent master failures

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: docs: informal design for handling permanent master failures
......................................................................


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/1839/

[kudu-CR] docs: informal design for handling permanent master failures

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: docs: informal design for handling permanent master failures
......................................................................


Patch Set 1:

(15 comments)

This is sounding more formal, let's just call it a real design doc :)

I added some suggestions to make it easier for others to review and read, IMHO.

I also added some comments about the design.

http://gerrit.cloudera.org:8080/#/c/3393/1/docs/design-docs/master-perm-failure-1.0.md
File docs/design-docs/master-perm-failure-1.0.md:

Line 19: As part of Kudu's upcoming 1.0 release, we've been working towards improving
Too wordy. How about replacing this paragraph with:

This document discusses one of the remaining gaps in Kudu's support for multi-master operation: adding and removing masters.

Regarding the transient vs permanent thing, adding / removing masters has nothing to do with transient failures. Maybe I'm assuming too much but when I read this doc I would assume that is obvious. If it's not, then maybe it's worth a single sentence: "Transient failures are handled by the normal Raft leader election mechanisms."


Line 26: Let's assume we have a healthy Raft configuration consisting of three
I think we should remove this paragraph entirely. It has nothing to do with the purpose of this design doc.

If you want to keep it, how about adding a section header along the lines of "Handling transient failures" or something, so people can just skip it.


Line 38: The most important question is: should we build permanent failure handling into
Maybe preface this with a section header "Manual repair when data from a downed node is still available". We still haven't gotten to the actual design part yet.


Line 64: ## Proposal
How about "Design proposal for dynamically adding / removing masters"


Line 65: 
Before we get to the algorithm, this needs an intro that summarizes the proposal in English, along the lines of:

We will reuse most of the configuration change and remote bootstrap design and implementation from the tablet servers. The differences are:

1) Adding and removing masters is not performed automatically. An administrator must trigger those actions with some tool.
2) Administrators must also modify command line flags on tablet servers when a master is replaced, so that the tablet servers can find the new master.
3) Masters will decide whether to self-initialize a new configuration or join an existing one based on their command-line flags and their consensus metadata files (discussed below in Master directory management, phase 2).


Line 66: Here is the sequence of events:
I think "algorithm" would be a better name for this


Line 95: While we believe this approach is correct, we would like more eyes on it to
Since this is now basically a formal design doc, we can assume more eyes on it is a given. Let's replace this whole paragraph with the following:

In order to implement the above design, we'll need to make the following changes:


PS1, Line 119: mitigate the above confusion
how about "To avoid the above inconsistency in the semantics of the --master_addresses gflag,"


Line 121: disk. Then, we can remove **--master_addresses** and use a new kudu-admin
"remove --master_addresses gflag from the kudu-master processes" ... maybe some wording like this will clarify that we're not talking about tservers here, for anybody who's lost track of that


Line 126: started in normal mode, it creates a Raft config of one (just itself) and
This mode stuff is quite confusing. I'm leaning toward implementing a totally manual process, including a "format master" command to "spawn" our first master server(s).

So basically, if a master starts up and it has an existing master tablet (presumably with a consensus metadata file), then it starts up in normal mode. If it doesn't, it starts up in listening mode.
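
A minimal sketch of that startup rule, assuming the presence of a consensus metadata file is the only signal. The function name, the mode strings, and the "cmeta" file name are hypothetical, not Kudu's actual layout or API.

```python
import os
import tempfile


def choose_startup_mode(cmeta_path):
    """Return "normal" if a consensus metadata file exists, else "listening"."""
    return "normal" if os.path.isfile(cmeta_path) else "listening"


with tempfile.TemporaryDirectory() as data_dir:
    cmeta = os.path.join(data_dir, "cmeta")  # hypothetical file name

    # Fresh data directory: nothing has been formatted yet.
    mode_fresh = choose_startup_mode(cmeta)

    # Simulate what a "format master" command would leave behind.
    with open(cmeta, "w") as f:
        f.write("placeholder")
    mode_formatted = choose_startup_mode(cmeta)
```

With this rule the mode is derived purely from on-disk state, so no extra flag is needed to tell a master which mode to start in.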


Line 133: 1. There's a certain desire to completely remove **last_known_addr** from
s/There's a certain desire/It's useful on tablet servers/


Line 134:    **RaftPeerPB** (as only UUIDs should be necessary for identifying members of
only UUIDs should be necessary, and this makes it easier for tablet servers to change IPs and ports transparently


Line 153: the semantics of **--tserver_master_addrs** become confusing just like
s/become/becomes/


Line 156: removed. To alleviate the confusion we could do the same thing we did for
I think this process is overly onerous. It also makes it possible to "orphan" a tserver with no way back that is covered by this spec.

How about we just say that --tserver_master_addrs is a pool of potential masters and that tservers don't store anything about the masters on disk? The pool can be modified in memory by the above mechanisms (i.e. ListMasters()), but any additional potential masters that should be known at boot time have to be in the gflags.

Well, maybe the tserver should store one thing on disk: the sequence or OpId of the last committed master config. This would help protect against action by rogue masters. That is not really relevant to this doc, though; it's more relevant to multi-master operation overall. We may have already discussed and resolved this issue in that context; if so, we can punt on this.
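
Roughly, the idea could be sketched as follows. The class, its method, the addresses, and the use of a plain integer index in place of a real OpId are all hypothetical; persistence to disk is simulated by a constructor argument.

```python
class TServerMasterPool:
    """Hypothetical tserver-side view of the masters (not Kudu's actual code)."""

    def __init__(self, gflag_masters, persisted_config_index=0):
        # Masters named in --tserver_master_addrs: the only ones known at boot.
        self.pool = set(gflag_masters)
        # The one value kept on disk in this sketch: the index of the last
        # committed master config this tserver has seen.
        self.last_config_index = persisted_config_index

    def on_list_masters(self, config_index, masters):
        """Merge a ListMasters()-style response into the in-memory pool."""
        if config_index < self.last_config_index:
            # Older than what we have already seen: possibly a rogue master.
            return False
        self.last_config_index = config_index
        self.pool |= set(masters)
        return True


# Illustrative addresses only.
pool = TServerMasterPool(["m1:7051", "m2:7051"], persisted_config_index=5)
accepted_new = pool.on_list_masters(6, ["m1:7051", "m2:7051", "m3:7051"])
accepted_stale = pool.on_list_masters(4, ["evil:7051"])
```

In this sketch a restart would rebuild the pool from the gflags alone, while the persisted index would still let the tserver reject a config older than one it has already seen.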


Line 169: the configuration recover from this incomplete state?
Yeah, we may need to look at implementing KUDU-1194: consensus: Allow abort of uncommittable config change ops, or potentially something where we allow tombstoned tablets or non-members of the config to vote (KUDU-871: Allow tombstoned tablets to vote)


[kudu-CR] docs: design for handling permanent master failures

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 2:

Build Started http://104.196.14.100/job/kudu-gerrit/1861/

[kudu-CR] docs: design for handling permanent master failures

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: docs: design for handling permanent master failures
......................................................................


Patch Set 2:

Mind updating the rendered HTML version?
