Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2018/01/04 18:31:26 UTC

[kudu-CR] [docs] Document how to recover from a majority failed tablet

Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/8402 )

Change subject: [docs] Document how to recover from a majority failed tablet
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc
File docs/administration.adoc:

http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@709
PS4, Line 709: Reviving a tablet that's lost a majority of replicas
> how about: Bringing a tablet that's lost a majority of replicas back online
Done


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@711
PS4, Line 711: If a tablet has permanently lost a majority of its replicas, it cannot recover
> It is critical to emphasize that in a majority-lost scenario, permanent dat
Done


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@723
PS4, Line 723:   638a20403e3e4ae3b55d4d07d920e6de (tserver-00:7150): RUNNING [LEADER]
> This is kind of a cool scenario but this whole thing only works if the lead
The procedure still works if the leader doesn't survive, but yes, the chance of data loss is much higher in that case.
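
For anyone following along, here is a minimal shell sketch of pulling the fields out of a replica line like the one quoted above, so it is easy to see which replica is RUNNING and whether it is the leader. It assumes every replica line has the same "<uuid> (<host:port>): <STATE> [LEADER]" layout as that single example; the variable names are hypothetical, and real ksck output may differ.

```shell
# Replica line copied from the ksck output quoted above.
line='  638a20403e3e4ae3b55d4d07d920e6de (tserver-00:7150): RUNNING [LEADER]'

# Split on whitespace. read does no glob expansion, so "[LEADER]" is safe.
# role stays empty when the line has no fourth field.
read -r uuid addr state role <<EOF
$line
EOF

# Strip the surrounding "(" and "):" from the address field.
addr=${addr#\(}
addr=${addr%\):}

# Assumption: a leader replica is marked with a literal "[LEADER]" suffix.
if [ "$role" = "[LEADER]" ]; then
  is_leader=yes
else
  is_leader=no
fi

echo "uuid=$uuid addr=$addr state=$state leader=$is_leader"
```

The uuid and addr values are the ones that the later unsafe_change_config step takes as arguments.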


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@760
PS4, Line 760: $ kudu remote_replica delete tserver-01:7150 e822cab6c0584bc0858219d1539a17e6 "delete failed replica"
> this is not actually required; the master should do it automatically once t
OK, if you're sure about this. In a couple of situations during my testing I had to do the deletion manually, but those were mock situations (e.g. deleting consensus metadata files) that should probably be treated as disk failures if they actually happened.


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@767
PS4, Line 767: [source,bash]
             : ----
             : $ kudu remote_replica unsafe_change_config <tserver address> <tablet id> <uuid 1> <uuid 2> ...
             : ----
> I found this confusing. It seems like a command, I was trying to figure out
Done


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@775
PS4, Line 775: [source,bash]
> If you are going to put this in, at least mark it with a label like "Exampl
Removed


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@777
PS4, Line 777: $ kudu remote_replica unsafe_change_config tserver-00:7150 e822cab6c0584bc0858219d1539a17e6 638a20403e3e4ae3b55d4d07d920e6de
> Because having a long UUID for tablet_id and UUID for tablet server id can 
Done
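
One way to keep the two long identifiers apart, sketched in shell: give each a named variable and print the assembled command for inspection before running anything. The values are copied from the line quoted above; the variable names are hypothetical, and this is a dry run only, not necessarily the wording that ends up in the docs.

```shell
# Address of the tablet server hosting the surviving replica.
TSERVER_ADDR="tserver-00:7150"
# ID of the tablet that lost a majority of its replicas.
TABLET_ID="e822cab6c0584bc0858219d1539a17e6"
# UUID of the tablet server that should remain in the new config.
SURVIVING_TSERVER_UUID="638a20403e3e4ae3b55d4d07d920e6de"

# Dry run: build and print the command instead of executing it.
CMD="kudu remote_replica unsafe_change_config $TSERVER_ADDR $TABLET_ID $SURVIVING_TSERVER_UUID"
echo "$CMD"
```

Printing first gives a chance to check that the tablet ID and the tablet server UUID have not been swapped, since both are 32-character hex strings.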



-- 
To view, visit http://gerrit.cloudera.org:8080/8402

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6326f65d029a1cd75e487b16ce5be4baea2f215
Gerrit-Change-Number: 8402
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 04 Jan 2018 18:31:26 +0000
Gerrit-HasComments: Yes