You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2018/04/02 23:58:14 UTC

[kudu-CR] KUDU-2290: Tool to re-create a tablet

Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
......................................................................


Patch Set 7:

(21 comments)

Are you comfortable merging this given the risk of KUDU-2376? The same segmentation fault will happen, and is captured by the disabled replace_tablet-itest, just like the test case I posted in the JIRA for alter table.

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/client/client-internal.cc@97
PS7, Line 97: using master::ReplaceTabletRequestPB;
> warning: using decl 'ReplaceTabletRequestPB' is unused [misc-unused-using-d
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/client/client-internal.cc@98
PS7, Line 98: using master::ReplaceTabletResponsePB;
> warning: using decl 'ReplaceTabletResponsePB' is unused [misc-unused-using-
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt@101
PS7, Line 101: ADD_KUDU_TEST(replace_tablet-itest)
> Could you follow the approach Todd took in commit 1c1d3ba to decide what va
B-but the test always fails and is disabled, because of KUDU-2376. Can I add a TODO here instead?


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc
File src/kudu/integration-tests/master-stress-test.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc@297
PS7, Line 297:         // The tablet is gone because it was already replaced or deleted.
> Should we still wait in this case?
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc@468
PS7, Line 468:   LOG(INFO) << "Tables created: " << num_tables_created_.Load();
> What's the distribution of operations before and after your change? Curious
The numbers vary a lot, but it's roughly the same before and after. I see (with a debug build) ~70 tables created, ~45 deleted or altered, and about 45 tablets replaced. There's 5 master restarts or so.


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc
File src/kudu/integration-tests/replace_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc@63
PS7, Line 63:     CHECK(tablet_id);
> Not needed?
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc@76
PS7, Line 76:     CHECK(proxy);
> Not needed?
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4324
PS7, Line 4324:     new_tablet = scoped_refptr<TabletInfo>(new TabletInfo(table, GenerateId()));
> How about:
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4346
PS7, Line 4346:   // Ensure we abort the mutations if persisting the changes fails.
> Don't the l_table and l_tablets destructors abort automatically?
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4358
PS7, Line 4358:   actions.tablets_to_delete.push_back(old_tablet);
> Hmm, this should be in tablets_to_update, no? We never actually delete entr
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4376
PS7, Line 4376:     table->AddRemoveTablets({new_tablet}, {old_tablet});
              :     l_tablets.Commit();
              :     l_table.Commit();
> Are you sure these three should happen with the lock_ held? They acquire lo
Does the following order work?

1. table->AddRemoveTablets
2. l_tablets.Commit()
l(lock_) {
3. Insert to tablet map
}
4. l_table.Commit()

We need tablet + table locks when adding and removing from the table, and we want to commit the tablet state before  publishing the new tablet to the tablet map so uninitialized state won't be visible. However, we need to add the tablet to the tablet map before committing the changes to the table, since if we commit table changes before altering the tablet map there's a window where a GetTableLocations would see the new tablet as part of the table but no info about the new tablet would be available from GetTabletLocations. However, I'm worried about the interval between 2 + 4 where a GetTableLocations would see the old tablet as deleted. I wanted to ensure that GetTableLocations sees either a non-deleted old tablet or a new tablet.


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4385
PS7, Line 4385:     background_tasks_->Wake();
> This is for the newly created tablet, right? Can you update the comment jus
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.h
File src/kudu/tools/tool_action_common.h:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.h@39
PS7, Line 39: class faststring;
> warning: invalid case style for class 'faststring' [readability-identifier-
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.h@155
PS7, Line 155:    LeaderMasterProxy() = default;
> Nit: indentation.
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.cc
File src/kudu/tools/tool_action_common.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.cc@117
PS7, Line 117: using consensus::ConsensusServiceProxy;
> warning: using decl 'ConsensusServiceProxy' is unused [misc-unused-using-de
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.cc@122
PS7, Line 122: using master::ListMastersRequestPB;
> warning: using decl 'ListMastersRequestPB' is unused [misc-unused-using-dec
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.cc@123
PS7, Line 123: using master::ListMastersResponsePB;
> warning: using decl 'ListMastersResponsePB' is unused [misc-unused-using-de
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.cc@124
PS7, Line 124: using master::ListTabletServersRequestPB;
> warning: using decl 'ListTabletServersRequestPB' is unused [misc-unused-usi
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.cc@125
PS7, Line 125: using master::ListTabletServersResponsePB;
> warning: using decl 'ListTabletServersResponsePB' is unused [misc-unused-us
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.cc@127
PS7, Line 127: using master::ReplaceTabletRequestPB;
> warning: using decl 'ReplaceTabletRequestPB' is unused [misc-unused-using-d
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.cc@128
PS7, Line 128: using master::ReplaceTabletResponsePB;
> warning: using decl 'ReplaceTabletResponsePB' is unused [misc-unused-using-
Ack



-- 
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 7
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Mon, 02 Apr 2018 23:58:14 +0000
Gerrit-HasComments: Yes