You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Abhishek Chennaka (Code Review)" <ge...@cloudera.org> on 2021/10/12 00:28:41 UTC

[kudu-CR] KUDU-1959 - Implement tests for the startup web page for Kudu servers

Abhishek Chennaka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17915


Change subject: KUDU-1959 - Implement tests for the startup web page for Kudu servers
......................................................................

KUDU-1959 - Implement tests for the startup web page for Kudu servers

This patch implements the tests for the startup page.
For tablet servers,
 - We inject latency to bootstrap tablets while reading the webpage every
   10 milliseconds and validating the status for each step.
- Fail a data directory and validate the status of each startup step.
For the master,
- We start it up and validate the status of each startup step.
We use mini tablet server and mini master for both these cases.

Change-Id: I80880ecbda6a9f6db85baaf109af7e701111328d
---
M src/kudu/master/master-test.cc
M src/kudu/master/master.cc
M src/kudu/mini-cluster/webui_checker.h
M src/kudu/tserver/tablet_server-test.cc
4 files changed, 173 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/17915/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I80880ecbda6a9f6db85baaf109af7e701111328d
Gerrit-Change-Number: 17915
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Chennaka <ac...@cloudera.com>

[kudu-CR] KUDU-1959 - Implement tests for the startup web page for Kudu servers

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17915 )

Change subject: KUDU-1959 - Implement tests for the startup web page for Kudu servers
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/17915/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17915/1//COMMIT_MSG@11
PS1, Line 11:  - We inject latency to bootstrap tablets while reading the webpage every
            :    10 milliseconds and validating the status for each step.
            : - Fail a data directory and validate the status of each startup step.
            : For the master,
            : - We start it up and validate the status of each startup step.
nit: make the spacing consistent?


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/master/master.cc
File src/kudu/master/master.cc:

http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/master/master.cc@a227
PS1, Line 227: 
             : 
             : 
             : 
             : 
             : 
             : 
nit: seems like this patch is more than just adding tests. Consider separating this into two patches:
- fixing the master timers, which includes master tests
- adding tests for tservers


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@803
PS1, Line 803: bool
This will be racy, so we should use an atomic<bool>


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@811
PS1, Line 811: continue;
nit: maybe sleep here too so we're not a tight loop while starting up. Same below


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@827
PS1, Line 827: {
             :                    is_started = true;
             :                    read_startup_page.join();
             :                  });
nit: in our codebase this is usually formatted as

SCOPED_CLEANUP({
  is_started = true;
  read_startup_page.join();
});

Same below.


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@846
PS1, Line 846:   // Fail a data directory.
             :   FLAGS_env_inject_eio_globs = fs_manager->GetDataRootDirs()[1];
             :   FLAGS_env_inject_eio = 1.0;
nit: it's probably worth moving this down to right below calling Start()


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@862
PS1, Line 862: ASSERT_STR_MATCHES(thread_buf.ToString(), "\"init_status\":[100|0]"
             :                                                 ".*\"read_filesystem_status\":[100|0]"
             :                                                 ".*\"read_instance_metadatafiles_status\":[100|0]"
             :                                                 ".*\"read_data_directories_status\":"
             :                                                 "([0-9]|[1-9][0-9]|100)"
             :                                                 ".*\"start_tablets_status\":([0-9]|[1-9][0-9]|100)"
             :                                                 ".*\"start_rpc_server_status\":[100|0]");
nit: maybe define this in an anonymous namespace as a helper function that validates that an input string matches this, maybe returning a bool? Same with the 100 check.


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@890
PS1, Line 890: INSTANTIATE_TEST_SUITE_P(Params, TabletServerStartupWebPageTest,
             :                          ::testing::ValuesIn(BlockManager::block_manager_types()));
Given the above tests are near identical, consider using ::testing::Combine() to cross test block manager types with whether we're injecting disk failures

See TsLocationAssignmentITest for an example



-- 
To view, visit http://gerrit.cloudera.org:8080/17915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80880ecbda6a9f6db85baaf109af7e701111328d
Gerrit-Change-Number: 17915
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 27 Oct 2021 00:39:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1959 - Implement tests for the startup web page for Kudu servers

Posted by "Abhishek Chennaka (Code Review)" <ge...@cloudera.org>.
Abhishek Chennaka has abandoned this change. ( http://gerrit.cloudera.org:8080/17915 )

Change subject: KUDU-1959 - Implement tests for the startup web page for Kudu servers
......................................................................


Abandoned
-- 
To view, visit http://gerrit.cloudera.org:8080/17915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I80880ecbda6a9f6db85baaf109af7e701111328d
Gerrit-Change-Number: 17915
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-1959 - Implement tests for the startup web page for Kudu servers

Posted by "Abhishek Chennaka (Code Review)" <ge...@cloudera.org>.
Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/17915 )

Change subject: KUDU-1959 - Implement tests for the startup web page for Kudu servers
......................................................................


Patch Set 1:

(6 comments)

All done and abandoning this patch as this is split now

http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@803
PS1, Line 803: bool
> This will be racy, so we should use an atomic<bool>
Done


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@811
PS1, Line 811: continue;
> nit: maybe sleep here too so we're not a tight loop while starting up. Same
Done


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@827
PS1, Line 827: {
             :                    is_started = true;
             :                    read_startup_page.join();
             :                  });
> nit: in our codebase this is usually formatted as
Done


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@846
PS1, Line 846:   // Fail a data directory.
             :   FLAGS_env_inject_eio_globs = fs_manager->GetDataRootDirs()[1];
             :   FLAGS_env_inject_eio = 1.0;
> nit: it's probably worth moving this down to right below calling Start()
Done, and the same for the "FLAGS_tablet_bootstrap_inject_latency_ms" in the above test.


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@862
PS1, Line 862: ASSERT_STR_MATCHES(thread_buf.ToString(), "\"init_status\":[100|0]"
             :                                                 ".*\"read_filesystem_status\":[100|0]"
             :                                                 ".*\"read_instance_metadatafiles_status\":[100|0]"
             :                                                 ".*\"read_data_directories_status\":"
             :                                                 "([0-9]|[1-9][0-9]|100)"
             :                                                 ".*\"start_tablets_status\":([0-9]|[1-9][0-9]|100)"
             :                                                 ".*\"start_rpc_server_status\":[100|0]");
> nit: maybe define this in an anonymous namespace as a helper function that 
Done


http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@890
PS1, Line 890: INSTANTIATE_TEST_SUITE_P(Params, TabletServerStartupWebPageTest,
             :                          ::testing::ValuesIn(BlockManager::block_manager_types()));
> Given the above tests are near identical, consider using ::testing::Combine
Guess it makes more sense to cross test block manager types with whether we're injecting tablet bootstrap latency. This approach seems to cover all the cases.



-- 
To view, visit http://gerrit.cloudera.org:8080/17915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80880ecbda6a9f6db85baaf109af7e701111328d
Gerrit-Change-Number: 17915
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 02 Nov 2021 05:27:56 +0000
Gerrit-HasComments: Yes