Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2017/04/18 01:55:37 UTC

[kudu-CR] WIP: KUDU-1970: node density integration test

Hello Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/6662

to review the following change.

Change subject: WIP: KUDU-1970: node density integration test
......................................................................

WIP: KUDU-1970: node density integration test

This patch introduces a new itest that models a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
produce a lot of metadata instead, as that's cheaper and can proxy for data.
The test itself isn't that interesting; most of the challenge was in running
it repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8
and num_seconds=240), I produced ~110K blocks across ~21k LBM containers,
which yielded a subsequent ~100s LBM startup time.
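
For scale, those figures work out to roughly five blocks per container (under the configured maximum of 8) and just under 5 ms of startup time per container. Here is a rough sketch of that arithmetic. It uses only the approximate numbers quoted above; the struct and function names are made up for illustration, and charging the whole startup time to containers is a simplifying assumption.

```cpp
#include <cassert>

// Approximate figures quoted in the commit message; not fresh measurements.
struct DenseNodeRun {
  double blocks;           // ~110K blocks
  double containers;       // ~21k LBM containers
  double startup_seconds;  // ~100s LBM startup
};

// Average number of blocks stored per LBM container.
double BlocksPerContainer(const DenseNodeRun& r) {
  assert(r.containers > 0);
  return r.blocks / r.containers;
}

// Average startup time per container in milliseconds, assuming (for this
// sketch only) that all of the startup time is spent on containers.
double StartupMsPerContainer(const DenseNodeRun& r) {
  assert(r.containers > 0);
  return r.startup_seconds * 1000.0 / r.containers;
}
```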

WIP because an itest isn't a great fit; we're not actually testing anything.
But, it didn't really make sense as part of the CLI tool either as that
talks to existing clusters/deployments and never spins up an
ExternalMiniCluster (and I'm not convinced it could either; there are some
Kudu testing assumptions baked into EMC). I'm open to other ideas.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
4 files changed, 261 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/6662/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6662
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6662/1//COMMIT_MSG
Commit Message:

Line 20: WIP because an itest isn't a great fit; we're not actually testing anything.
> isn't it a sort of stress test, in that we're testing larger data volumes t
Thanks for the feedback.

Ended up doing a little bit of both: it's still a "stress test" that runs in about 25s on my laptop, but it also runs as part of benchmarks.sh and collects interesting metrics.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 9:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/6662/9/src/kudu/integration-tests/dense_node-itest.cc
File src/kudu/integration-tests/dense_node-itest.cc:

PS9, Line 68: DenseNodeTest
> in general would it make sense to have this test be parameterized and also 
Good question.

I am interested in how fsync affects certain operations (e.g. CreateTablet() storm on the tserver when creating a table, TSHeartbeat() storm on the master when it starts), but I don't think it's necessary to always run with it on, at least not yet.

For now I'll add a gflag that will enable fsync if requested. Maybe once more work is done to improve scalability we'll work it into this test as a parameter.
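
A minimal sketch of what such an opt-in switch could look like. The `enable_fsync` parameter stands in for the test-level gflag (its real name is not given in this thread), and `--never_fsync` is assumed here to be the daemon flag that controls fsync; neither is taken from the patch itself.

```cpp
#include <string>
#include <vector>

// Returns the extra daemon flags the test would pass to the cluster.
// `enable_fsync` is a stand-in for a test-level gflag (hypothetical name).
// "--never_fsync" is an assumption about the daemon-side flag.
std::vector<std::string> ExtraDaemonFlags(bool enable_fsync) {
  std::vector<std::string> flags;
  // fsync stays off unless explicitly requested, matching "not always on".
  flags.push_back(enable_fsync ? "--never_fsync=false" : "--never_fsync=true");
  return flags;
}
```

Keeping the switch as a plain boolean input means the same helper could later be driven by a test parameter instead of a flag, if the test is ever parameterized.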


PS9, Line 71: DenseNodeTest
> also this test seems to be more of a benchmark than a test. change the nam
I'll add some documentation.

I was torn on whether to name it "-bench" or "-itest", since it serves as both. I went with "-itest" because there's precedent for that in benchmarks.sh: plenty of tests, itests, and stress tests are benchmarked.


PS9, Line 100: // Inject steroids into the MM.
> hum 100 seems excessive and unrealistic even for denser nodes, maybe 10 woul
The goal isn't to mirror a real deployment; it's to get the MM to flush:
1. As little data as possible, and
2. As often as possible.

I actually used 1000 here at one point, but found throughput to be better with 100. When I switch it to 10, I see about half as many blocks generated (though the number of LBM containers remains about the same).


PS9, Line 131: for (int i = 1; i < FLAGS_num_columns; i++) {
             :     b.AddColumn(Substitute("i$0", i))->Type(KuduColumnSchema::INT32)->NotNull();
             :   }
> would it make sense to also have larger string columns? seems like this wil
But the goal of the test is to maximize metadata, which means smaller columns and rowsets (and more of them) are desirable.


PS9, Line 145: us
> nit: s/us/the test
Done


http://gerrit.cloudera.org:8080/#/c/6662/9/src/kudu/integration-tests/external_mini_cluster.h
File src/kudu/integration-tests/external_mini_cluster.h:

PS9, Line 141: starting
> what is "starting" here? writing the instance file? booting up everything?
Writing out the instance file. I'll clarify the comment.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6662/4//COMMIT_MSG
Commit Message:

Line 11: lots of data (which is both time intensive and developer unfriendly), let's
> This sentence makes it sound like you are somehow writing only the metadat
Will clarify.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 5: Code-Review+1

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6662

to look at the new patch set (#2).

Change subject: KUDU-1970: node density integration test
......................................................................

KUDU-1970: node density integration test

This patch introduces a new itest that models a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
produce a lot of metadata instead, as that's cheaper and can proxy for data.
The test itself isn't that interesting; most of the challenge was in running
it repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8
and num_seconds=240), I produced ~110K blocks across ~21k LBM containers,
which yielded a subsequent ~100s LBM startup time.

I made the following modifications elsewhere to make this work:
- TestWorkload now supports arbitrary schemas.
- EMC-based tests can configure the amount of time they wait on each daemon
  process to start as the server info file isn't dumped until after FS
  startup is complete (maybe that should be changed?)
- The benchmarks.sh script runs the test with some customized parameters.
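
To illustrate the second item, here is a rough sketch of waiting for a daemon's info file with a caller-supplied timeout. It is not the EMC code; the function name and the 10 ms poll interval are placeholders chosen for this example.

```cpp
#include <chrono>
#include <fstream>
#include <string>
#include <thread>

// Polls until `path` can be opened for reading or `timeout` elapses.
// Returns true if the file appeared in time, false otherwise.
bool WaitForInfoFile(const std::string& path,
                     std::chrono::milliseconds timeout) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    if (std::ifstream(path).good()) return true;
    if (std::chrono::steady_clock::now() >= deadline) return false;
    // Poll interval is arbitrary; a dense node just needs a longer timeout.
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
}
```

A test expecting a long FS startup would simply pass a larger timeout than the default.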

I also snuck in changes to remove an unused variable from random.h and to
switch TestWorkload from kudu::Thread to std::thread.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.h
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/scripts/benchmarks.sh
M src/kudu/tools/data_gen_util.cc
M src/kudu/tools/data_gen_util.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/util/random.h
13 files changed, 336 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/6662/2
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Dan Burkert,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6662

to look at the new patch set (#10).

Change subject: KUDU-1970: node density integration test
......................................................................

KUDU-1970: node density integration test

This patch introduces a new itest that simulates a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
run a workload that produces a lot of metadata with a minimal amount of
data. This is cheaper, and the metadata can proxy for data in areas we care
about (such as start up time, thread count, memory usage, etc.). The test
itself isn't that interesting; most of the challenge was in running it
repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8,
num_tablets=1000, num_seconds=240), I produced ~110K blocks across ~21k LBM
containers, which yielded a subsequent ~100s LBM startup time.

I made the following modifications elsewhere to make this work:
- TestWorkload now supports arbitrary schemas.
- EMC-based tests can configure the amount of time they wait on each daemon
  process to start as the server info file isn't dumped until after FS
  startup is complete (maybe that should be changed?)
- The benchmarks.sh script runs the test with some customized parameters.

I also snuck in changes to remove an unused variable from random.h and to
switch TestWorkload from kudu::Thread to std::thread.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.h
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/scripts/benchmarks.sh
M src/kudu/tools/data_gen_util.cc
M src/kudu/tools/data_gen_util.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/util/random.h
13 files changed, 352 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/6662/10
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1970: node density integration test

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6662/9/src/kudu/integration-tests/dense_node-itest.cc
File src/kudu/integration-tests/dense_node-itest.cc:

PS9, Line 71: 
> I'll add some documentation.
right, most are benchmarks among other tests or actual tests that test for correctness, which this one doesn't do. maybe we should establish a new precedent? it'd be nice to have clear test and file names for these kinds of tests.


PS9, Line 100: Substitute("--log_container_max
> The goal isn't to mirror a real deployment; it's to get the MM to flush:
sounds reasonable


PS9, Line 131: // With the amount of data we're going to write, we need to make sure the
             :   // tserver has enough time to start back up (startup is only considered to be
             :   /
> But the goal of the test is to maximize metadata, which means smaller colum
k, that's fair


http://gerrit.cloudera.org:8080/#/c/6662/10/src/kudu/integration-tests/test_workload.cc
File src/kudu/integration-tests/test_workload.cc:

PS10, Line 233: // Do some sanity checks on the schema. They reflect how the rest of
              :     // TestWorkload is going to use the schema.
              :     CHECK_GT(schema_.num_columns(), 0) << "Schema should have at least one column";
              :     vector<int> key_indexes;
              :     schema_.GetPrimaryKeyColumnIndexes(&key_indexes);
              :     CHECK_EQ(1, key_indexes.size()) << "Schema should have just one key column";
              :     CHECK_EQ(0, key_indexes[0]) << "Schema's key column should be index 0";
              :     KuduColumnSchema key = schema_.Column(0);
              :     CHECK_EQ("key", key.name()) << "Schema column should be named 'key'";
              :     CHECK_EQ(KuduColumnSchema::INT32, key.type()) << "Schema key column should be of type INT32";
why not do this when the schema is set? we know that the simple schema abides by these rules.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6662/9/src/kudu/integration-tests/dense_node-itest.cc
File src/kudu/integration-tests/dense_node-itest.cc:

PS9, Line 68: DenseNodeTest
in general would it make sense to have this test be parameterized and also run with fsync enabled?


PS9, Line 71: DenseNodeTest
an intro to what this test targets here or in the class header would be nice


PS9, Line 71: DenseNodeTest
also this test seems to be more of a benchmark than a test. change the name?


PS9, Line 100: // Inject steroids into the MM.
hum 100 seems excessive and unrealistic even for denser nodes, maybe 10 would be more acceptable?


PS9, Line 131: for (int i = 1; i < FLAGS_num_columns; i++) {
             :     b.AddColumn(Substitute("i$0", i))->Type(KuduColumnSchema::INT32)->NotNull();
             :   }
would it make sense to also have larger string columns? seems like this will build very symmetric rowsets whereas asymmetric ones where string columns dominate the size of a rowset are common


PS9, Line 145: us
nit: s/us/the test


http://gerrit.cloudera.org:8080/#/c/6662/9/src/kudu/integration-tests/external_mini_cluster.h
File src/kudu/integration-tests/external_mini_cluster.h:

PS9, Line 141: starting
what is "starting" here? writing the instance file? booting up everything?


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6662/9/src/kudu/integration-tests/dense_node-itest.cc
File src/kudu/integration-tests/dense_node-itest.cc:

PS9, Line 71: 
> right, most are benchmarks among other tests or actual tests that test for 
I don't think a new convention is necessary. FWIW, I see enough overlap with existing tests called from benchmarks.sh: tablet_server-stress-test and full_stack-insert-scan-test, to name two.


http://gerrit.cloudera.org:8080/#/c/6662/10/src/kudu/integration-tests/test_workload.cc
File src/kudu/integration-tests/test_workload.cc:

PS10, Line 233: // Do some sanity checks on the schema. They reflect how the rest of
              :     // TestWorkload is going to use the schema.
              :     CHECK_GT(schema_.num_columns(), 0) << "Schema should have at least one column";
              :     vector<int> key_indexes;
              :     schema_.GetPrimaryKeyColumnIndexes(&key_indexes);
              :     CHECK_EQ(1, key_indexes.size()) << "Schema should have just one key column";
              :     CHECK_EQ(0, key_indexes[0]) << "Schema's key column should be index 0";
              :     KuduColumnSchema key = schema_.Column(0);
              :     CHECK_EQ("key", key.name()) << "Schema column should be named 'key'";
              :     CHECK_EQ(KuduColumnSchema::INT32, key.type()) << "Schema key column should be of type INT32";
> why not do this when the schema is set? we know that the simple schema abid
Done
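
For readers following along, a sketch of the "validate when the schema is set" shape being agreed on here. The types are minimal stand-ins rather than the Kudu client classes, and exceptions replace the CHECK macros purely so the example is self-contained; the real code may differ.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-ins for KuduColumnSchema / KuduSchema (illustration only).
enum class ColType { INT32, STRING };
struct Column {
  std::string name;
  ColType type;
  bool is_key;
};
using Schema = std::vector<Column>;

class Workload {
 public:
  // Validates eagerly, so a bad schema fails where it is supplied rather
  // than later during setup. The checks mirror the quoted sanity checks.
  void set_schema(Schema schema) {
    if (schema.empty()) {
      throw std::invalid_argument("Schema should have at least one column");
    }
    int num_keys = 0;
    for (const Column& c : schema) {
      if (c.is_key) num_keys++;
    }
    if (num_keys != 1) {
      throw std::invalid_argument("Schema should have just one key column");
    }
    if (!schema[0].is_key) {
      throw std::invalid_argument("Schema's key column should be index 0");
    }
    if (schema[0].name != "key") {
      throw std::invalid_argument("Schema column should be named 'key'");
    }
    if (schema[0].type != ColType::INT32) {
      throw std::invalid_argument("Schema key column should be of type INT32");
    }
    schema_ = std::move(schema);  // Only stored once every check has passed.
  }

  const Schema& schema() const { return schema_; }

 private:
  Schema schema_;
};
```

With this shape, the default simple schema never pays for the checks, and a rejected schema leaves the previously set one untouched.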


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6662

to look at the new patch set (#5).

Change subject: KUDU-1970: node density integration test
......................................................................

KUDU-1970: node density integration test

This patch introduces a new itest that models a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
produce a lot of metadata instead, as that's cheaper and can proxy for data.
The test itself isn't that interesting; most of the challenge was in running
it repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8,
num_tablets=1000, num_seconds=240), I produced ~110K blocks across ~21k LBM
containers, which yielded a subsequent ~100s LBM startup time.

I made the following modifications elsewhere to make this work:
- TestWorkload now supports arbitrary schemas.
- EMC-based tests can configure the amount of time they wait on each daemon
  process to start as the server info file isn't dumped until after FS
  startup is complete (maybe that should be changed?)
- The benchmarks.sh script runs the test with some customized parameters.

I also snuck in changes to remove an unused variable from random.h and to
switch TestWorkload from kudu::Thread to std::thread.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.h
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/scripts/benchmarks.sh
M src/kudu/tools/data_gen_util.cc
M src/kudu/tools/data_gen_util.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/util/random.h
13 files changed, 337 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/6662/5
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP: KUDU-1970: node density integration test

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: WIP: KUDU-1970: node density integration test
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6662/1//COMMIT_MSG
Commit Message:

Line 20: WIP because an itest isn't a great fit; we're not actually testing anything.
isn't it a sort of stress test, in that we're testing larger data volumes than other tests do?

I've been pondering for years now the idea of changing KUDU_ALLOW_SLOW_TESTS=1 to KUDU_TEST_LENGTH={fast,slow,veryslow}, and this could be the first of the third category, which we might not run pre-commit but could run nightly or post-commit.

It could also be named '-bench' as in rpc-bench, and added to the benchmarks script, so we could plot startup time performance, thread count, memory usage, etc, of the test like we plot other numeric metrics from build to build.
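
A quick sketch of how the three-level idea could be wired up. KUDU_TEST_LENGTH is only the proposal floated above, not an existing variable, and the enum, function names, and fallback-to-fast behavior are assumptions made for this example.

```cpp
#include <cstdlib>
#include <string>

enum class TestLength { kFast = 0, kSlow = 1, kVerySlow = 2 };

// Maps the proposed values {fast,slow,veryslow} to an enum. Unset or
// unrecognized values fall back to kFast (a choice made for this sketch).
TestLength ParseTestLength(const char* value) {
  if (value == nullptr) return TestLength::kFast;
  const std::string v(value);
  if (v == "slow") return TestLength::kSlow;
  if (v == "veryslow") return TestLength::kVerySlow;
  return TestLength::kFast;
}

// Reads the proposed (not existing) environment variable.
TestLength TestLengthFromEnv() {
  return ParseTestLength(std::getenv("KUDU_TEST_LENGTH"));
}

// A test tagged `required` runs only when the configured length is at
// least that long, e.g. veryslow tests are skipped in a "slow" run.
bool ShouldRun(TestLength configured, TestLength required) {
  return static_cast<int>(configured) >= static_cast<int>(required);
}
```

Under this scheme a pre-commit run would use "slow" and skip anything tagged veryslow, while a nightly run would set "veryslow" and run everything.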


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Dan Burkert, David Ribeiro Alves, Todd Lipcon,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6662

to look at the new patch set (#8).

Change subject: KUDU-1970: node density integration test
......................................................................

KUDU-1970: node density integration test

This patch introduces a new itest that simulates a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
run a workload that produces a lot of metadata with a minimal amount of
data. This is cheaper, and the metadata can proxy for data in areas we care
about (such as start up time, thread count, memory usage, etc.). The test
itself isn't that interesting; most of the challenge was in running it
repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8,
num_tablets=1000, num_seconds=240), I produced ~110K blocks across ~21k LBM
containers, which yielded a subsequent ~100s LBM startup time.

I made the following modifications elsewhere to make this work:
- TestWorkload now supports arbitrary schemas.
- EMC-based tests can configure the amount of time they wait on each daemon
  process to start as the server info file isn't dumped until after FS
  startup is complete (maybe that should be changed?)
- The benchmarks.sh script runs the test with some customized parameters.

I also snuck in changes to remove an unused variable from random.h and to
switch TestWorkload from kudu::Thread to std::thread.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.h
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/scripts/benchmarks.sh
M src/kudu/tools/data_gen_util.cc
M src/kudu/tools/data_gen_util.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/util/random.h
13 files changed, 337 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/6662/8
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6662

to look at the new patch set (#11).

Change subject: KUDU-1970: node density integration test
......................................................................

KUDU-1970: node density integration test

This patch introduces a new itest that simulates a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
run a workload that produces a lot of metadata with a minimal amount of
data. This is cheaper, and the metadata can proxy for data in areas we care
about (such as start up time, thread count, memory usage, etc.). The test
itself isn't that interesting; most of the challenge was in running it
repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8,
num_tablets=1000, num_seconds=240), I produced ~110K blocks across ~21k LBM
containers, which yielded a subsequent ~100s LBM startup time.

I made the following modifications elsewhere to make this work:
- TestWorkload now supports arbitrary schemas.
- EMC-based tests can configure the amount of time they wait on each daemon
  process to start as the server info file isn't dumped until after FS
  startup is complete (maybe that should be changed?)
- The benchmarks.sh script runs the test with some customized parameters.

I also snuck in changes to remove an unused variable from random.h and to
switch TestWorkload from kudu::Thread to std::thread.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.h
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/scripts/benchmarks.sh
M src/kudu/tools/data_gen_util.cc
M src/kudu/tools/data_gen_util.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/util/random.h
13 files changed, 355 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/6662/11
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1970: node density integration test

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 11: Code-Review+2

Chatting about the naming situation on another channel, Adar made the point that this is more of a stress test than an actual benchmark, which makes sense.

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6662

to look at the new patch set (#6).

Change subject: KUDU-1970: node density integration test
......................................................................

KUDU-1970: node density integration test

This patch introduces a new itest that simulates a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
run a workload that produces a lot of metadata with a minimal amount of
data. This is cheaper, and the metadata can proxy for data in areas we care
about (such as start up time, thread count, memory usage, etc.). The test
itself isn't that interesting; most of the challenge was in running it
repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8,
num_tablets=1000, num_seconds=240), I produced ~110K blocks across ~21k LBM
containers, which yielded a subsequent ~100s LBM startup time.

I made the following modifications elsewhere to make this work:
- TestWorkload now supports arbitrary schemas.
- EMC-based tests can configure the amount of time they wait on each daemon
  process to start as the server info file isn't dumped until after FS
  startup is complete (maybe that should be changed?)
- The benchmarks.sh script runs the test with some customized parameters.

I also snuck in changes to remove an unused variable from random.h and to
switch TestWorkload from kudu::Thread to std::thread.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.h
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/scripts/benchmarks.sh
M src/kudu/tools/data_gen_util.cc
M src/kudu/tools/data_gen_util.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/util/random.h
13 files changed, 337 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/6662/6
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged.

Change subject: KUDU-1970: node density integration test
......................................................................


KUDU-1970: node density integration test

This patch introduces a new itest that simulates a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
run a workload that produces a lot of metadata with a minimal amount of
data. This is cheaper, and the metadata can proxy for data in areas we care
about (such as start up time, thread count, memory usage, etc.). The test
itself isn't that interesting; most of the challenge was in running it
repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8,
num_tablets=1000, num_seconds=240), I produced ~110K blocks across ~21k LBM
containers, which yielded a subsequent ~100s LBM startup time.

I made the following modifications elsewhere to make this work:
- TestWorkload now supports arbitrary schemas.
- EMC-based tests can configure the amount of time they wait on each daemon
  process to start as the server info file isn't dumped until after FS
  startup is complete (maybe that should be changed?)
- The benchmarks.sh script runs the test with some customized parameters.

I also snuck in changes to remove an unused variable from random.h and to
switch TestWorkload from kudu::Thread to std::thread.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Reviewed-on: http://gerrit.cloudera.org:8080/6662
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.h
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/scripts/benchmarks.sh
M src/kudu/tools/data_gen_util.cc
M src/kudu/tools/data_gen_util.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/util/random.h
13 files changed, 355 insertions(+), 48 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified



-- 

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1970: node density integration test

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6662/9/src/kudu/integration-tests/dense_node-itest.cc
File src/kudu/integration-tests/dense_node-itest.cc:

PS9, Line 71: DenseNodeTest
> I don't think a new convention is necessary, and FWIW I see enough overlap 
Right, hence the "most". The point is: do you agree that it would be nice for stress-only or bench-only tests to have a bench suffix? FWIW, IMO it would be handy to be able to easily identify the benchmark binaries among the others when looking for ways to measure something. If not, I'm OK with that too; I just don't want to avoid making things better because there is no precedent.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 9: Verified+1

ASAN build failed during cmake, unrelated:

19:04:54    Change Dir: /home/jenkins-slave/workspace/kudu-master/3/build/asan/CMakeFiles/CMakeTmp
19:04:54   Run Build Command:"/usr/bin/make" "cmTC_66d15/fast"
19:04:54   /usr/bin/make -f CMakeFiles/cmTC_66d15.dir/build.make
19:04:54   CMakeFiles/cmTC_66d15.dir/build
19:04:54   make[1]: Entering directory
19:04:54   `/home/jenkins-slave/workspace/kudu-master/3/build/asan/CMakeFiles/CMakeTmp'
19:04:54   Building C object CMakeFiles/cmTC_66d15.dir/testCCompiler.c.o
19:04:54   /home/jenkins-slave/workspace/kudu-master/3/build-support/ccache-clang/clang
19:04:54   -o CMakeFiles/cmTC_66d15.dir/testCCompiler.c.o -c
19:04:54   /home/jenkins-slave/workspace/kudu-master/3/build/asan/CMakeFiles/CMakeTmp/testCCompiler.c
19:04:54   Linking C executable cmTC_66d15
19:04:54   /home/jenkins-slave/workspace/kudu-master/3/thirdparty/installed/common/bin/cmake
19:04:54   -E cmake_link_script CMakeFiles/cmTC_66d15.dir/link.txt --verbose=1
19:04:54   /home/jenkins-slave/workspace/kudu-master/3/build-support/ccache-clang/clang
19:04:54   CMakeFiles/cmTC_66d15.dir/testCCompiler.c.o -o cmTC_66d15 -rdynamic
19:04:54   CMakeFiles/cmTC_66d15.dir/testCCompiler.c.o: file not recognized: File
19:04:54   truncated
19:04:54   clang-3.9: error: linker command failed with exit code 1 (use -v to see
19:04:54   invocation)
19:04:54   make[1]: *** [cmTC_66d15] Error 1
19:04:54   make[1]: Leaving directory
19:04:54   `/home/jenkins-slave/workspace/kudu-master/3/build/asan/CMakeFiles/CMakeTmp'
19:04:54   make: *** [cmTC_66d15/fast] Error 2

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1970: node density integration test

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6662/4//COMMIT_MSG
Commit Message:

Line 11: lots of data (which is both time intensive and developer unfriendly), let's
This sentence makes it sound like you are somehow writing only the metadata.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6662

to look at the new patch set (#4).

Change subject: KUDU-1970: node density integration test
......................................................................

KUDU-1970: node density integration test

This patch introduces a new itest that models a storage-dense Kudu
deployment. The idea is simple: rather than actually generating and storing
lots of data (which is both time intensive and developer unfriendly), let's
produce a lot of metadata instead, as that's cheaper and can proxy for data.
The test itself isn't that interesting; most of the challenge was in running
it repeatedly to determine which flag values yielded the most metadata.

In a run of the test on a 48 core el6.6 machine (max_blocks_per_container=8,
num_tablets=1000, num_seconds=240), I produced ~110K blocks across ~21k LBM
containers, which yielded a subsequent ~100s LBM startup time.

I made the following modifications elsewhere to make this work:
- TestWorkload now supports arbitrary schemas.
- EMC-based tests can configure the amount of time they wait on each daemon
  process to start as the server info file isn't dumped until after FS
  startup is complete (maybe that should be changed?)
- The benchmarks.sh script runs the test with some customized parameters.

I also snuck in changes to remove an unused variable from random.h and to
switch TestWorkload from kudu::Thread to std::thread.

Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.h
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/scripts/benchmarks.sh
M src/kudu/tools/data_gen_util.cc
M src/kudu/tools/data_gen_util.h
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/util/random.h
13 files changed, 337 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/6662/4
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 8: Verified+1

Unrelated ASAN failure, filed as KUDU-1987.

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1970: node density integration test

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 8: Code-Review+1

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1970: node density integration test

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 11:

Here's your precedent (unmerged) :) https://gerrit.cloudera.org/#/c/6696/1/src/kudu/util/cache-bench.cc

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1970: node density integration test

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1970: node density integration test
......................................................................


Patch Set 7: Verified+1

Unrelated failure in ITClient.

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9b5d01557eb41d386ce92f576ed01ec658e8e7d
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No