You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2018/10/06 06:29:39 UTC
[kudu-CR] [compaction] Small improvements to compaction policy tests
Will Berkeley has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11605
Change subject: [compaction] Small improvements to compaction policy tests
......................................................................
[compaction] Small improvements to compaction policy tests
To learn more about compaction policy, I spent some time reading over
compaction_policy-test.cc, and in doing so I made a bunch of small
improvements. I also added one test covering a facet of the policy
that's pretty important but isn't obviously covered in the design docs
or the existing tests.
There are no functional non-test changes in this patch. I added
scaled-down versions of a couple of slow tests that can run all the
time, and made one slow test only run when slow tests are allowed.
Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
---
M src/kudu/tablet/compaction_policy-test.cc
1 file changed, 131 insertions(+), 65 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/11605/1
--
To view, visit http://gerrit.cloudera.org:8080/11605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
Gerrit-Change-Number: 11605
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
[kudu-CR] [compaction] Small improvements to compaction policy tests
Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11605 )
Change subject: [compaction] Small improvements to compaction policy tests
......................................................................
Patch Set 2:
(4 comments)
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc
File src/kudu/tablet/compaction_policy-test.cc:
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@117
PS2, Line 117: // Similar to the above, but with some small overlap between adjacent
: // rowsets.
In compaction-policy parlance, this means is testing compaction on a "short" (keyspace height is small) tablet, right? Whereas TestSignificantOverlap is "tall"?
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@131
PS2, Line 131: %09d", i * 10000),
: StringPrintf("%09d
Was this change important?
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@178
PS2, Line 178: the
nit: remove
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@185
PS2, Line 185: // compacting {A, B, C} results in the same quality score as {A, B}, 0.67, but
: // uses more budget. By penalizing the wider solution {A, B, C}, we ensure we
: // don't waste I/O.
Not really related to this change, but thought I'd ask. Conceptually, I get that we're reducing the summed width an equal amount in both cases, but in one case we're using 2 "units" of IO and in the other we're using 3, but I haven't found a doc or comment explaining what 'quality' is in this case. Do they exist somewhere? Is there a formula for 'quality'?
--
To view, visit http://gerrit.cloudera.org:8080/11605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
Gerrit-Change-Number: 11605
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 09 Oct 2018 07:29:49 +0000
Gerrit-HasComments: Yes
[kudu-CR] [compaction] Small improvements to compaction policy tests
Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Attila Bukor, Kudu Jenkins, Andrew Wong, Adar Dembo, Todd Lipcon,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/11605
to look at the new patch set (#3).
Change subject: [compaction] Small improvements to compaction policy tests
......................................................................
[compaction] Small improvements to compaction policy tests
To learn more about compaction policy, I spent some time reading over
compaction_policy-test.cc, and in doing so I made a bunch of small
improvements. I also added one test covering a facet of the policy
that's pretty important but isn't obviously covered in the design docs
or the existing tests.
There are no functional non-test changes in this patch. I added
scaled-down versions of a couple of slow tests that can run all the
time, and made one slow test only run when slow tests are allowed.
Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
---
M src/kudu/tablet/compaction_policy-test.cc
1 file changed, 131 insertions(+), 65 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/11605/3
--
To view, visit http://gerrit.cloudera.org:8080/11605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
Gerrit-Change-Number: 11605
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
[kudu-CR] [compaction] Small improvements to compaction policy tests
Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/11605 )
Change subject: [compaction] Small improvements to compaction policy tests
......................................................................
Patch Set 3: Code-Review+2
--
To view, visit http://gerrit.cloudera.org:8080/11605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
Gerrit-Change-Number: 11605
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Tue, 09 Oct 2018 16:14:03 +0000
Gerrit-HasComments: No
[kudu-CR] [compaction] Small improvements to compaction policy tests
Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11605 )
Change subject: [compaction] Small improvements to compaction policy tests
......................................................................
Patch Set 2:
(5 comments)
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc
File src/kudu/tablet/compaction_policy-test.cc:
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@88
PS2, Line 88: so the quality
: // score, which should be
> nit: "so the quality score should be"
Boy that comment was a mess. I rekajiggered the punctuation and wording.
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@117
PS2, Line 117: // Similar to the above, but with some small overlap between adjacent
: // rowsets.
> In compaction-policy parlance, this means is testing compaction on a "short
The max height is 2 in this case, and 3 in significant overlap, and is not the main difference between these tests. This test is meant to test that we don't rowset compact if there's only a small overlap, like the above test is that we don't compact when there's none, and the below test is that we do compact when there's a lot.
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@131
PS2, Line 131: %09d", i * 10000),
: StringPrintf("%09d
> Was this change important?
The point of it was to make the comment fit better before I decided to write the integers and not the fully-padded strings. But I left it since it's still correct, and less arbitrary because it's exactly the length needed (9999 * 10000 + 11000 = 100001000 has 9 digits).
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@178
PS2, Line 178: the
> nit: remove
Done
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@185
PS2, Line 185: // compacting {A, B, C} results in the same quality score as {A, B}, 0.67, but
: // uses more budget. By penalizing the wider solution {A, B, C}, we ensure we
: // don't waste I/O.
> Not really related to this change, but thought I'd ask. Conceptually, I get
Check out docs/design-docs/compaction-policy.md. I don't think it uses the word 'quality' like the code does but it describes the ideas. The formula for the quality of a rowset compaction of a set of rowsets X is
quality(X) = (sum of widths of rowsets in X) - (width of the union of the rowsets in X)
and the width of a rowset (or any set of keys) is the probability that a key falls between the set's min key and max key.
The probability distribution we use is one that weights the keyspace by how much data there is. So if [k_0, k_1] and [k_2, k_3] are key intervals, and [k_0, k_1] has 2x as much data in it, the width of [k_0, k_1] is 2x that of [k_2, k_3].
I think Todd and a guy who left a long, long time ago were the experts on this, so I'm hoping to bootcamp a few of us into familiarity using these smaller patches, before I try to make some larger changes to the compaction policy.
--
To view, visit http://gerrit.cloudera.org:8080/11605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
Gerrit-Change-Number: 11605
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Tue, 09 Oct 2018 15:29:02 +0000
Gerrit-HasComments: Yes
[kudu-CR] [compaction] Small improvements to compaction policy tests
Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/11605 )
Change subject: [compaction] Small improvements to compaction policy tests
......................................................................
Patch Set 2:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc
File src/kudu/tablet/compaction_policy-test.cc:
http://gerrit.cloudera.org:8080/#/c/11605/2/src/kudu/tablet/compaction_policy-test.cc@88
PS2, Line 88: so the quality
: // score, which should be
nit: "so the quality score should be"
--
To view, visit http://gerrit.cloudera.org:8080/11605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
Gerrit-Change-Number: 11605
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 09 Oct 2018 12:47:14 +0000
Gerrit-HasComments: Yes
[kudu-CR] [compaction] Small improvements to compaction policy tests
Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11605 )
Change subject: [compaction] Small improvements to compaction policy tests
......................................................................
[compaction] Small improvements to compaction policy tests
To learn more about compaction policy, I spent some time reading over
compaction_policy-test.cc, and in doing so I made a bunch of small
improvements. I also added one test covering a facet of the policy
that's pretty important but isn't obviously covered in the design docs
or the existing tests.
There are no functional non-test changes in this patch. I added
scaled-down versions of a couple of slow tests that can run all the
time, and made one slow test only run when slow tests are allowed.
Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
Reviewed-on: http://gerrit.cloudera.org:8080/11605
Tested-by: Kudu Jenkins
Reviewed-by: Attila Bukor <ab...@apache.org>
---
M src/kudu/tablet/compaction_policy-test.cc
1 file changed, 131 insertions(+), 65 deletions(-)
Approvals:
Kudu Jenkins: Verified
Attila Bukor: Looks good to me, approved
--
To view, visit http://gerrit.cloudera.org:8080/11605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8f7a0e2c69f301d7fc7ca2fac657569ea240d4e3
Gerrit-Change-Number: 11605
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>