You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2017/03/19 23:39:29 UTC

[kudu-CR] cfile-test: add a 100M-string file with low cardinality

Hello David Ribeiro Alves, Andrew Wong,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/6432

to review the following change.

Change subject: cfile-test: add a 100M-string file with low cardinality
......................................................................

cfile-test: add a 100M-string file with low cardinality

This serves as a better benchmark of the dictionary-encoded string path.

Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
---
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
2 files changed, 30 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/32/6432/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>

[kudu-CR] cfile-test: add a 100M-string file with low cardinality

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6432

to look at the new patch set (#2).

Change subject: cfile-test: add a 100M-string file with low cardinality
......................................................................

cfile-test: add a 100M-string file with low cardinality

This serves as a better benchmark of the dictionary-encoded string path.

Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
---
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/tablet/compaction.cc
3 files changed, 36 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/32/6432/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] cfile-test: add a 100M-string file with low cardinality

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: cfile-test: add a 100M-string file with low cardinality
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6432/1/src/kudu/cfile/cfile-test-base.h
File src/kudu/cfile/cfile-test-base.h:

PS1, Line 372: to match
             :     // the output block size of compaction.
I'm not following why this is the case. Maybe elaborate on why compaction block size is relevant. Is it because compaction occurs 100 entries at a time, so we're getting a more realistic benchmark?

Unsure of where the "block size" is coming in, since we're dealing with num_entries below, which seems different.


-- 
To view, visit http://gerrit.cloudera.org:8080/6432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

[kudu-CR] cfile-test: add a 100M-string file with low cardinality

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: cfile-test: add a 100M-string file with low cardinality
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6432/1/src/kudu/cfile/cfile-test-base.h
File src/kudu/cfile/cfile-test-base.h:

PS1, Line 372: to match
             :     // the output block size of compaction.
> I think that Todd's referring to the common block size that we use in Flush
Ah, got it. May be worth adding extra context to the comment, something along the lines of:
"Since flushes commonly occur during compactions, we attempt to flush the max batch size for compaction: 100 entries."


-- 
To view, visit http://gerrit.cloudera.org:8080/6432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

[kudu-CR] cfile-test: add a 100M-string file with low cardinality

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: cfile-test: add a 100M-string file with low cardinality
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6432/1/src/kudu/cfile/cfile-test-base.h
File src/kudu/cfile/cfile-test-base.h:

PS1, Line 372: to match
             :     // the output block size of compaction.
> Ah, got it. May be worth adding extra context to the comment, something alo
k, I pulled out a constant and then referred to it here in the comment. Unfortunately I couldn't actually make the _code_ refer to it, because I didn't want to introduce a backwards dependency from cfile/ -> tablet/, but at least now it's more clear.


http://gerrit.cloudera.org:8080/#/c/6432/1/src/kudu/cfile/cfile-test.cc
File src/kudu/cfile/cfile-test.cc:

Line 35: #include "kudu/util/test_macros.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] cfile-test: add a 100M-string file with low cardinality

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: cfile-test: add a 100M-string file with low cardinality
......................................................................


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] cfile-test: add a 100M-string file with low cardinality

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: cfile-test: add a 100M-string file with low cardinality
......................................................................


cfile-test: add a 100M-string file with low cardinality

This serves as a better benchmark of the dictionary-encoded string path.

Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Reviewed-on: http://gerrit.cloudera.org:8080/6432
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <dr...@apache.org>
---
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/tablet/compaction.cc
3 files changed, 36 insertions(+), 22 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] cfile-test: add a 100M-string file with low cardinality

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: cfile-test: add a 100M-string file with low cardinality
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6432/1/src/kudu/cfile/cfile-test-base.h
File src/kudu/cfile/cfile-test-base.h:

PS1, Line 372: to match
             :     // the output block size of compaction.
> I'm not following why this is the case. Maybe elaborate on why compaction b
I think that Todd's referring to the common block size that we use in FlushCompactionInput() in compaction.cc and that he wants to make the benchmark more realistic by using the same block size.

In any case I think Andrew's confusion raises a point: Could we pull a flag for that block size and reuse it here?


-- 
To view, visit http://gerrit.cloudera.org:8080/6432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9c92f3b6c04c4c2ef50497ad1dc1380b7152d9e9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes