You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2017/05/23 23:16:21 UTC

[kudu-CR] KUDU-1755 Part 2: Improve table on-disk size metric

Will Berkeley has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6968

Change subject: KUDU-1755 Part 2: Improve table on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve table on-disk size metric

This adds WAL segment and (an estimate of) cmeta sizes to the tablet
on-disk metric. This is a little bit trickier than previous
improvements because this info is held by the TabletReplica, not the
Tablet, so the on-disk metric itself had to be relocated and plugged
in to TabletReplica::EstimateOnDiskSize.

The cmeta size is estimated because it would require adding extra
plumbing to get the size, and it doesn't seem worth it because the
component we can't compute precisely is about 40 bytes.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
3 files changed, 43 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#13).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 261 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 21:

Something mysterious happened and so 19 + 1 = 21.

-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

Line 182:   std::atomic<uint64_t> on_disk_size_;
This class isn't thread-safe so there is no reason to use an atomic here


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 17:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS17, Line 909: std::lock_guard
Does this need to be lock_guard? Could it be lock_shared<rw_spinlock> l(state_lock_.get_lock()) instead?


http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

PS17, Line 736: OnDiskDataSize()
nit: may be worth commenting in the API declaration that this does not include undos/redos? I think it only says "no metadata" right now. 

Intuitively, I would expect OnDiskDataSize() to be larger than OnDiskDataSizeNoUndos(). Maybe OnDiskDataSizeWithRedos or something?


http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/tablet/tablet_metadata-test.cc
File src/kudu/tablet/tablet_metadata-test.cc:

PS17, Line 138: The on-disk size 
nit: maybe "The tablet metadata"?


http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS17, Line 193: ) != nullptr
nit: at L701, you have a similar check that doesn't make this explicit comparison to nullptr


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#11).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 258 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/13/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

PS13, Line 244: int64
> Thanks.  Just FYI, here is some more fun facts about that: https://news.yco
Show Mike :)


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS19, Line 348:   // A rowset's space-occupying components are as follows:
              :   // - cfile set
              :   //   - base data
              :   //   - bloom file
              :   //   - ad hoc index
              :   // - delta files
              :   //   - UNDO deltas
              :   //   - REDO deltas
              : 
              :   // Return the size on-disk of this rowset, in bytes.
              :   uint64_t OnDiskSize() const OVERRIDE;
              : 
              :   // Return the on-disk size of this rowset's cfile set, in bytes.
              :   uint64_t CFileSetOnDiskSize() const;
              : 
              :   // Return the size on-disk of the base data (no deltas) in this rowset's cfile set, in bytes.
              :   uint64_t OnDiskDataSize() const OVERRIDE;
              : 
              :   // Return the size on-disk of this rowset's REDO deltas, in bytes.
              :   uint64_t RedoDeltaOnDiskSize() const;
              : 
              :   // Return the size on-disk of this rowset's cfile set's base data and this rowset's REDO deltas.
              :   uint64_t OnDiskDataSizeNoUndos() const OVERRIDE;
              : 
              :   size_t DeltaMemStoreSize() const OVERRIDE;
> my problem with these functions are they often return overlapping info and 
a possible alternative would be to have individual methods for each of the items, but then that would open the door to inconsistencies because the component lock was released in between getting the values for different components. wdyt?


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

Line 690: Status DiskRowSet::CountRows(rowid_t *count) const {
> 1. I think you are right, but the pattern in the class is to use component_
On #1, I'm pretty certain that we need to acquire the lock to get a ref to base_data_, but what I was wondering was whether we need to hold component_lock_ while calling base_data_->OnDiskSize().

#2 is not done in rev 13, as far as I can tell.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 645: 
              : Status TabletRepli
> Adar had misgivings about this form. The comment was in PS10 but I'll repro
> TabletReplica::Shutdown() resets consensus_ with lock_ held.

How old was that comment? That is no longer the case.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 15:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6968/13/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

PS13, Line 244: d to 
> Thanks.  Just FYI, here is some more fun facts about that: https://news.yco
That is a really cool post from Joshua Bloch. Anyway, we have to decide whether we prefer to have one fewer bit available or have 0 - 1 == INT_MAX. The former seems better to me. :)


http://gerrit.cloudera.org:8080/#/c/6968/15/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

PS15, Line 736: base_data_->OnDiskDataSize()
I believe this should be OnDiskDataSize()


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 10: Code-Review+1

(1 comment)

Todd and/or Mike should take another look.

http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

Line 385:   status_pb_out->set_estimated_on_disk_size(tablet_size);
> I'll take a local ref.
PS10 seems to have reverted this back to "tablet size without cmeta size". Was that intentional?


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve table on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve table on-disk size metric
......................................................................


Patch Set 1:

I just want a Jenkins run. Needs tests.

-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 11:

Odd...I compiled and ran some tests just before pushing this to review. And TSAN worked.

-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#6).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. This is a little bit trickier than previous improvements
because this info is held by the TabletReplica, not the Tablet,
so the on-disk metric itself had to be relocated and plugged in
to TabletReplica::EstimateOnDiskSize.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus.h
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
17 files changed, 142 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#4).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment sizes to the tablet on-disk metric. This is a
little bit trickier than previous improvements because this info is
held by the TabletReplica, not the Tablet, so the on-disk metric
itself had to be relocated and plugged in to TabletReplica::EstimateOnDiskSize.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus.h
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_state.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
15 files changed, 123 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#5).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. This is a little bit trickier than previous improvements
because this info is held by the TabletReplica, not the Tablet,
so the on-disk metric itself had to be relocated and plugged in
to TabletReplica::EstimateOnDiskSize.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus.h
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
17 files changed, 135 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 220:   on_disk_size_.store(pb_.ByteSize(), std::memory_order_relaxed);
> this isn't actually the on disk size because of PB container framing
yea, I wonder whether it really even is worth tracking this. cmetas are pretty small, can't see anyone really getting upset over these few-kb files


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#19).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 279 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/19
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has submitted this change and it was merged.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Reviewed-on: http://gerrit.cloudera.org:8080/6968
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
Tested-by: Kudu Jenkins
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
31 files changed, 344 insertions(+), 115 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 24
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 15:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6968/16/src/kudu/consensus/log-test.cc
File src/kudu/consensus/log-test.cc:

PS16, Line 1166: ASSERT_EQ(anchors.size(), 3);
nit: the expected value comes first


http://gerrit.cloudera.org:8080/#/c/6968/15/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS15, Line 910: log_state_ == kLogClosed
What happens with the log when it closes?  Does it effective size go zero in that case?

Would it make sense to cache the size of the log upon closing,  so it would be available even in that case?  Or this state is really ephemeral and we should not bother about that at all because a closed log is not going to hand around any longer?


http://gerrit.cloudera.org:8080/#/c/6968/16/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS16, Line 24: #include <stddef.h>
nit: prefer <cstddef> for C++11 (I need to update IWYU stuff to give proper suggestions).


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/13/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

PS13, Line 244: d to 
> It's just for consistency with the other on disk size metrics. They were a 
Thanks.  Just FYI, here is some more fun facts about that: https://news.ycombinator.com/item?id=8690952
https://research.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve table on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#3).

Change subject: KUDU-1755 Part 2: Improve table on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve table on-disk size metric

This adds WAL segment sizes to the tablet on-disk metric. This is a
little bit trickier than previous improvements because this info is
held by the TabletReplica, not the Tablet, so the on-disk metric
itself had to be relocated and plugged in to TabletReplica::EstimateOnDiskSize.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
3 files changed, 42 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

Line 241:   uint64_t on_disk_size_;
> Why doesn't this need to be atomic (like TabletReplica::on_disk_size_)?
Ah, I just read your earlier discussion with Mike. Nevermind.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#15).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 275 insertions(+), 63 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/15
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 19:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

PS19, Line 210:   // Returns 0 if the log is shut down.
> hrm, this might be misleading, as the WAL might still be taking space (e.g.
Mmm after looking at it again I decided it's better to listen to Alexey and cache the size so we can return it if the log is closed but not yet deleted. Even if a closed log implies the log will be deleted "soon", it's better to be correct while the log still exists, and to be prepared for potential changes to the tablet lifecycle in the future.


http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS19, Line 364: OnDiskDataSize
> maybe OnDiskBaseDataSize()?
Done


PS19, Line 348:   // A rowset's space-occupying components are as follows:
              :   // - cfile set
              :   //   - base data
              :   //   - bloom file
              :   //   - ad hoc index
              :   // - delta files
              :   //   - UNDO deltas
              :   //   - REDO deltas
              : 
              :   // Return the size on-disk of this rowset, in bytes.
              :   uint64_t OnDiskSize() const OVERRIDE;
              : 
              :   // Return the on-disk size of this rowset's cfile set, in bytes.
              :   uint64_t CFileSetOnDiskSize() const;
              : 
              :   // Return the size on-disk of the base data (no deltas) in this rowset's cfile set, in bytes.
              :   uint64_t OnDiskDataSize() const OVERRIDE;
              : 
              :   // Return the size on-disk of this rowset's REDO deltas, in bytes.
              :   uint64_t RedoDeltaOnDiskSize() const;
              : 
              :   // Return the size on-disk of this rowset's cfile set's base data and this rowset's REDO deltas.
              :   uint64_t OnDiskDataSizeNoUndos() const OVERRIDE;
              : 
              :   size_t DeltaMemStoreSize() const OVERRIDE;
> this is getting kind of messy. would it be too much work to clean this up a
I don't think it's worth it, for the following reasons:
1. There's more subdivisions than what's listed there that are needed. We also need the cfile set base data size, for example. The result would be one field for each piece of each component and many fields.
2. It wouldn't really remove the need for the above functions. Each of them has a specific use, so if there were a summary struct then either it would need to be reprocessed in a new definition of these functions or the calls to the functions would be replaced by the functions' logic, which would mix the disk space logic with, e.g. the compaction selection logic.
3. The functions are structured to do the least locking to get what they need. If one function returned a summary, for each 2 or 3 bits actually needed we would acquire the locks to gather all the information.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 220:   on_disk_size_.store(pb_.ByteSize(), std::memory_order_relaxed);
> I think if we want to include cmeta let's use the correct number. If we don
Right, I just didn't go through and re-plumb yet because it might get axed.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 17: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#18).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 279 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/18
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 15:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6968/13/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

PS13, Line 244: d to 
> That is a really cool post from Joshua Bloch. Anyway, we have to decide whe
I choose int64_t.


http://gerrit.cloudera.org:8080/#/c/6968/15/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

PS15, Line 736: base_data_->OnDiskDataSize()
> I believe this should be OnDiskDataSize()
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#23).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
31 files changed, 344 insertions(+), 115 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/23
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 220:   on_disk_size_.store(pb_.ByteSize(), std::memory_order_relaxed);
> I think it's also the only item from 1755 that is essentially fixed size an
I think if we want to include cmeta let's use the correct number. If we don't want to include it then it's a moot point.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#9).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
22 files changed, 232 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/4//COMMIT_MSG
Commit Message:

Line 13: 
Add a note about the new OnDIskDataSize tablet metric


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 13:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6968/13/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 305:       flush_count_for_tests_(0) {
> consider initializing on_disk_size_ with 0 as well.
Done


http://gerrit.cloudera.org:8080/#/c/6968/13/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

PS13, Line 244: int64
> What is the reason to have this int64 instead of uint64_t which corresponds
It's just for consistency with the other on disk size metrics. They were a mix of things but Mike requested they be int64_t according to GSG guidelines for signed vs. unsigned (speaking of which this should be an int64_t). Comment added.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

Line 690: Status DiskRowSet::CountRows(rowid_t *count) const {
> On #1, I'm pretty certain that we need to acquire the lock to get a ref to 
Yes, I think we can take a ref under the lock and then call OnDiskSize() without the lock. delta_tracker_ doesn't require the lock at all. I've rewritten a couple of the methods to reflect this.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 645: 
              : Status TabletRepli
> > TabletReplica::Shutdown() resets consensus_ with lock_ held.
Old. Done.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6968/13/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 305:       flush_count_for_tests_(0) {
consider initializing on_disk_size_ with 0 as well.


http://gerrit.cloudera.org:8080/#/c/6968/13/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

PS13, Line 244: int64
What is the reason to have this int64 instead of uint64_t which corresponds for the call-site of fs_manager_->env()->GetFileSize(path, &on_disk_size) ?

If there is anything particular, could you add a comment on that, please?


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

Line 338:   uint64_t OnDiskSize() const OVERRIDE;
Mind adding a comment here to differentiate from OnDiskDataSize()?


http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

Line 338:   // Return the total size on-disk of this rowset's data, in bytes.
Maybe add (excluding metadata)


http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

Line 187:         tablet_->GetMetricEntity(), Bind(&TabletReplica::OnDiskSize, Unretained(this)))
In general I don't like metrics that acquire lots of locks on demand since they impact system load and if you have a monitoring system gathering metrics every few seconds it can hurt. Have you considered whether it's possible to collect this in a lazy or periodic manner? Most of these locks should be fairly fast but it makes me a little nervous


PS6, Line 360:   size_t cmeta_size = consensus_ ? consensus_->MetadataOnDiskSize() : 0;
             :   std::lock_guard<simple_spinlock> lock(lock_);
             :   DCHECK(status_pb_out != nullptr);
             :   status_pb_out->set_tablet_id(meta_->tablet_id());
             :   status_pb_out->set_table_name(meta_->table_name());
             :   status_pb_out->set_last_status(last_status_);
             :   meta_->partition().ToPB(status_pb_out->mutable_partition());
             :   status_pb_out->set_state(state_);
             :   status_pb_out->set_tablet_data_state(meta_->tablet_data_state());
             :   status_pb_out->set_estimated_on_disk_size(cmeta_size + OnDiskSizeUnlocked());
Since meta_ is const we can do less under the lock:

  DCHECK(status_pb_out != nullptr);
  uint64_t tablet_size;
  {
    std::lock_guard<simple_spinlock> lock(lock_);
    tablet_size = OnDiskSizeUnlocked();
    status_pb_out->set_state(state_);
    status_pb_out->set_last_status(last_status_);
  }
  status_pb_out->set_tablet_id(meta_->tablet_id());
  status_pb_out->set_table_name(meta_->table_name());
  meta_->partition().ToPB(status_pb_out->mutable_partition());
  status_pb_out->set_tablet_data_state(meta_->tablet_data_state());
  uint64_t cmeta_size = consensus_ ? consensus_->MetadataOnDiskSize() : 0;
  status_pb_out->set_estimated_on_disk_size(cmeta_size + tablet_size);


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 220:   on_disk_size_.store(pb_.ByteSize(), std::memory_order_relaxed);
this isn't actually the on disk size because of PB container framing


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 17:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS17, Line 909: std::lock_guard
> Does this need to be lock_guard? Could it be lock_shared<rw_spinlock> l(sta
I think percpu_rwlock doesn't support shared_lock.


http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

PS17, Line 736: OnDiskDataSize()
> nit: may be worth commenting in the API declaration that this does not incl
Added (no deltas) for clarity. The explanatory comment above the declarations, when matched with the comments on the declarations, does make it clear exactly what a given *OnDiskSize* metric includes.


http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/tablet/tablet_metadata-test.cc
File src/kudu/tablet/tablet_metadata-test.cc:

PS17, Line 138: The on-disk size 
> nit: maybe "The tablet metadata"?
Done


http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS17, Line 193: ) != nullptr
> nit: at L701, you have a similar check that doesn't make this explicit comp
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#8).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
22 files changed, 230 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

> Uploaded patch set 6.

This is still going to change quite a bit when all of Mike's in-flight consensus patches are committed, so it might not be worth reviewing this right now.

-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 23: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#12).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 259 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/12
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 220:   on_disk_size_.store(pb_.ByteSize(), std::memory_order_relaxed);
> yea, I wonder whether it really even is worth tracking this. cmetas are pre
I think it's also the only item from 1755 that is essentially fixed size and not proportional to some measure of usage or activity (WAL, UNDOs) or size of data (data, superblock).

Mike, would you be ok leaving it out?


http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

Line 182:   std::atomic<uint64_t> on_disk_size_;
> This class isn't thread-safe so there is no reason to use an atomic here
Done


http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

Line 338:   uint64_t OnDiskSize() const OVERRIDE;
> Mind adding a comment here to differentiate from OnDiskDataSize()?
Done


http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

Line 115:   virtual Status DebugDump(vector<string> *lines = NULL) = 0;
> warning: default arguments on virtual or override methods are prohibited [g
Ignored


Line 338:   // Return the total size on-disk of this rowset's data, in bytes.
> Maybe add (excluding metadata)
Done


http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

Line 187:         tablet_->GetMetricEntity(), Bind(&TabletReplica::OnDiskSize, Unretained(this)))
> In general I don't like metrics that acquire lots of locks on demand since 
Todd, do you have an idea of how costly gathering these metrics will be? Do you think it'd be worth profiling to see how expensive they are?


PS6, Line 360:   size_t cmeta_size = consensus_ ? consensus_->MetadataOnDiskSize() : 0;
             :   std::lock_guard<simple_spinlock> lock(lock_);
             :   DCHECK(status_pb_out != nullptr);
             :   status_pb_out->set_tablet_id(meta_->tablet_id());
             :   status_pb_out->set_table_name(meta_->table_name());
             :   status_pb_out->set_last_status(last_status_);
             :   meta_->partition().ToPB(status_pb_out->mutable_partition());
             :   status_pb_out->set_state(state_);
             :   status_pb_out->set_tablet_data_state(meta_->tablet_data_state());
             :   status_pb_out->set_estimated_on_disk_size(cmeta_size + OnDiskSizeUnlocked());
> Since meta_ is const we can do less under the lock:
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS17, Line 909: std::lock_guard
> Andrew is right. You can use shared_lock with a percpu_rwlock, in the way t
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 10:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

Line 153:     return on_disk_size_;
> Can we say this is thread-safe and use an atomic?
Done


PS10, Line 241: uint64_t
> how about atomic and int64_t because there is just no way we are going to n
Done


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS10, Line 922: CHECK_OK
> Do we need this CHECK? Also, can we make this safe to run even if log_ is s
Done


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

PS10, Line 201: TotalSize
> maybe name this OnDiskSize() for consistency?
Done


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 2645:   UniqueLock lock(lock_);
> If we make cmeta_->on_disk_size() thread-safe, then we can avoid taking the
Done


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

Line 690:   shared_lock<rw_spinlock> l(component_lock_);
> 1. Can we just take a ref to base_data_ instead of holding the component_lo
1. I think you are right, but the pattern in the class is to use component_lock_ whenever doing anything with base_data_. Is there a reason to prefer taking a ref over acquiring the lock briefly?
2. Done.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

Line 115:   virtual Status DebugDump(std::vector<std::string> *lines = NULL) = 0;
> warning: default arguments on virtual or override methods are prohibited [g
Pre-existing; ignored.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 377: OnDiskSizeNoCMetaUnlocked
> why the "no cmeta" version? could use a comment
I think that was a mistake, a leftover from a previous version. There's no need for that method at all now.


PS10, Line 645:   auto consensus = consensus_;
              :   if (consensus) {
> because consensus_ is set-once, this can be written as:
Adar had misgivings about this form. The comment was in PS10 but I'll reproduce it here for convenience:

"This isn't safe; you need to take a local ref of consensus_ and then test it for nullitude. Or you need to make the call with lock_ held, which guarantees that consensus_ isn't modified.

Okay, I went and read the comments in TabletReplica which talk about how consensus_ is a "set-once" object. That seems bogus to me, though admittedly I'm not an expert on this code. TabletReplica::Shutdown() resets consensus_ with lock_ held. Can we guarantee that when Shutdown() is called, no other thread (besides the one calling Shutdown()) will invoke a TabletReplica function? I don't see how we can (hence the fragility). And if we can't, then consensus_ access needs to be atomic, either by taking a local ref then operating on it, or by acquiring lock_ first."


PS10, Line 656: size_t
> Can we use int64_t instead of size_t? We are never going to have 64-bit len
Done


PS10, Line 660:   if (tablet_) {
              :     ret += tablet_->OnDiskSize();
              :   }
              :   // WAL segments.
              :   if (state_ == RUNNING) {
              :     ret += log_->TotalSize();
              :   }
> Instead of doing this work while holding TabletReplica::lock_, can this be 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 22:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

PS22, Line 159: on_disk_size()
> docs
Done


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS22, Line 914: std::memory_order_relaxed
> do we really need to pass a memory ordering directive here?
Done


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

PS22, Line 210: log's current WAL
> weird phrasing since WAL is also write-ahead-log (yeah, I know naming sucks
Done


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

PS22, Line 328: size on-disk
> xxs nit: on-disk size?
Done


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

PS22, Line 95: The size on-disk of this cfile set's data, in bytes.
> make it explicit that this excludes the (ad-hoc) index and blooms?
Done


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

PS22, Line 726: OnDiskDataSizeNoUndos
> one last thing since you're fixing this. this naming is a bit misleading, m
Done


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

PS22, Line 131: excluding metadata
> maybe be explicit? "excluding bloom filter and ad-hoc index"
Done


PS22, Line 134:  Return the size, in bytes, of this rowset's data, not including undo deltas.
> this doesn't include the "metadata" either, right?
Done


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

PS22, Line 242: , but not the consensus metadata
> nit I don't think you technically need to mention consensus here, since tab
Done


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS22, Line 287: Differs from Tablet::OnDiskSize in that it includes log segments and cmeta.
> Again, I don't think you need to mention this here, since tablet, along wit
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS19, Line 348:   // A rowset's space-occupying components are as follows:
              :   // - cfile set
              :   //   - base data
              :   //   - bloom file
              :   //   - ad hoc index
              :   // - delta files
              :   //   - UNDO deltas
              :   //   - REDO deltas
              : 
              :   // Return the size on-disk of this rowset, in bytes.
              :   uint64_t OnDiskSize() const OVERRIDE;
              : 
              :   // Return the on-disk size of this rowset's cfile set, in bytes.
              :   uint64_t CFileSetOnDiskSize() const;
              : 
              :   // Return the size on-disk of the base data (no deltas) in this rowset's cfile set, in bytes.
              :   uint64_t OnDiskDataSize() const OVERRIDE;
              : 
              :   // Return the size on-disk of this rowset's REDO deltas, in bytes.
              :   uint64_t RedoDeltaOnDiskSize() const;
              : 
              :   // Return the size on-disk of this rowset's cfile set's base data and this rowset's REDO deltas.
              :   uint64_t OnDiskDataSizeNoUndos() const OVERRIDE;
              : 
              :   size_t DeltaMemStoreSize() const OVERRIDE;
> a possible alternative would be to have individual methods for each of the 
I added a struct as you suggested. I think it simplifies things a lot-- there's now only three on disk size methods in rowset.h, one for the disk size metric, one for the data-only disk size metric, and one used in compaction scoring; the struct can be used internally to the disk row set to compute whatever other permutation of space usage might be needed. Thanks for the suggestion!


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 22:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

PS22, Line 159: on_disk_size()
docs


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS22, Line 914: std::memory_order_relaxed
do we really need to pass a memory ordering directive here?


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

PS22, Line 210: log's current WAL
weird phrasing since WAL is also write-ahead-log (yeah, I know naming sucks here). maybe: this log's current segments? or even: "total size of the current segments, in bytes"


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

PS22, Line 328: size on-disk
xxs nit: on-disk size?


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

PS22, Line 95: The size on-disk of this cfile set's data, in bytes.
make it explicit that this excludes the (ad-hoc) index and blooms?


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

PS22, Line 726: OnDiskDataSizeNoUndos
one last thing since you're fixing this. this naming is a bit misleading, maybe OnDiskBaseDataSizeWithRedos() would be more accurate.


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

PS22, Line 131: excluding metadata
maybe be explicit? "excluding bloom filter and ad-hoc index"


PS22, Line 134:  Return the size, in bytes, of this rowset's data, not including undo deltas.
this doesn't include the "metadata" either, right?


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

PS22, Line 242: , but not the consensus metadata
nit I don't think you technically need to mention consensus here, since tablet is an orthogonal component to consensus in a sense.


http://gerrit.cloudera.org:8080/#/c/6968/22/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS22, Line 287: Differs from Tablet::OnDiskSize in that it includes log segments and cmeta.
Again, I don't think you need to mention this here, since tablet, along with consensus and the log are components of tablet replica.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/17/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS17, Line 909: std::lock_guard
> I think percpu_rwlock doesn't support shared_lock.
Andrew is right. You can use shared_lock with a percpu_rwlock, in the way that he pointed out:

  shared_lock<rw_spinlock> l(state_lock_.get_lock());

See L783 in log.cc for an example.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#16).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 277 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/16
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 10:

(10 comments)

Would be nice to add a simple test for the values stored by the new metrics.

http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

Line 153:     return on_disk_size_;
Can we say this is thread-safe and use an atomic?


PS10, Line 241: uint64_t
how about atomic and int64_t because there is just no way we are going to need that 64th bit, plus GSG says to prefer signed integers when possible


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS10, Line 922: CHECK_OK
Do we need this CHECK? Also, can we make this safe to run even if log_ is shut down?

Some reasonable options:
1. return Status
2. return 0 if not open


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

PS10, Line 201: TotalSize
maybe name this OnDiskSize() for consistency?


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 2645:   UniqueLock lock(lock_);
If we make cmeta_->on_disk_size() thread-safe, then we can avoid taking the consensus lock here. Additional note on why this would be safe: cmeta_ is set only once in Init(), and RaftConsensus::Create() requires Init() to succeed in order to publish the RaftConsensus instance, so we can generally treat the cmeta_ reference as a const.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

Line 690:   shared_lock<rw_spinlock> l(component_lock_);
1. Can we just take a ref to base_data_ instead of holding the component_lock_ while calling base_data_->OnDiskSize() ? I'm not 100% sure, but it looks like it.
2. Do we need to hold component_lock_ while calling delta_tracker_->OnDiskSize()? I don't see a reason for that, since (a) as long as the DRS is open_, the delta_tracker_ reference is safe to access without a lock because it's set-once and (b) DeltaTracker::OnDiskSize() is thread-safe.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 377: OnDiskSizeNoCMetaUnlocked
why the "no cmeta" version? could use a comment


PS10, Line 645:   auto consensus = consensus_;
              :   if (consensus) {
because consensus_ is set-once, this can be written as:

  if (consensus_) {
    ret += consensus_->MetadataOnDiskSize();
  }


PS10, Line 656: size_t
Can we use int64_t instead of size_t? We are never going to have 64-bit length data on a single node.

See the GSG on unsigned types in general: https://google.github.io/styleguide/cppguide.html#Integer_Types

The exceptions to that rule are when storing things like pointers, or when wrapping an existing API makes it basically impossible.


PS10, Line 660:   if (tablet_) {
              :     ret += tablet_->OnDiskSize();
              :   }
              :   // WAL segments.
              :   if (state_ == RUNNING) {
              :     ret += log_->TotalSize();
              :   }
Instead of doing this work while holding TabletReplica::lock_, can this be structured so we can just grabs the refs from TabletReplica and then release the lock?
 Something like:

int64_t size = 0;
scoped_refptr<Tablet> tablet;
scoped_refptr<Log> log;
{
  lock_guard<simple_spinlock> l(lock_);
  tablet = tablet_;
  log = log_;
}
if (tablet) {
  size += tablet->OnDiskSize();
}
if (log) {
  size += log->OnDiskSize();
}
return ret;


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS19, Line 348:   // A rowset's space-occupying components are as follows:
              :   // - cfile set
              :   //   - base data
              :   //   - bloom file
              :   //   - ad hoc index
              :   // - delta files
              :   //   - UNDO deltas
              :   //   - REDO deltas
              : 
              :   // Return the size on-disk of this rowset, in bytes.
              :   uint64_t OnDiskSize() const OVERRIDE;
              : 
              :   // Return the on-disk size of this rowset's cfile set, in bytes.
              :   uint64_t CFileSetOnDiskSize() const;
              : 
              :   // Return the size on-disk of the base data (no deltas) in this rowset's cfile set, in bytes.
              :   uint64_t OnDiskDataSize() const OVERRIDE;
              : 
              :   // Return the size on-disk of this rowset's REDO deltas, in bytes.
              :   uint64_t RedoDeltaOnDiskSize() const;
              : 
              :   // Return the size on-disk of this rowset's cfile set's base data and this rowset's REDO deltas.
              :   uint64_t OnDiskDataSizeNoUndos() const OVERRIDE;
              : 
              :   size_t DeltaMemStoreSize() const OVERRIDE;
> I don't think it's worth it, for the following reasons:
my problem with these functions are they often return overlapping info and the names are confusing. for instance it takes some digging/thinking to be able to distinguish: CFileSetOnDiskSize() from OnDiskDataSize().

maybe just list the base ones then add some helpers to the struct (or let the caller decide what to add)?
regarding locking these are all taking the component lock, so it's not like they're interfering with reads or writes, just flushes/compactions (which are asynch, and in any case I don't believe these methods get called often enough for this to be problematic). 

So if you have a struct that has:

struct DiskRowSetSpace {
   int64_t delta_memstore_size;
   int64_t base_data_size;
   int64_t bloom_size;
   int64_t ad_hoc_index_size;
   int64_t redo_deltas_size;
   int64_t undo_deltas_size;

   .. and then whatever helpers to add stuff here.
}

then theres no duplicate info and the caller can just choose which info is relevant.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 19:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

PS19, Line 210:   // Returns 0 if the log is shut down.
hrm, this might be misleading, as the WAL might still be taking space (e.g. for failed tablets that were not immediately deleted). what do other components do here? maybe we should return an obviously special value (like -1)?


http://gerrit.cloudera.org:8080/#/c/6968/19/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS19, Line 364: OnDiskDataSize
maybe OnDiskBaseDataSize()?


PS19, Line 348:   // A rowset's space-occupying components are as follows:
              :   // - cfile set
              :   //   - base data
              :   //   - bloom file
              :   //   - ad hoc index
              :   // - delta files
              :   //   - UNDO deltas
              :   //   - REDO deltas
              : 
              :   // Return the size on-disk of this rowset, in bytes.
              :   uint64_t OnDiskSize() const OVERRIDE;
              : 
              :   // Return the on-disk size of this rowset's cfile set, in bytes.
              :   uint64_t CFileSetOnDiskSize() const;
              : 
              :   // Return the size on-disk of the base data (no deltas) in this rowset's cfile set, in bytes.
              :   uint64_t OnDiskDataSize() const OVERRIDE;
              : 
              :   // Return the size on-disk of this rowset's REDO deltas, in bytes.
              :   uint64_t RedoDeltaOnDiskSize() const;
              : 
              :   // Return the size on-disk of this rowset's cfile set's base data and this rowset's REDO deltas.
              :   uint64_t OnDiskDataSizeNoUndos() const OVERRIDE;
              : 
              :   size_t DeltaMemStoreSize() const OVERRIDE;
this is getting kind of messy. would it be too much work to clean this up and return something like:
struct DiskRowSetSpace {
   int64_t memstore_size;
   int64_t cfile_set_size;
   int64_t total_delta_size:
   int64_t redo_deltas_size;
   int64_t undo_deltas_size;
}

?


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#17).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 279 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/17
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 9:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 285:   WARN_NOT_OK(UpdateOnDiskSize(), "Failed to update cmeta on-disk size");
Could RETURN_NOT_OK here too.


Line 322:   WARN_NOT_OK(cmeta->UpdateOnDiskSize(), "Failed to update cmeta on-disk size");
And here.


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

Line 241:   uint64_t on_disk_size_;
Why doesn't this need to be atomic (like TabletReplica::on_disk_size_)?


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS9, Line 331:   // Return the on-disk size of this rowset's cfile set, including bloomfiles
             :   // and the ad hoc index, in bytes.
             :   uint64_t BaseDataOnDiskSize() const;
             : 
             :   // Return the size on-disk of this rowset's REDO deltas, in bytes.
             :   uint64_t RedoDeltaOnDiskSize() const;
             : 
             :   // Return the size on-disk of this rowset, in bytes.
             :   uint64_t OnDiskSize() const OVERRIDE;
             : 
             :   // Return the size on-disk of the data in this rowset (i.e. excluding metadata), in bytes.
             :   uint64_t OnDiskDataSize() const OVERRIDE;
The number of parts is high; could you add a block comment just before these functions describing the different space-occupying parts of a DRS?


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

Line 237:   // Return the total on-disk size of this tablet, in bytes.
Without additional commenting, it seems like the value returned by OnDiskSize() should include _all_ metadata. That is, not just the superblock but also the cmeta. But of course we can't do that since the cmeta isn't known here.

Can you update the comments (here and to OnDiskDataSize) to clarify this?


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

Line 309:   WARN_NOT_OK(UpdateOnDiskSize(), "Failed to update superblock on-disk size");
Seems reasonable to RETURN_NOT_OK here.


Line 587:   WARN_NOT_OK(UpdateOnDiskSize(), "Failed to update superblock on-disk size");
Maybe even RETURN_NOT_OK here too. I think your fear is "if this fails, we'll return failure even though we did replace the superblock". The counter-argument is: "WritePBContainerToPath can fail and still replace the superblock (i.e. a failure in SyncDir)".


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

Line 87:                          "Space used by this tablet on disk.");
Maybe mention that this includes both data and metadata?


Line 377:     tablet_size = OnDiskSizeUnlocked();
We add in cmeta_size below and this is the only use of OnDiskSizeUnlocked. Why not make OnDiskSizeUnlocked return the cmeta size too?


Line 385:   size_t cmeta_size = consensus_ ? consensus_->MetadataOnDiskSize() : 0;
This isn't safe; you need to take a local ref of consensus_ and then test it for nullitude. Or you need to make the call with lock_ held, which guarantees that consensus_ isn't modified.

Okay, I went and read the comments in TabletReplica which talk about how consensus_ is a "set-once" object. That seems bogus to me, though admittedly I'm not an expert on this code. TabletReplica::Shutdown() resets consensus_ with lock_ held. Can we guarantee that when Shutdown() is called, no other thread (besides the one calling Shutdown()) will invoke a TabletReplica function? I don't see how we can (hence the fragility). And if we can't, then consensus_ access needs to be atomic, either by taking a local ref then operating on it, or by acquiring lock_ first.


Line 640: size_t TabletReplica::OnDiskSize() const {
Normally the only distinction between Foo and FooUnlocked is the acquisition of a lock. Here OnDiskSizeUnlocked does not include the consensus metadata size.

Can you rename the functions accordingly?


Line 656: size_t TabletReplica::OnDiskSizeUnlocked() const {
Can you DCHECK that lock_ is held?


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve table on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve table on-disk size metric
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6968/1/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

Line 91: using consensus::Consensus;
> warning: using decl 'Consensus' is unused [misc-unused-using-decls]
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#14).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 276 insertions(+), 63 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/14
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 6:

(1 comment)

Minor rebasing.

http://gerrit.cloudera.org:8080/#/c/6968/6/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 220:   on_disk_size_.store(pb_.ByteSize(), std::memory_order_relaxed);
> Right, I just didn't go through and re-plumb yet because it might get axed.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#21).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
25 files changed, 292 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/21
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#22).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
29 files changed, 329 insertions(+), 104 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/22
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#7).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
22 files changed, 230 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 9:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/consensus/consensus_meta.cc
File src/kudu/consensus/consensus_meta.cc:

Line 285:   WARN_NOT_OK(UpdateOnDiskSize(), "Failed to update cmeta on-disk size");
> Could RETURN_NOT_OK here too.
Done


Line 322:   WARN_NOT_OK(cmeta->UpdateOnDiskSize(), "Failed to update cmeta on-disk size");
> And here.
Done


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

Line 241:   uint64_t on_disk_size_;
> Ah, I just read your earlier discussion with Mike. Nevermind.
Right, class isn't threadsafe.


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS9, Line 331:   // Return the on-disk size of this rowset's cfile set, including bloomfiles
             :   // and the ad hoc index, in bytes.
             :   uint64_t BaseDataOnDiskSize() const;
             : 
             :   // Return the size on-disk of this rowset's REDO deltas, in bytes.
             :   uint64_t RedoDeltaOnDiskSize() const;
             : 
             :   // Return the size on-disk of this rowset, in bytes.
             :   uint64_t OnDiskSize() const OVERRIDE;
             : 
             :   // Return the size on-disk of the data in this rowset (i.e. excluding metadata), in bytes.
             :   uint64_t OnDiskDataSize() const OVERRIDE;
> The number of parts is high; could you add a block comment just before thes
Done


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

Line 115:   virtual Status DebugDump(std::vector<std::string> *lines = NULL) = 0;
> warning: default arguments on virtual or override methods are prohibited [g
Ignored.


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

Line 237:   // Return the total on-disk size of this tablet, in bytes.
> Without additional commenting, it seems like the value returned by OnDiskSi
Done


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

Line 309:   WARN_NOT_OK(UpdateOnDiskSize(), "Failed to update superblock on-disk size");
> Seems reasonable to RETURN_NOT_OK here.
Done


Line 587:   WARN_NOT_OK(UpdateOnDiskSize(), "Failed to update superblock on-disk size");
> Maybe even RETURN_NOT_OK here too. I think your fear is "if this fails, we'
Done


http://gerrit.cloudera.org:8080/#/c/6968/9/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

Line 87:                          "Space used by this tablet on disk.");
> Maybe mention that this includes both data and metadata?
Done


Line 377:     tablet_size = OnDiskSizeUnlocked();
> We add in cmeta_size below and this is the only use of OnDiskSizeUnlocked. 
Because it causes a potential deadlock-- getting the cmeta size requires taking a lock in consensus, so OnDiskSize would acquire the locks in the order tablet_replica lock -> consensus lock, but (IIRC) the transaction code acquires the locks in the order consensus lock -> tablet_replica lock. There was a TSAN failure showing this in the pre-commit jobs but it looks like the links are broken now :/


Line 385:   size_t cmeta_size = consensus_ ? consensus_->MetadataOnDiskSize() : 0;
> This isn't safe; you need to take a local ref of consensus_ and then test i
I'll take a local ref.


Line 640: size_t TabletReplica::OnDiskSize() const {
> Normally the only distinction between Foo and FooUnlocked is the acquisitio
Done


Line 656: size_t TabletReplica::OnDiskSizeUnlocked() const {
> Can you DCHECK that lock_ is held?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 17:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6968/16/src/kudu/consensus/log-test.cc
File src/kudu/consensus/log-test.cc:

PS16, Line 1166: ASSERT_EQ(3, anchors.size());
> nit: the expected value comes first
Done


http://gerrit.cloudera.org:8080/#/c/6968/15/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS15, Line 910: f the log is closed, the
> What happens with the log when it closes?  Does it effective size go zero i
I believe the latter is the case-- the log is closed when the tablet is shutdown, which I believe happens only when the tablet is deleted or tombstoned. I added a comment explaining this.


http://gerrit.cloudera.org:8080/#/c/6968/16/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS16, Line 24: #include <mutex>
> nit: prefer <cstddef> for C++11 (I need to update IWYU stuff to give proper
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#10).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
23 files changed, 257 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
......................................................................


Patch Set 8:

Build failure looks like a random timeout in Python tests, which incidentally exposed a defect in the Python testing code-- I submitted a patch for it.

-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1755 Part 2: Improve table on-disk size metric

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#2).

Change subject: KUDU-1755 Part 2: Improve table on-disk size metric
......................................................................

KUDU-1755 Part 2: Improve table on-disk size metric

This adds WAL segment sizes to the tablet on-disk metric. This is a
little bit trickier than previous improvements because this info is
held by the TabletReplica, not the Tablet, so the on-disk metric
itself had to be relocated and plugged in to TabletReplica::EstimateOnDiskSize.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
3 files changed, 39 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>