You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2017/04/24 08:58:48 UTC

[kudu-CR] log block manager: detect and repair unpunched holes

Hello David Ribeiro Alves, Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/6717

to review the following change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................

log block manager: detect and repair unpunched holes

This patch adds detection and repair of "unpunched holes" to the LBM. These
are deleted blocks whose space in the container data files is still "live",
either because hole punching failed, or because the server crashed before
the holes could be punched.

The newly added container accounting is used as a heuristic to decide when
to repunch holes: if a container's data size exceeds the size we think it
should have (after alignment), we'll repunch all holes and truncate any
preallocated space off the end.

An alternative is to use the container's extent map to figure this out (and
to provide greater precision on where to punch), but testing on el6.6 showed
that calling the FS_IOC_FIEMAP ioctl() on every full container increased LBM
startup time by about 50%. That's bad enough that we shouldn't do it willy
nilly. It could be gated on the above heuristic and used to drive more
precise hole repunching, but given the complexity involved and given the
amount of tilting I've done at this particular windmill, it'll remain a
problem for another day.

Testing is done in three ways:
- A new unit test that exercises new LBMCorruptor functionality.
- Inclusion in block_manager-stress-test via the change to
  LBMCorruptor::InjectRandomNonFatalInconsistency().
- Inclusion in BlockManagerTest.TestMetadataOkayDespiteFailedWrites via the
  generalization of the env_inject_io_error gflag.

Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
---
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/env_posix.cc
7 files changed, 231 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/6717/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6717

to look at the new patch set (#2).

Change subject: log block manager: detect and repair unpunched holes
......................................................................

log block manager: detect and repair unpunched holes

This patch adds detection and repair of "unpunched holes" to the LBM. These
are deleted blocks whose space in the container data files is still "live",
either because hole punching failed, or because the server crashed before
the holes could be punched.

The newly added container accounting is used as a heuristic to decide when
to repunch holes: if a container's data size exceeds the size we think it
should have (after alignment), we'll repunch all holes and truncate any
preallocated space off the end. The heuristic uses some (configurable) slop
to work around various filesystem accounting issues.

An alternative is to use the container's extent map to figure this out (and
to provide greater precision on where to punch), but testing on el6.6 showed
that calling the FS_IOC_FIEMAP ioctl() on every full container increased LBM
startup time by about 50%. That's bad enough that we shouldn't do it willy
nilly. It could be gated on the above heuristic and used to drive more
precise hole repunching, but given the complexity involved and given the
amount of tilting I've done at this particular windmill, it'll remain a
problem for another day.

Testing is done in three ways:
- A new unit test that exercises new LBMCorruptor functionality.
- Inclusion in block_manager-stress-test via the change to
  LBMCorruptor::InjectRandomNonFatalInconsistency().
- Inclusion in BlockManagerTest.TestMetadataOkayDespiteFailedWrites via the
  generalization of the env_inject_io_error gflag.

Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/env_posix.cc
8 files changed, 275 insertions(+), 63 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/6717/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 2: Verified+1

Unrelated failure in a Spark test.

-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6717/4/src/kudu/fs/log_block_manager-test-util.cc
File src/kudu/fs/log_block_manager-test-util.cc:

Line 393:     if (r == 0) {
would this sequence of if()s read better as a 'case'?


http://gerrit.cloudera.org:8080/#/c/6717/4/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS4, Line 2059:  // triggers asynchronous hole punching
not following this comment


http://gerrit.cloudera.org:8080/#/c/6717/4/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

Line 116:               "Fraction of the time that write or preallocate operations will fail");
need to update this?


-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 1:

> I did some performance testing on the same machine as before to
 > gauge how expensive the "repunch and retruncate on heuristic
 > firing" is. Full results at https://gist.github.com/adembo/d4159f9610fc6c7b950bfff9272c0683.
 > 
 > Summary: it's about 6% slower, though the heuristic appears to be
 > firing more often than it should due to unexpected results from
 > stat(). That is, the filesystem reports that many containers
 > consume more disk space as reported by stat() than as reported by
 > their extent maps (or than they should, for that matter).

I tested again, after adding 10% slop to the heuristic. That dropped the impact to about 4%. I think the slop helped some, but given the container sizes at play, 10% may not be enough. That is, it should help much more on a real deployment.

-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 6: Verified+1

Unrelated Spark test failure.

-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6717

to look at the new patch set (#8).

Change subject: log block manager: detect and repair unpunched holes
......................................................................

log block manager: detect and repair unpunched holes

This patch adds detection and repair of "unpunched holes" to the LBM. These
are deleted blocks whose space in the container data files is still "live",
either because hole punching failed, or because the server crashed before
the holes could be punched.

The newly added container accounting is used as a heuristic to decide when
to repunch holes: if a container's data size exceeds the size we think it
should have (after alignment), we'll repunch all holes and truncate any
preallocated space off the end. The heuristic uses some (configurable) slop
to work around various filesystem accounting issues.

An alternative is to use the container's extent map to figure this out (and
to provide greater precision on where to punch), but testing on el6.6 showed
that calling the FS_IOC_FIEMAP ioctl() on every full container increased LBM
startup time by about 50%. That's bad enough that we shouldn't do it willy
nilly. It could be gated on the above heuristic and used to drive more
precise hole repunching, but given the complexity involved and given the
amount of tilting I've done at this particular windmill, it'll remain a
problem for another day.

Testing is done in three ways:
- A new unit test that exercises new LBMCorruptor functionality.
- Inclusion in block_manager-stress-test via the change to
  LBMCorruptor::InjectRandomNonFatalInconsistency().
- Inclusion in BlockManagerTest.TestMetadataOkayDespiteFailedWrites via the
  generalization of the env_inject_io_error gflag.

Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/env_posix.cc
8 files changed, 297 insertions(+), 81 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/6717/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS5, Line 79:  0.1
> Will change the name.
thanks for the explanation. it would be good to have this findings written somewhere


-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 1:

I did some performance testing on the same machine as before to gauge how expensive the "repunch and retruncate on heuristic firing" is. Full results at https://gist.github.com/adembo/d4159f9610fc6c7b950bfff9272c0683. 

Summary: it's about 6% slower, though the heuristic appears to be firing more often than it should due to unexpected results from stat(). That is, the filesystem reports that many containers consume more disk space as reported by stat() than as reported by their extent maps (or than they should, for that matter).

-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6717/7/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

Line 860:   // when it fires.
Since slop isn't a part of the gflag, it may be clearer to update the "slop" talk with more boring terminology.
E.g. Enforce that the container actual size is strictly upper-bounded by the calculated size so we can more easily trigger repairs.

Or keep it and make it extremely clear what this "slop" is referring to via comments in other files.


http://gerrit.cloudera.org:8080/#/c/6717/7/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS7, Line 81:  
nit: is


PS7, Line 1830: slop in our size measurements. Too little
              :       // slop and we'll do unnecessary work at startup
nit: Make the connection between "slop" and FLAGS_log_container_excess_space_before_cleanup_fraction more clear. This comment isn't as clear given the gflag name change.


PS7, Line 1841: measured_size
nit: This is more of a threshold, rather than a measured size. Could we rename it to reflect that? e.g. cleanup_threshold_size


PS7, Line 2061: // Clearing this vector drops the last references to the LogBlocks within,
              :   // triggering the repunching operations.
Is this enforced by the fact that we're calling Repair() with std::move(need_repunching) as the argument? If this is necessary, should we document the requirement in the .h, or is that too implementation-specific?


-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6717/4/src/kudu/fs/log_block_manager-test-util.cc
File src/kudu/fs/log_block_manager-test-util.cc:

Line 393:     if (r == 0) {
> would this sequence of if()s read better as a 'case'?
Yeah, not sure why I didn't do it as a switch to begin with.


http://gerrit.cloudera.org:8080/#/c/6717/4/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS4, Line 2059:  // triggers asynchronous hole punching
> not following this comment
Clearing needs_repunching drops the last references to the deleted LogBlocks. Their destructors run and will queue up hole punch closures on the data dir's thread pool.

The hole punching takes place asynchronously, but LBM::Open() will wait on the thread pools anyway, so I guess it's not really all that asynchronous.

I'll clarify the comment.


http://gerrit.cloudera.org:8080/#/c/6717/4/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

Line 116:               "Fraction of the time that write or preallocate operations will fail");
> need to update this?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 8: Verified+1

ITClient flake in ASAN, unrelated.

-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 5:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager-test-util.cc
File src/kudu/fs/log_block_manager-test-util.cc:

PS5, Line 156: initial_data_size
> does it make sense to add assertions on size?
Not sure what you mean; do you mean we should assert that initial_data_size is larger or equal to a certain value? What value would that add?


PS5, Line 460: int64_t* old_data_file_size
> maybe set a nullptr default arg so that you don't have to pass it if you do
All three callers make use of old_data_file_size though.


http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

Line 50: 
> extra line, likely also clean the other one below
Done


http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS5, Line 79: 0.10
> this is what's gating the heavy startup penalty, right? How easy is it to r
Will change the name.

If the penalty you're alluding to was the 50% from the extent map approach, that's totally gone.

Without any slop (meaning, any full container with even one unaccounted-for byte will repunch its holes and truncate), I observed a 6% penalty on el6.6 where my ~10,000 containers of size 16-32k had about 2k extra space in them.

We still don't really know why these discrepancies exist. Todd's theory is that older versions of ext4 include blocks occupied by interior nodes in the extent tree in their accounting, but I wasn't able to find a slam dunk explanation in a bug report or similar resource. And I don't have time to do a thorough investigation in different environments. So, this knob will have to suffice.

Note: I didn't see any space discrepancies at all in Ubuntu 16; that is, at some point, ext4 fixed its space accounting issues. Just not on el6.


PS5, Line 80: Additional fraction of a log container's calculated size that "
            :               "must be consumed on disk before the container considered to be "
            :               "inconsistent and subject to excess space cleanup.
> explain when/how the cleanup happens (at startup or with a tool, right?)
Done


PS5, Line 1833: actual_size
> you're basically saying in the comments above that this number is unstrustw
Yes, it's untrustworthy. I'll change the name.


PS5, Line 1834: s = env_->GetFileSizeOnDisk(data_filename, &actual_size);
              :       if (!s.ok()) {
              :         *result_status = s.CloneAndPrepend(Substitute(
              :             "Could not get on-disk file size of container $0", container->ToString()));
              :         return;
              :       }
> use RETURN_NOT_OK_PREPEND
Can't; this function is run asynchronously out of a thread pool so it doesn't return its status.


PS5, Line 1840: measured_size
> _this_ is the actual size, right?
This is the size according to our own accounting. It's not the size according to the filesystem.


PS5, Line 2054: TODO(adar): can be more efficient (less fs work and more space reclamation
              :   // in case of misaligned blocks) via hole coalescing first, but this is easy.
> file a jira? seems like we might have to revisit this if it turns out that 
I'm going to punt. I think the TODO() is enough for now, and my observations show that even repunching every hole in the container isn't that bad (see above).


http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

PS5, Line 116: certain
> which ones? can it be any operation?
The ones that use it. :)

It's a little ridiculous to maintain an exhaustive list that must be kept in-sync, especially since this is a testing-only gflag. It'd also be ridiculous to say "search for all references to this flag in order to find which operations are affected."


-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 5:

Adding Andrew to the review since this touches stuff he might also be touching

-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 5:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager-test-util.cc
File src/kudu/fs/log_block_manager-test-util.cc:

PS5, Line 156: initial_data_size
does it make sense to add assertions on size?


PS5, Line 460: int64_t* old_data_file_size
maybe set a nullptr default arg so that you don't have to pass it if you don't care (twice in this file iirc)


http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

Line 50: 
extra line, likely also clean the other one below


http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS5, Line 79: 0.10
this is what's gating the heavy startup penalty, right? How easy is it to reach 10% wasted space? afraid that if it's too easy then we might as well not gate it at all


PS5, Line 79: log_container_extra_space_heuristic_slop_fraction
this flag name is not very informative. maybe: log_container_wasted_space_before_cleanup_fraction or somesuch


PS5, Line 80: Additional fraction of a log container's calculated size that "
            :               "must be consumed on disk before the container considered to be "
            :               "inconsistent and subject to excess space cleanup.
explain when/how the cleanup happens (at startup or with a tool, right?)


PS5, Line 1833: actual_size
you're basically saying in the comments above that this number is unstrustworthy, or did I get that wrong?

maybe "reported_size" or "size_reported_by_os" would be a better name


PS5, Line 1834: s = env_->GetFileSizeOnDisk(data_filename, &actual_size);
              :       if (!s.ok()) {
              :         *result_status = s.CloneAndPrepend(Substitute(
              :             "Could not get on-disk file size of container $0", container->ToString()));
              :         return;
              :       }
use RETURN_NOT_OK_PREPEND


PS5, Line 1840: measured_size
_this_ is the actual size, right?


PS5, Line 2054: TODO(adar): can be more efficient (less fs work and more space reclamation
              :   // in case of misaligned blocks) via hole coalescing first, but this is easy.
file a jira? seems like we might have to revisit this if it turns out that hole puching is easily triggered and too heavy


http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

PS5, Line 116: certain
which ones? can it be any operation?


-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


log block manager: detect and repair unpunched holes

This patch adds detection and repair of "unpunched holes" to the LBM. These
are deleted blocks whose space in the container data files is still "live",
either because hole punching failed, or because the server crashed before
the holes could be punched.

The newly added container accounting is used as a heuristic to decide when
to repunch holes: if a container's data size exceeds the size we think it
should have (after alignment), we'll repunch all holes and truncate any
preallocated space off the end. The heuristic uses some (configurable) slop
to work around various filesystem accounting issues.

An alternative is to use the container's extent map to figure this out (and
to provide greater precision on where to punch), but testing on el6.6 showed
that calling the FS_IOC_FIEMAP ioctl() on every full container increased LBM
startup time by about 50%. That's bad enough that we shouldn't do it willy
nilly. It could be gated on the above heuristic and used to drive more
precise hole repunching, but given the complexity involved and given the
amount of tilting I've done at this particular windmill, it'll remain a
problem for another day.

Testing is done in three ways:
- A new unit test that exercises new LBMCorruptor functionality.
- Inclusion in block_manager-stress-test via the change to
  LBMCorruptor::InjectRandomNonFatalInconsistency().
- Inclusion in BlockManagerTest.TestMetadataOkayDespiteFailedWrites via the
  generalization of the env_inject_io_error gflag.

Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Reviewed-on: http://gerrit.cloudera.org:8080/6717
Tested-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/env_posix.cc
8 files changed, 297 insertions(+), 81 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Adar Dembo: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 8: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: log block manager: detect and repair unpunched holes
......................................................................


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/6717/7/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

Line 860:   // when it fires.
> Since slop isn't a part of the gflag, it may be clearer to update the "slop
Sure, will use your proposed rewording verbatim.


http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS5, Line 79:  0.1
> thanks for the explanation. it would be good to have this findings written 
Most of it is documented on L1815. I'll add a little more, but I'd rather not document raw performance numbers (percentages and raw times) since that may fluctuate as the code evolves.


http://gerrit.cloudera.org:8080/#/c/6717/7/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS7, Line 81:  
> nit: is
Done


PS7, Line 1830: slop in our size measurements. Too little
              :       // slop and we'll do unnecessary work at startup
> nit: Make the connection between "slop" and FLAGS_log_container_excess_spac
Done


PS7, Line 1841: measured_size
> nit: This is more of a threshold, rather than a measured size. Could we ren
Done


PS7, Line 2061: // Clearing this vector drops the last references to the LogBlocks within,
              :   // triggering the repunching operations.
> Is this enforced by the fact that we're calling Repair() with std::move(nee
I think that's too implementation-specific.

In the worst case and the std::move() is accidentally removed, the last references will be dropped when OpenDataDir() goes out of scope. From the perspective of the block manager user, it's effectively the same.


-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6717

to look at the new patch set (#5).

Change subject: log block manager: detect and repair unpunched holes
......................................................................

log block manager: detect and repair unpunched holes

This patch adds detection and repair of "unpunched holes" to the LBM. These
are deleted blocks whose space in the container data files is still "live",
either because hole punching failed, or because the server crashed before
the holes could be punched.

The newly added container accounting is used as a heuristic to decide when
to repunch holes: if a container's data size exceeds the size we think it
should have (after alignment), we'll repunch all holes and truncate any
preallocated space off the end. The heuristic uses some (configurable) slop
to work around various filesystem accounting issues.

An alternative is to use the container's extent map to figure this out (and
to provide greater precision on where to punch), but testing on el6.6 showed
that calling the FS_IOC_FIEMAP ioctl() on every full container increased LBM
startup time by about 50%. That's bad enough that we shouldn't do it willy
nilly. It could be gated on the above heuristic and used to drive more
precise hole repunching, but given the complexity involved and given the
amount of tilting I've done at this particular windmill, it'll remain a
problem for another day.

Testing is done in three ways:
- A new unit test that exercises new LBMCorruptor functionality.
- Inclusion in block_manager-stress-test via the change to
  LBMCorruptor::InjectRandomNonFatalInconsistency().
- Inclusion in BlockManagerTest.TestMetadataOkayDespiteFailedWrites via the
  generalization of the env_inject_io_error gflag.

Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/env_posix.cc
8 files changed, 291 insertions(+), 76 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/6717/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] log block manager: detect and repair unpunched holes

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6717

to look at the new patch set (#6).

Change subject: log block manager: detect and repair unpunched holes
......................................................................

log block manager: detect and repair unpunched holes

This patch adds detection and repair of "unpunched holes" to the LBM. These
are deleted blocks whose space in the container data files is still "live",
either because hole punching failed, or because the server crashed before
the holes could be punched.

The newly added container accounting is used as a heuristic to decide when
to repunch holes: if a container's data size exceeds the size we think it
should have (after alignment), we'll repunch all holes and truncate any
preallocated space off the end. The heuristic uses some (configurable) slop
to work around various filesystem accounting issues.

An alternative is to use the container's extent map to figure this out (and
to provide greater precision on where to punch), but testing on el6.6 showed
that calling the FS_IOC_FIEMAP ioctl() on every full container increased LBM
startup time by about 50%. That's bad enough that we shouldn't do it willy
nilly. It could be gated on the above heuristic and used to drive more
precise hole repunching, but given the complexity involved and given the
amount of tilting I've done at this particular windmill, it'll remain a
problem for another day.

Testing is done in three ways:
- A new unit test that exercises new LBMCorruptor functionality.
- Inclusion in block_manager-stress-test via the change to
  LBMCorruptor::InjectRandomNonFatalInconsistency().
- Inclusion in BlockManagerTest.TestMetadataOkayDespiteFailedWrites via the
  generalization of the env_inject_io_error gflag.

Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/env_posix.cc
8 files changed, 292 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/6717/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>