You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2017/04/24 08:58:48 UTC

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Hello David Ribeiro Alves, Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/6715

to review the following change.

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................

KUDU-1978: avoid corruption when deleting misaligned blocks

The gist of the problem: when punching out blocks, we don't actually know
where they end. The existing code assumes that proper alignment is always
maintained, but when faced with a repeated sequence of blocks misaligned on
both start and end offsets, this may not be true. It is conceivable (though
not certain) that KUDU-1793 can cause such sequences.

To fix, we could maintain per-block end offsets, but that'd be expensive for
such a rare case (remember: we have millions of blocks). Instead, we'll look
for blocks misaligned on their start offsets and skip the align up step when
punching them out. This means we won't reclaim all of their space, but it's
better than corrupting data belonging to the next block.

A black box reproduction of KUDU-1793 is virtually impossible, so here's a
white box implementation instead, via augmentation of the LBMCorruptor's
misaligned block generator. Without the fix (though with the removal of the
DCHECK_EQ), the new test fails 100% of the time. With the fix, it hasn't
failed in 1000 runs.

Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
---
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
4 files changed, 182 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/6715/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6715

to look at the new patch set (#2).

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................

KUDU-1978: avoid corruption when deleting misaligned blocks

The gist of the problem: when punching out blocks, we don't actually know
where they end. The existing code assumes that proper alignment is always
maintained, but when faced with a repeated sequence of blocks misaligned on
both start and end offsets, this may not be true. It is conceivable (though
not certain) that KUDU-1793 can cause such sequences.

To fix, we could maintain per-block end offsets, but that'd be expensive for
such a rare case (remember: we have millions of blocks). Instead, we'll look
for blocks misaligned on their start offsets and skip the align up step when
punching them out. This means we won't reclaim all of their space, but it's
better than corrupting data belonging to the next block.

A black box reproduction of KUDU-1793 is virtually impossible, so here's a
white box implementation instead, via augmentation of the LBMCorruptor's
misaligned block generator. Without the fix (though with the removal of the
DCHECK_EQ), the new test fails 100% of the time. With the fix, it hasn't
failed in 1000 runs.

Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
---
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
4 files changed, 180 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/6715/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6715/3/src/kudu/fs/log_block_manager-test-util.cc
File src/kudu/fs/log_block_manager-test-util.cc:

PS3, Line 295: uint64_t block_length = rand_.Uniform(fs_block_size * 4);
             :   block_length -= block_length % block_id_str_len;
             :   uint8_t data[block_length];
             :   for (int i = 0; i < ARRAYSIZE(data); i += block_id_str_len) {
             :     memcpy(&data[i], block_id_str.data(), block_id_str_len);
             :   }
it's ok if this writes a 0 sized block, right?


http://gerrit.cloudera.org:8080/#/c/6715/3/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS3, Line 782: Need to reopen the block manager after each corruption.
why?


-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6715/3/src/kudu/fs/log_block_manager-test-util.cc
File src/kudu/fs/log_block_manager-test-util.cc:

PS3, Line 288: id
maybe say stringified id? at first i expected to just have the blockid in binary format (int64s). In fact would that be simpler? we always know it'll be int64


http://gerrit.cloudera.org:8080/#/c/6715/3/src/kudu/fs/log_block_manager-test-util.h
File src/kudu/fs/log_block_manager-test-util.h:

PS3, Line 66: sequences of its size
of its block id?


-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged.

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................


KUDU-1978: avoid corruption when deleting misaligned blocks

The gist of the problem: when punching out blocks, we don't actually know
where they end. The existing code assumes that proper alignment is always
maintained, but when faced with a repeated sequence of blocks misaligned on
both start and end offsets, this may not be true. It is conceivable (though
not certain) that KUDU-1793 can cause such sequences.

To fix, we could maintain per-block end offsets, but that'd be expensive for
such a rare case (remember: we have millions of blocks). Instead, we'll look
for blocks misaligned on their start offsets and skip the align up step when
punching them out. This means we won't reclaim all of their space, but it's
better than corrupting data belonging to the next block.

A black box reproduction of KUDU-1793 is virtually impossible, so here's a
white box implementation instead, via augmentation of the LBMCorruptor's
misaligned block generator. Without the fix (though with the removal of the
DCHECK_EQ), the new test fails 100% of the time. With the fix, it hasn't
failed in 1000 runs.

Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Reviewed-on: http://gerrit.cloudera.org:8080/6715
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
---
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
4 files changed, 180 insertions(+), 42 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6715

to look at the new patch set (#4).

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................

KUDU-1978: avoid corruption when deleting misaligned blocks

The gist of the problem: when punching out blocks, we don't actually know
where they end. The existing code assumes that proper alignment is always
maintained, but when faced with a repeated sequence of blocks misaligned on
both start and end offsets, this may not be true. It is conceivable (though
not certain) that KUDU-1793 can cause such sequences.

To fix, we could maintain per-block end offsets, but that'd be expensive for
such a rare case (remember: we have millions of blocks). Instead, we'll look
for blocks misaligned on their start offsets and skip the align up step when
punching them out. This means we won't reclaim all of their space, but it's
better than corrupting data belonging to the next block.

A black box reproduction of KUDU-1793 is virtually impossible, so here's a
white box implementation instead, via augmentation of the LBMCorruptor's
misaligned block generator. Without the fix (though with the removal of the
DCHECK_EQ), the new test fails 100% of the time. With the fix, it hasn't
failed in 1000 runs.

Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
---
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test-util.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
4 files changed, 180 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/6715/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6715/3/src/kudu/fs/log_block_manager-test-util.cc
File src/kudu/fs/log_block_manager-test-util.cc:

PS3, Line 288: id
> maybe say stringified id? at first i expected to just have the blockid in b
Done


PS3, Line 295: uint64_t block_length = rand_.Uniform(fs_block_size * 4);
             :   block_length -= block_length % block_id_str_len;
             :   uint8_t data[block_length];
             :   for (int i = 0; i < ARRAYSIZE(data); i += block_id_str_len) {
             :     memcpy(&data[i], block_id_str.data(), block_id_str_len);
             :   }
> it's ok if this writes a 0 sized block, right?
Yes, as long as the data file is fully preallocated up to the previous block (see length_beyond_eof).


http://gerrit.cloudera.org:8080/#/c/6715/3/src/kudu/fs/log_block_manager-test-util.h
File src/kudu/fs/log_block_manager-test-util.h:

PS3, Line 66: sequences of its size
> of its block id?
Whoops, that was from when I was writing the size. Done.


http://gerrit.cloudera.org:8080/#/c/6715/3/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS3, Line 782: Need to reopen the block manager after each corruption.
> why?
Clarified in the comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1978: avoid corruption when deleting misaligned blocks

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1978: avoid corruption when deleting misaligned blocks
......................................................................


Patch Set 4: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6ed6e349ab10d8d04722cd7ef99e7a06554f51f1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No