You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2019/12/04 00:20:47 UTC

[kudu-CR] log index: use RWFiles for IO

Hello Alexey Serbin, Andrew Wong,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/14822

to review the following change.


Change subject: log_index: use RWFiles for IO
......................................................................

log_index: use RWFiles for IO

To use LogIndex in FileCache, we need to do one of two things:
1. Add an mmap-based file abstraction to Env, to be used by LogIndex.
2. Rework LogIndex to use RWFile instead of memory mappings.

This patch implements option #2. Why? Because although memory mappings can
be used for zero copy IO, the LogIndex wasn't doing that, and more
importantly, failures during memory mapped IO are communicated via UNIX
signals, making it practically impossible for an application of Kudu's
complexity to recover from a WAL disk failure surfaced during log index IO,
a feature that is being actively worked on in KUDU-2975.

IO through mmap is identical to IO through RWFile (i.e. pwrite/pread) for
all other intents and purposes:
- Both can use ftruncate to grow the file's size while keeping it sparse.
- Both maintain holes in file sections that aren't written.
- Both go through the page cache for reads and writes.
- Both allow pages to be dirty before writing them out asynchronously.

Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index-test.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_index.h
5 files changed, 71 insertions(+), 95 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/14822/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Gerrit-Change-Number: 14822
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>

[kudu-CR] log index: use RWFiles for IO

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14822 )

Change subject: log_index: use RWFiles for IO
......................................................................

log_index: use RWFiles for IO

To use LogIndex in FileCache, we need to do one of two things:
1. Add an mmap-based file abstraction to Env, to be used by LogIndex.
2. Rework LogIndex to use RWFile instead of memory mappings.

This patch implements option #2. Why? Because although memory mappings can
be used for zero copy IO, the LogIndex wasn't doing that, and more
importantly, failures during memory mapped IO are communicated via UNIX
signals, making it practically impossible for an application of Kudu's
complexity to recover from a WAL disk failure surfaced during log index IO,
a feature that is being actively worked on in KUDU-2975.

IO through mmap is identical to IO through RWFile (i.e. pwrite/pread) for
all other intents and purposes:
- Both can use ftruncate to grow the file's size while keeping it sparse.
- Both maintain holes in file sections that aren't written.
- Both go through the page cache for reads and writes.
- Both allow pages to be dirty before writing them out asynchronously.

Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Reviewed-on: http://gerrit.cloudera.org:8080/14822
Reviewed-by: Alexey Serbin <as...@cloudera.com>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index-test.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_index.h
5 files changed, 72 insertions(+), 96 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Andrew Wong: Looks good to me, approved
  Kudu Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/14822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Gerrit-Change-Number: 14822
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] log index: use RWFiles for IO

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14822 )

Change subject: log_index: use RWFiles for IO
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG@15
PS1, Line 15: failures during memory mapped IO
> BTW, what kind of failures did we get there in Kudu?  Something like not en
Running out of disk space is one such example, though that's typically mitigated by --fs_wal_dir_reserved_bytes.

A more common example is a disk error that causes the filesystem to be remounted in read-only mode, at which case an attempt to memcpy() into the memory mapped region manifests as a SIGBUS.


http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG@20
PS1, Line 20: IO through RWFile
> Do we expect less or more memory allocated in this case compared with the c
In Kudu itself? A bit more per index chunk, as there's more bookkeeping in an RWFile vs. the existing memory-mapped chunk implementation.

In the kernel? I'd expect the same amount. In both cases, 4k pages worth of an underlying index chunk are loaded into the page cache as they are accessed, and the page cache may dump those pages (or, if dirty, write them out) on memory pressure.


http://gerrit.cloudera.org:8080/#/c/14822/1/src/kudu/consensus/log_index.cc
File src/kudu/consensus/log_index.cc:

http://gerrit.cloudera.org:8080/#/c/14822/1/src/kudu/consensus/log_index.cc@237
PS1, Line 237: "Unable to delete index chunk " << path;
> Maybe, add information from the Status object (i.e. s.ToString()) into the 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Gerrit-Change-Number: 14822
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 04 Dec 2019 17:37:09 +0000
Gerrit-HasComments: Yes

[kudu-CR] log index: use RWFiles for IO

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14822 )

Change subject: log_index: use RWFiles for IO
......................................................................


Patch Set 1: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG@15
PS1, Line 15: failures during memory mapped IO
BTW, what kind of failures did we get there in Kudu?  Something like not enough space, etc.?


http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG@20
PS1, Line 20: IO through RWFile
Do we expect less or more memory allocated in this case compared with the case when the IO was done via mmap-ed files?


http://gerrit.cloudera.org:8080/#/c/14822/1/src/kudu/consensus/log_index.cc
File src/kudu/consensus/log_index.cc:

http://gerrit.cloudera.org:8080/#/c/14822/1/src/kudu/consensus/log_index.cc@237
PS1, Line 237: "Unable to delete index chunk " << path;
Maybe, add information from the Status object (i.e. s.ToString()) into the message?



-- 
To view, visit http://gerrit.cloudera.org:8080/14822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Gerrit-Change-Number: 14822
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 04 Dec 2019 07:32:59 +0000
Gerrit-HasComments: Yes

[kudu-CR] log index: use RWFiles for IO

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14822

to look at the new patch set (#2).

Change subject: log_index: use RWFiles for IO
......................................................................

log_index: use RWFiles for IO

To use LogIndex in FileCache, we need to do one of two things:
1. Add an mmap-based file abstraction to Env, to be used by LogIndex.
2. Rework LogIndex to use RWFile instead of memory mappings.

This patch implements option #2. Why? Because although memory mappings can
be used for zero copy IO, the LogIndex wasn't doing that, and more
importantly, failures during memory mapped IO are communicated via UNIX
signals, making it practically impossible for an application of Kudu's
complexity to recover from a WAL disk failure surfaced during log index IO,
a feature that is being actively worked on in KUDU-2975.

IO through mmap is identical to IO through RWFile (i.e. pwrite/pread) for
all other intents and purposes:
- Both can use ftruncate to grow the file's size while keeping it sparse.
- Both maintain holes in file sections that aren't written.
- Both go through the page cache for reads and writes.
- Both allow pages to be dirty before writing them out asynchronously.

Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index-test.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_index.h
5 files changed, 72 insertions(+), 96 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/14822/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Gerrit-Change-Number: 14822
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] log index: use RWFiles for IO

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14822 )

Change subject: log_index: use RWFiles for IO
......................................................................


Patch Set 2: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Gerrit-Change-Number: 14822
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 04 Dec 2019 18:25:17 +0000
Gerrit-HasComments: No

[kudu-CR] log index: use RWFiles for IO

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14822 )

Change subject: log_index: use RWFiles for IO
......................................................................


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG@15
PS1, Line 15: failures during memory mapped IO
> Running out of disk space is one such example, though that's typically miti
I see.  That makes sense.


http://gerrit.cloudera.org:8080/#/c/14822/1//COMMIT_MSG@20
PS1, Line 20: IO through RWFile
> In Kudu itself? A bit more per index chunk, as there's more bookkeeping in 
SGTM, thanks.



-- 
To view, visit http://gerrit.cloudera.org:8080/14822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Gerrit-Change-Number: 14822
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 04 Dec 2019 18:15:27 +0000
Gerrit-HasComments: Yes