You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2018/08/10 20:24:44 UTC

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Hello Alexey Serbin, Todd Lipcon,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/11190

to review the following change.


Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................

file_cache: stop using sync_on_close when reopening evicted files

When the FileCache was originally written, PosixRWFile had a pending_sync_
member which tracked whether or not a file was dirty and needed to be
synced. However, with its removal in commit 9a07b2fed, sync_on_close=true
should no longer be necessary for correctness.

While I was there, I cleaned up env_posix.cc a bit, including constifying
an fd_ member and replacing its "is the file closed?" semantic with a new
dedicated boolean.

Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
---
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/file_cache.cc
3 files changed, 54 insertions(+), 46 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/11190/1
-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11190 )

Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@11
PS2, Line 11: However, with its removal in commit 9a07b2fed, sync_on_close=true
            : should no longer be necessary for correctness.
> hm, I'm having trouble paging this back in.
Yes, that is the expectation, but I can't imagine how it would have been safe to rely on the previous state of affairs for durability. Suppose you were using the file cache but your fileset was small enough that files were never evicted; if you didn't call Sync yourself, you didn't get durability.

So sure, I can add a comment, but the new semantics don't seem any different from the non file-cached world: if you want durability, you need to Sync (or you need to open your files with sync_on_close=true, but the file cache doesn't let you do that).


http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@16
PS2, Line 16: An unnecessary fdatasync system call should no-op, but maybe
            : that's not the case on older kernels?
> hm, that's interesting. Do you think maybe the file had been modified by so
Yeah that's a conceivable explanation. Or maybe on old kernels an fdatasync was always at least a seek, and since I was hammering this particular server's disk, these "no-op" fdatasyncs were affected.



-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 10 Aug 2018 22:25:04 +0000
Gerrit-HasComments: Yes

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11190 )

Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@11
PS2, Line 11: However, with its removal in commit 9a07b2fed, sync_on_close=true
            : should no longer be necessary for correctness.
> I think it was some case in which a user was getting a "file not found" err
Right, I remember that case. I don't think that machine ever crashed though, so the kernel would have eventually synced the metadata even if we omitted an fdatasync call.



-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 15 Aug 2018 19:02:52 +0000
Gerrit-HasComments: Yes

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11190 )

Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@11
PS2, Line 11: However, with its removal in commit 9a07b2fed, sync_on_close=true
            : should no longer be necessary for correctness.
> I don't think we ever relied on sync_on_close, at least not in the block ma
I think it was some case in which a user was getting a "file not found" error about a particular block container after they'd had a full disk -- that means that we managed to sync the metadata but didn't sync the file itself and it disappeared. I didn't get to the root of that one though - just saw it in passing.



-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 15 Aug 2018 17:36:22 +0000
Gerrit-HasComments: Yes

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11190 )

Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................

file_cache: stop using sync_on_close when reopening evicted files

When the FileCache was originally written, PosixRWFile had a pending_sync_
member which tracked whether or not a file was dirty and needed to be
synced. However, with its removal in commit 9a07b2fed, sync_on_close=true
should no longer be necessary for correctness.

I only noticed this because of a "Time spent sync" log message emitted while
running 'kudu fs dump', which was perplexing seeing as this CLI invocation
is read-only. An unnecessary fdatasync system call should no-op, but maybe
that's not the case on older kernels?

While I was there, I cleaned up env_posix.cc a bit, including constifying
an fd_ member and replacing its "is the file closed?" semantic with a new
dedicated boolean.

Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Reviewed-on: http://gerrit.cloudera.org:8080/11190
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/file_cache.cc
3 files changed, 54 insertions(+), 46 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Todd Lipcon: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11190 )

Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@11
PS2, Line 11: However, with its removal in commit 9a07b2fed, sync_on_close=true
            : should no longer be necessary for correctness.
hm, I'm having trouble paging this back in.

So, now we no longer ever rely on 'sync_on_close' and for file-cached items we need to make sure that any code that expects them to be synced explicitly calls Sync? Should we note that somewhere in the file_cache.h docs?


http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@16
PS2, Line 16: An unnecessary fdatasync system call should no-op, but maybe
            : that's not the case on older kernels?
hm, that's interesting. Do you think maybe the file had been modified by some other program (eg a concurrently running tserver?)



-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 10 Aug 2018 21:35:31 +0000
Gerrit-HasComments: Yes

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11190 )

Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@11
PS2, Line 11: However, with its removal in commit 9a07b2fed, sync_on_close=true
            : should no longer be necessary for correctness.
> Yes, that is the expectation, but I can't imagine how it would have been sa
Yea, that makes sense. I forgot that the file cache doesn't allow a way for an "opener" to specify sync_on_close.

Recently I've seen a few cases where it oddly looked like blocks weren't getting fsynced where I thought they might have. Is it possible we are missing Sync somewhere because in a pre-filecache-world we relied on sync_on_close?


http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@16
PS2, Line 16: An unnecessary fdatasync system call should no-op, but maybe
            : that's not the case on older kernels?
> Yeah that's a conceivable explanation. Or maybe on old kernels an fdatasync
possible, yea



-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 10 Aug 2018 22:28:17 +0000
Gerrit-HasComments: Yes

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11190

to look at the new patch set (#2).

Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................

file_cache: stop using sync_on_close when reopening evicted files

When the FileCache was originally written, PosixRWFile had a pending_sync_
member which tracked whether or not a file was dirty and needed to be
synced. However, with its removal in commit 9a07b2fed, sync_on_close=true
should no longer be necessary for correctness.

I only noticed this because of a "Time spent sync" log message emitted while
running 'kudu fs dump', which was perplexing seeing as this CLI invocation
is read-only. An unnecessary fdatasync system call should no-op, but maybe
that's not the case on older kernels?

While I was there, I cleaned up env_posix.cc a bit, including constifying
an fd_ member and replacing its "is the file closed?" semantic with a new
dedicated boolean.

Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
---
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/file_cache.cc
3 files changed, 54 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/11190/2
-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] file cache: stop using sync on close when reopening evicted files

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11190 )

Change subject: file_cache: stop using sync_on_close when reopening evicted files
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11190/2//COMMIT_MSG@11
PS2, Line 11: However, with its removal in commit 9a07b2fed, sync_on_close=true
            : should no longer be necessary for correctness.
> Yea, that makes sense. I forgot that the file cache doesn't allow a way for
I don't think we ever relied on sync_on_close, at least not in the block managers. Before the file cache files opened by the block managers were never actually closed (and we'd run out of fds in larger deployments).

Can you share more details on what you saw?



-- 
To view, visit http://gerrit.cloudera.org:8080/11190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845f39ea202ec0cb46ce8e841c6cfb6c3a8dad2c
Gerrit-Change-Number: 11190
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 10 Aug 2018 22:31:13 +0000
Gerrit-HasComments: Yes