You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2018/01/17 07:16:44 UTC

[kudu-CR] KUDU-2195 (part 1): always sync PBC-format metadata files

Hello Adar Dembo,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/9043

to review the following change.


Change subject: KUDU-2195 (part 1): always sync PBC-format metadata files
......................................................................

KUDU-2195 (part 1): always sync PBC-format metadata files

This changes the writing of metadata files to always fsync data before
renaming into place. Previously, we didn't do this in a few cases, most
notably for consensus metadata when WAL fsync is off (the default).

However, ext4 already has some automatic fsync when it detects the
commmon "write and rename-to-replace" paradigm used by a lot of
software. So, adding the explicit fsync isn't likely to slow down ext4
much.

More importantly, xfs does _not_ have this behavior, and we've recently
seen some issues on an xfs-enabled system where consensus metadata files
were unexpectedly zero-length or appear to have lost term changes,
causing tablets to fail to initialize. Initial investigation seems to
indicate that the hosts with these issues had experienced some hard
resets, lending further credence to the theory that it was due to lack
of syncing metadata.

Lots of good background reading can be found here (particularly in the
comments):
https://lwn.net/Articles/351422/

Change-Id: I4f0c911662b2ff35fcf3915790248ba85bf6026f
---
M src/kudu/consensus/consensus_meta.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/integration-tests/external_mini_cluster_fs_inspector.cc
M src/kudu/server/server_base.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
11 files changed, 48 insertions(+), 76 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/9043/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4f0c911662b2ff35fcf3915790248ba85bf6026f
Gerrit-Change-Number: 9043
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>

[kudu-CR] KUDU-2195 (part 1): always sync PBC-format metadata files

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9043 )

Change subject: KUDU-2195 (part 1): always sync PBC-format metadata files
......................................................................


Patch Set 1:

This was recently seen in the wild again. Usually it's people running XFS that experience a power outage who see 0-length cmeta files. We should consider adding a gflag for just the cmeta files so that people running with XFS have a band-aid.


-- 
To view, visit http://gerrit.cloudera.org:8080/9043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4f0c911662b2ff35fcf3915790248ba85bf6026f
Gerrit-Change-Number: 9043
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 09 Jan 2019 00:38:21 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2195 (part 1): always sync PBC-format metadata files

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9043 )

Change subject: KUDU-2195 (part 1): always sync PBC-format metadata files
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9043/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9043/1//COMMIT_MSG@15
PS1, Line 15: So, adding the explicit fsync isn't likely to slow down ext4
            : much.
> Could you test this hypothesis?
yea, will investigate this. Actually looking back on some benchmarks I ran and commented on at https://issues.apache.org/jira/browse/KUDU-2204 it seems like fsync costs more than expected even on ext4.



-- 
To view, visit http://gerrit.cloudera.org:8080/9043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4f0c911662b2ff35fcf3915790248ba85bf6026f
Gerrit-Change-Number: 9043
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 18 Jan 2018 00:25:41 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2195 (part 1): always sync PBC-format metadata files

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9043 )

Change subject: KUDU-2195 (part 1): always sync PBC-format metadata files
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9043/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9043/1//COMMIT_MSG@15
PS1, Line 15: So, adding the explicit fsync isn't likely to slow down ext4
            : much.
Could you test this hypothesis?


http://gerrit.cloudera.org:8080/#/c/9043/1/src/kudu/util/pb_util.cc
File src/kudu/util/pb_util.cc:

http://gerrit.cloudera.org:8080/#/c/9043/1/src/kudu/util/pb_util.cc@493
PS1, Line 493:   RETURN_NOT_OK_PREPEND(env->SyncDir(DirName(path)), "Failed to SyncDir() parent of " + path);
I think this part is the only real loss of expressivity in the patch, because we lose the ability to create several PB files in the same directory and "batch up" the syncing of the directory by only passing pb_util::SYNC in the last call. Of course, with the old code you'd need to sync all but the last newly created PB file out-of-band so this metaphor certainly wasn't perfect (and likely unused by Kudu). Probably not worth keeping.


http://gerrit.cloudera.org:8080/#/c/9043/1/src/kudu/util/pb_util.cc@993
PS1, Line 993:   // TODO(todd) consider using O_DIRECT to avoid the "write, fsync, rename, syncdir" dance.
I scanned the linked e-mail thread but it wasn't clear to me whether Ted's suggestion was specific to ext4 or applies to any POSIX filesystem (which would presumably include macOS as well). Do you know?



-- 
To view, visit http://gerrit.cloudera.org:8080/9043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4f0c911662b2ff35fcf3915790248ba85bf6026f
Gerrit-Change-Number: 9043
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Wed, 17 Jan 2018 21:52:53 +0000
Gerrit-HasComments: Yes