Posted to reviews@kudu.apache.org by "Grant Henke (Code Review)" <ge...@cloudera.org> on 2018/03/14 01:20:33 UTC

[kudu-CR](branch-1.7.x) KUDU-2153. Servers should not delete tmp files until after locking directories

Grant Henke has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9622 )


Change subject: KUDU-2153. Servers should not delete tmp files until after locking directories
......................................................................

KUDU-2153. Servers should not delete tmp files until after locking directories

This changes the order of FsManager startup so that tmp files are not
cleaned up until the data directories have been locked successfully.
This prevents sequences of events such as:

- a tserver is already running on some host and is in the middle of
  consensus voting, so it has created a tmp file.
- someone accidentally attempts to start a second tserver with the same
  set of data dirs. Prior to this patch, the second tserver would delete
  the tmp file before realizing that it could not lock its data dirs and
  aborting.
- the original tserver then crashes or otherwise gets very confused
  because the tmp file it just wrote is gone.
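
The ordering described above can be illustrated with a small standalone
sketch. This is not the FsManager code from the patch: it assumes a
POSIX system, uses flock() on a per-directory lock file as a stand-in
for the block manager's instance-file locking, and every name in it
(kLockFileName, kTmpInfix, LockDataDirs, CleanTmpFiles, StartUp,
ReleaseLocks) is hypothetical and chosen only for illustration.

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical names, used only for this illustration.
constexpr const char* kLockFileName = "instance.lock";
constexpr const char* kTmpInfix = ".tmp";

// Closes every held descriptor, which also drops its flock() lock.
void ReleaseLocks(std::vector<int>* held_fds) {
  assert(held_fds != nullptr);
  for (int fd : *held_fds) close(fd);
  held_fds->clear();
}

// Takes an exclusive, non-blocking lock on the lock file of each dir.
// Returns true only if every lock was acquired. Descriptors of acquired
// locks are appended to held_fds and must stay open while the locks are
// needed.
bool LockDataDirs(const std::vector<fs::path>& dirs,
                  std::vector<int>* held_fds) {
  assert(held_fds != nullptr);
  for (const fs::path& dir : dirs) {
    const fs::path lock_path = dir / kLockFileName;
    int fd = open(lock_path.c_str(), O_RDWR | O_CREAT, 0644);
    if (fd < 0) return false;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
      close(fd);  // Someone else holds the lock on this dir.
      return false;
    }
    held_fds->push_back(fd);
  }
  return true;
}

// Deletes regular files whose name contains kTmpInfix. Returns the
// number of files removed.
std::size_t CleanTmpFiles(const std::vector<fs::path>& dirs) {
  std::size_t removed = 0;
  for (const fs::path& dir : dirs) {
    std::vector<fs::path> victims;
    for (const auto& entry : fs::directory_iterator(dir)) {
      const std::string name = entry.path().filename().string();
      if (entry.is_regular_file() &&
          name.find(kTmpInfix) != std::string::npos) {
        victims.push_back(entry.path());
      }
    }
    for (const fs::path& p : victims) {
      if (fs::remove(p)) ++removed;
    }
  }
  return removed;
}

// Startup in the order this change argues for: lock first, clean second.
// If locking fails, no tmp file is touched and no lock is left held.
bool StartUp(const std::vector<fs::path>& dirs,
             std::vector<int>* held_fds) {
  if (!LockDataDirs(dirs, held_fds)) {
    ReleaseLocks(held_fds);
    return false;
  }
  CleanTmpFiles(dirs);
  return true;
}
```

With this ordering, a second StartUp() call on directories that are
already locked returns false before CleanTmpFiles() runs, so the tmp
files of the running server are left alone.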

This patch relies on the locks on the block manager instance files to
provide exclusive access to some storage that is not related to the
block manager, such as the consensus meta. That means it's still
possible to hit the above issue by starting servers with disjoint sets
of data dirs but the same meta dir. However, the patch is still a net
improvement because the most likely scenario is that the two servers
are started with identical configurations.

This patch also removes the block_manager_lock_dirs flag, which was
apparently unused. It was always marked as 'unsafe', so removing it
without a deprecation period is not a compatibility issue.

Change-Id: I3a3471c8ce00e77fa1712ea518f6ab281864a08d
Reviewed-on: http://gerrit.cloudera.org:8080/9596
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Todd Lipcon <to...@apache.org>
(cherry picked from commit d88e5772ee323c8a305250c7d0aa0b49f67475dc)
---
M src/kudu/fs/block_manager.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/util/env_posix.cc
6 files changed, 49 insertions(+), 24 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/9622/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9622
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.7.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3a3471c8ce00e77fa1712ea518f6ab281864a08d
Gerrit-Change-Number: 9622
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR](branch-1.7.x) KUDU-2153. Servers should not delete tmp files until after locking directories

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9622 )

Change subject: KUDU-2153. Servers should not delete tmp files until after locking directories
......................................................................

Change-Id: I3a3471c8ce00e77fa1712ea518f6ab281864a08d
Reviewed-on: http://gerrit.cloudera.org:8080/9596
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Todd Lipcon <to...@apache.org>
(cherry picked from commit d88e5772ee323c8a305250c7d0aa0b49f67475dc)
Reviewed-on: http://gerrit.cloudera.org:8080/9622
Reviewed-by: Grant Henke <gr...@gmail.com>
Tested-by: Grant Henke <gr...@gmail.com>
---
M src/kudu/fs/block_manager.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/util/env_posix.cc
6 files changed, 49 insertions(+), 24 deletions(-)

Approvals:
  Grant Henke: Looks good to me, approved; Verified


Gerrit-Project: kudu
Gerrit-Branch: branch-1.7.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I3a3471c8ce00e77fa1712ea518f6ab281864a08d
Gerrit-Change-Number: 9622
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR](branch-1.7.x) KUDU-2153. Servers should not delete tmp files until after locking directories

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9622 )

Change subject: KUDU-2153. Servers should not delete tmp files until after locking directories
......................................................................


Patch Set 1: Code-Review+2



Gerrit-Project: kudu
Gerrit-Branch: branch-1.7.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a3471c8ce00e77fa1712ea518f6ab281864a08d
Gerrit-Change-Number: 9622
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 14 Mar 2018 01:20:41 +0000
Gerrit-HasComments: No

[kudu-CR](branch-1.7.x) KUDU-2153. Servers should not delete tmp files until after locking directories

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9622 )

Change subject: KUDU-2153. Servers should not delete tmp files until after locking directories
......................................................................


Patch Set 1: Verified+1



Gerrit-Project: kudu
Gerrit-Branch: branch-1.7.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a3471c8ce00e77fa1712ea518f6ab281864a08d
Gerrit-Change-Number: 9622
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 14 Mar 2018 23:13:51 +0000
Gerrit-HasComments: No

[kudu-CR](branch-1.7.x) KUDU-2153. Servers should not delete tmp files until after locking directories

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has removed a vote on this change.

Change subject: KUDU-2153. Servers should not delete tmp files until after locking directories
......................................................................


Removed Verified-1 by Kudu Jenkins (120)

Gerrit-Project: kudu
Gerrit-Branch: branch-1.7.x
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I3a3471c8ce00e77fa1712ea518f6ab281864a08d
Gerrit-Change-Number: 9622
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>