You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2017/05/01 16:45:18 UTC

[kudu-CR] WIP KUDU-1952 Remove round-robin for block placement

Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#13).

Change subject: WIP KUDU-1952 Remove round-robin for block placement
......................................................................

WIP KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's superblock
will not have a DataDirGroup. One will be generated containing all data
directories, as the tablet's data may already be spread across any
number of disks.

As this patch only addresses block placement, it does not itself
mitigate single-disk failure. Given this, and given the tradeoff between
I/O and failure disk-failure tolerance, the default behavior will be to
spread tablet data across all disks.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Marked WIP since I'd like to do some perf before pushing.

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
35 files changed, 696 insertions(+), 168 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>