Posted to issues@kudu.apache.org by "YangSong (Jira)" <ji...@apache.org> on 2019/10/31 09:26:00 UTC
[jira] [Issue Comment Deleted] (KUDU-2975) Spread WAL across multiple data directories
[ https://issues.apache.org/jira/browse/KUDU-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
YangSong updated KUDU-2975:
---------------------------
Comment: was deleted
(was: Thank you, let me summarize the implementation:
# We need to add a new gflag, such as {{--fs_wal_dirs}}, to support spreading the WAL across multiple directories, and we should keep {{--fs_wal_dir}} for backwards compatibility. Users can choose either one.
# The first time 'fs_manager' is initialized, it needs to generate an instance file per WAL directory. If the data directories (fs_data_dirs) are not provided, we use the write-ahead log directories (fs_wal_dirs) as data directories. If the metadata directory is not provided, we use the first WAL directory or the first data directory. If one of the WAL directories doesn't exist, report a fatal error. If some of the WAL directories have an 'instance' file but others do not, report a fatal error.
# Add a class WalDirManager, maybe like this:
class WalDirManager {
 public:
  // Create() formats new WAL roots; Open() loads existing ones.
  static Status Create(CanonicalizedRootsList wal_fs_roots,
                       std::unique_ptr<WalDirManager>* wal_manager);
  static Status Open(CanonicalizedRootsList wal_fs_roots,
                     std::unique_ptr<WalDirManager>* wal_manager);
  ~WalDirManager();
  void Shutdown();

  // Tablet <-> WAL directory bookkeeping.
  Status LoadWalDirFromPB(const std::string& tablet_id, const WalDirPB& pb);
  std::set<std::string> FindTabletsByWALDir(const std::string& wal_dir) const;
  Status FindWalDirByTabletId(const std::string& tablet_id, std::string* wal_dir) const;
  Status CreateWalDir(const std::string& tablet_id);

  // Failure tracking.
  Status MarkWalDirsFailed(const std::string& error_message = "");
  void MarkWalDirFailed(const std::string& dir);
  bool IsWalDirFailed(const std::string& dir) const;
  std::set<std::string> GetFailedDataDirs() const;

  std::vector<std::string> GetWalDirs() const;
  std::string GetWalDirByUuid(const std::string& uuid) const;

 private:
  explicit WalDirManager(CanonicalizedRootsList canonicalized_wal_roots);

  const CanonicalizedRootsList canonicalized_wal_fs_roots_;

  // WAL directory uuid -> directory.
  typedef std::unordered_map<std::string, std::string> DirByUuidMap;
  DirByUuidMap dir_by_uuid_;

  // WAL directory -> tablets placed in it.
  typedef std::multimap<std::string, std::string> TabletsByDirMap;
  TabletsByDirMap tablets_by_dir_;

  typedef std::set<std::string> FailedWalDirSet;
  FailedWalDirSet failed_data_dirs_;
};
We need to update the "instance" file in each WAL dir when creating a new WalDirManager. Each WAL directory generates its own uuid and records it in its instance file. The directory structure may look like this:
--wal
----instance
# adf
# asdfadf
# dasf
)
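The startup rules in item 2 of the deleted comment can be sketched as a small, filesystem-free function. This is only an illustration under stated assumptions, not Kudu code: WalRootState, ResolvedDirs and ResolveDirs are hypothetical names, errors are returned as plain strings rather than Kudu's Status, and where the comment says "the first WAL directory or the first data directory" the sketch arbitrarily picks the first WAL directory.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical snapshot of one WAL root as observed at startup.
struct WalRootState {
  std::string path;
  bool exists;        // the directory is present on disk
  bool has_instance;  // an 'instance' file was found inside it
};

// Hypothetical outcome of resolving the directory flags.
struct ResolvedDirs {
  std::vector<std::string> wal_dirs;
  std::vector<std::string> data_dirs;
  std::string metadata_dir;
  bool needs_format;  // true when no WAL root has an instance file yet
};

// Returns an empty string on success and fills *out; otherwise returns a
// description of the fatal error and leaves *out untouched.
std::string ResolveDirs(const std::vector<WalRootState>& wal_roots,
                        const std::vector<std::string>& data_dirs,
                        const std::string& metadata_dir,
                        ResolvedDirs* out) {
  if (wal_roots.empty()) {
    return "no WAL directories provided";
  }

  ResolvedDirs result;
  std::size_t with_instance = 0;
  for (const WalRootState& root : wal_roots) {
    if (!root.exists) {
      return "WAL directory does not exist: " + root.path;
    }
    if (root.has_instance) {
      ++with_instance;
    }
    result.wal_dirs.push_back(root.path);
  }

  // Either every WAL root is already formatted or none is; a mix is fatal.
  if (with_instance != 0 && with_instance != wal_roots.size()) {
    return "some WAL directories have an instance file and some do not";
  }
  result.needs_format = (with_instance == 0);

  // Fall back to the WAL directories when no data directories are given.
  result.data_dirs = data_dirs.empty() ? result.wal_dirs : data_dirs;

  // Assumption: default the metadata directory to the first WAL directory.
  result.metadata_dir =
      metadata_dir.empty() ? result.wal_dirs.front() : metadata_dir;

  *out = result;
  return std::string();
}
```

Keeping the check independent of the filesystem makes the "all or none have an instance file" rule easy to unit test; a real implementation would presumably gather the exists/has_instance bits through Kudu's Env layer first.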
> Spread WAL across multiple data directories
> -------------------------------------------
>
> Key: KUDU-2975
> URL: https://issues.apache.org/jira/browse/KUDU-2975
> Project: Kudu
> Issue Type: New Feature
> Components: fs, tablet, tserver
> Reporter: LiFu He
> Priority: Major
> Attachments: network.png, tserver-WARNING.png, util.png
>
>
> Recently, we deployed a new Kudu cluster in which every node has 12 SSDs. Then we created a big table and loaded data into it through Flink. We noticed that the utilization of the one SSD used to store the WAL is 100% while the others are idle. So we suggest spreading the WAL across multiple data directories.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)