Posted to issues@kudu.apache.org by "YangSong (Jira)" <ji...@apache.org> on 2019/10/31 09:26:00 UTC

[jira] [Issue Comment Deleted] (KUDU-2975) Spread WAL across multiple data directories

     [ https://issues.apache.org/jira/browse/KUDU-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YangSong updated KUDU-2975:
---------------------------
    Comment: was deleted

(was: Thank you, let me summarize the implementation:
 # We need to add a new gflag, such as {{--fs_wal_dirs}}, to support spreading the WAL across multiple directories. We should keep {{--fs_wal_dir}} around for backwards compatibility; users can choose either one of them.
 # The first time 'fs_manager' is initialized, it needs to generate an instance file per WAL directory. If the data directories (fs_data_dirs) are not provided, we use the write-ahead log directories (fs_wal_dirs) as data directories. If the metadata directory is not provided, we use the first WAL directory or the first data directory. If one of the WAL directories doesn't exist, report a fatal error. If some of the WAL directories have an 'instance' file but others do not, report a fatal error.
 # Add a class WalDirManager, maybe like this:
class WalDirManager {
 public:
  // Creates the on-disk layout for new WAL roots, or opens existing ones.
  static Status Create(CanonicalizedRootsList wal_fs_roots,
                       std::unique_ptr<WalDirManager>* wal_manager);
  static Status Open(CanonicalizedRootsList wal_fs_roots,
                     std::unique_ptr<WalDirManager>* wal_manager);
  ~WalDirManager();
  void Shutdown();

  // Tablet <-> WAL directory bookkeeping.
  Status LoadWalDirFromPB(const std::string& tablet_id, const WalDirPB& pb);
  std::set<std::string> FindTabletsByWALDir(const std::string& wal_dir) const;
  Status FindWalDirByTabletId(const std::string& tablet_id, std::string* wal_dir) const;
  Status CreateWalDir(const std::string& tablet_id);

  // Failure tracking.
  Status MarkWalDirsFailed(const std::string& error_message = "");
  void MarkWalDirFailed(const std::string& dir);
  bool IsWalDirFailed(const std::string& dir) const;
  std::set<std::string> GetFailedDataDirs() const;

  std::vector<std::string> GetWalDirs() const;
  std::string GetWalDirByUuid(const std::string& uuid) const;

 private:
  explicit WalDirManager(CanonicalizedRootsList canonicalized_wal_roots);

  const CanonicalizedRootsList canonicalized_wal_fs_roots_;
  typedef std::unordered_map<std::string, std::string> DirByUuidMap;
  DirByUuidMap dir_by_uuid_;
  typedef std::multimap<std::string, std::string> TabletsByDirMap;
  TabletsByDirMap tablets_by_dir_;
  typedef std::set<std::string> FailedWalDirSet;
  FailedWalDirSet failed_data_dirs_;
};
We need to update the "instance" file under each WAL directory when creating a new WalDirManager. Each WAL directory generates its own UUID and records it in the instance file. The directory structure may look like this:
    --wal
        ----instance

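The start-up rules in item 2 could be sketched roughly as below. This is only an illustration, not Kudu code: FsRoots, WalDirState, WalDirsCheck, ResolveRoots and CheckWalDirs are hypothetical names, and falling back to the first WAL directory for metadata is an assumption (the comment above leaves open whether the first WAL or the first data directory is used).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical holder for the three directory settings (not a Kudu type).
struct FsRoots {
  std::vector<std::string> wal_dirs;
  std::vector<std::string> data_dirs;
  std::string metadata_dir;
};

// Applies the fallbacks: data dirs default to the WAL dirs, and the
// metadata dir defaults to the first WAL dir (assumed choice).
FsRoots ResolveRoots(FsRoots roots) {
  if (roots.wal_dirs.empty()) {
    throw std::invalid_argument("at least one WAL directory is required");
  }
  if (roots.data_dirs.empty()) roots.data_dirs = roots.wal_dirs;
  if (roots.metadata_dir.empty()) roots.metadata_dir = roots.wal_dirs.front();
  return roots;
}

// What the caller observed on disk for one WAL directory.
struct WalDirState {
  std::string path;
  bool exists;
  bool has_instance;
};

enum class WalDirsCheck {
  kCreateAll,            // first start: no directory has an instance file
  kOpenAll,              // every directory already has an instance file
  kFatalMissingDir,      // some WAL directory does not exist
  kFatalMixedInstances,  // only some directories have an instance file
};

// Decides whether to create instance files, open them, or fail.
WalDirsCheck CheckWalDirs(const std::vector<WalDirState>& dirs) {
  std::size_t with_instance = 0;
  for (const auto& dir : dirs) {
    if (!dir.exists) return WalDirsCheck::kFatalMissingDir;
    if (dir.has_instance) ++with_instance;
  }
  if (with_instance == 0) return WalDirsCheck::kCreateAll;
  if (with_instance == dirs.size()) return WalDirsCheck::kOpenAll;
  return WalDirsCheck::kFatalMixedInstances;
}
```

Keeping the decision separate from the filesystem probing makes the "some have an instance file, some do not" case easy to unit test.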
                

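To make the tablet-to-directory bookkeeping in the proposed WalDirManager more concrete, here is a minimal in-memory sketch. WalDirIndex and AssignTablet are hypothetical names, the methods return bool instead of Kudu's Status so the snippet stays self-contained, and picking the healthy directory with the fewest tablets is an assumed placement policy that the comment above does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical in-memory index; not the actual WalDirManager.
class WalDirIndex {
 public:
  explicit WalDirIndex(std::vector<std::string> wal_dirs)
      : wal_dirs_(std::move(wal_dirs)) {}

  // Assigns the tablet to the non-failed directory holding the fewest
  // tablets (assumed policy; ties go to the earliest directory).
  // Returns false if the tablet is already assigned or no directory is healthy.
  bool AssignTablet(const std::string& tablet_id, std::string* wal_dir) {
    if (dir_by_tablet_.count(tablet_id) > 0) return false;
    const std::string* best = nullptr;
    std::size_t best_load = std::numeric_limits<std::size_t>::max();
    for (const auto& dir : wal_dirs_) {
      if (failed_dirs_.count(dir) > 0) continue;
      const std::size_t load = tablets_by_dir_.count(dir);
      if (load < best_load) {
        best = &dir;
        best_load = load;
      }
    }
    if (best == nullptr) return false;
    dir_by_tablet_[tablet_id] = *best;
    tablets_by_dir_.emplace(*best, tablet_id);
    *wal_dir = *best;
    return true;
  }

  bool FindWalDirByTabletId(const std::string& tablet_id,
                            std::string* wal_dir) const {
    const auto it = dir_by_tablet_.find(tablet_id);
    if (it == dir_by_tablet_.end()) return false;
    *wal_dir = it->second;
    return true;
  }

  std::set<std::string> FindTabletsByWalDir(const std::string& wal_dir) const {
    std::set<std::string> tablets;
    const auto range = tablets_by_dir_.equal_range(wal_dir);
    for (auto it = range.first; it != range.second; ++it) {
      tablets.insert(it->second);
    }
    return tablets;
  }

  void MarkWalDirFailed(const std::string& dir) { failed_dirs_.insert(dir); }

  bool IsWalDirFailed(const std::string& dir) const {
    return failed_dirs_.count(dir) > 0;
  }

 private:
  std::vector<std::string> wal_dirs_;
  std::unordered_map<std::string, std::string> dir_by_tablet_;
  std::multimap<std::string, std::string> tablets_by_dir_;
  std::set<std::string> failed_dirs_;
};
```

A multimap keyed by directory keeps the reverse lookup cheap, which matters when a directory fails and every tablet whose WAL lives there has to be found.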
 )

> Spread WAL across multiple data directories
> -------------------------------------------
>
>                 Key: KUDU-2975
>                 URL: https://issues.apache.org/jira/browse/KUDU-2975
>             Project: Kudu
>          Issue Type: New Feature
>          Components: fs, tablet, tserver
>            Reporter: LiFu He
>            Priority: Major
>         Attachments: network.png, tserver-WARNING.png, util.png
>
>
> Recently, we deployed a new Kudu cluster in which every node has 12 SSDs. Then we created a big table and loaded data into it through Flink. We noticed that the utilization of the one SSD used to store the WAL was 100% while the others were idle. So, we suggest spreading the WAL across multiple data directories.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)