Posted to common-commits@hadoop.apache.org by br...@apache.org on 2018/09/11 04:46:22 UTC

[1/2] hadoop git commit: HDFS-13237. [Documentation] RBF: Mount points across multiple subclusters. Contributed Íñigo Goiri

Repository: hadoop
Updated Branches:
  refs/heads/branch-3.1 b6bc0f409 -> 77dd45646
  refs/heads/trunk 987d8191a -> 96892c469


HDFS-13237. [Documentation] RBF: Mount points across multiple subclusters. Contributed Íñigo Goiri


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/96892c46
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/96892c46
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/96892c46

Branch: refs/heads/trunk
Commit: 96892c469b16c5aaff1b7c42f66f820344256bc2
Parents: 987d819
Author: Brahma Reddy Battula <br...@apache.org>
Authored: Tue Sep 11 10:12:34 2018 +0530
Committer: Brahma Reddy Battula <br...@apache.org>
Committed: Tue Sep 11 10:12:34 2018 +0530

----------------------------------------------------------------------
 .../src/site/markdown/HDFSRouterFederation.md   | 26 ++++++++++++++++++++
 1 file changed, 26 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/96892c46/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
index 2f49587..edc9918 100644
--- a/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
+++ b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
@@ -214,6 +214,7 @@ Mount table permission can be set by following command:
 
 The option mode is UNIX-style permissions for the mount table. Permissions are specified in octal, e.g. 0755. By default, this is set to 0755.
 
+#### Quotas
Router-based federation supports global quotas at the mount table level. Mount table entries may span multiple subclusters, and the global quota is
accounted across these subclusters.
 
@@ -229,6 +230,31 @@ Ls command will show below information for each mount table entry:
     Source                    Destinations              Owner                     Group                     Mode                      Quota/Usage
     /path                     ns0->/path                root                      supergroup                rwxr-xr-x                 [NsQuota: 50/0, SsQuota: 100 B/0 B]
 
+#### Multiple subclusters
+A mount point can also map to multiple subclusters.
+For example, the following command creates a mount point that stores files in subclusters `ns1` and `ns2`:
+
+    [hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -add /data ns1,ns2 /data -order SPACE
+
+Listing `/data` shows all the folders and files in both subclusters.
+The order parameter determines where a new file or folder is created. It currently supports the following methods:
+
+* HASH: Follow consistent hashing in the first level. Deeper levels are placed in the same subcluster as their parent.
+* LOCAL: Try to write data in the local subcluster.
+* RANDOM: Pick a random subcluster. This is usually useful for balancing the load across subclusters. Folders are created in all subclusters.
+* HASH_ALL: Follow consistent hashing at all the levels. This approach tries to balance the reads and writes evenly across subclusters. Folders are created in all subclusters.
+* SPACE: Try to write data in the subcluster with the most available space. Folders are created in all subclusters.
+
+For the hash-based approaches, the difference is that HASH keeps all the files and folders within a folder in the same subcluster, while HASH_ALL spreads all files under a mount point.
+For example, given a HASH mount point for `/data/hash`, files and folders under `/data/hash/folder0` will all be in the same subcluster.
+On the other hand, a HASH_ALL mount point for `/data/hash_all` will spread files under `/data/hash_all/folder0` across all the subclusters for that mount point (subfolders will be created in all subclusters).
+
+RANDOM can be used for reading and writing data from/into different subclusters.
+The common use for this approach is to have the same data in multiple subclusters and balance the reads across subclusters.
+For example, if thousands of containers need to read the same data (e.g., a library), one can use RANDOM to read the data from any of the subclusters.
+
+Note that consistency of the data across subclusters is not guaranteed by the Router.
+
 ### Disabling nameservices
 
 To prevent access to a nameservice (subcluster), disable it from the federation.
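
The difference between HASH and HASH_ALL described in the added section can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `pick` and `resolve` helpers are hypothetical, and SHA-256 modulo the subcluster count stands in for the consistent hashing mentioned in the documentation. It is not the Router's actual resolver code.

```python
import hashlib


def pick(subclusters, key):
    """Map a key to one subcluster using a stable hash.

    Assumption: SHA-256 modulo the subcluster count is a simplified stand-in
    for consistent hashing; it is not what the Router implements.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return subclusters[int(digest, 16) % len(subclusters)]


def resolve(order, mount, subclusters, path):
    """Return the subcluster where a new file at `path` would land (toy model)."""
    relative = path[len(mount):].strip("/")
    if order == "HASH":
        # Only the first level below the mount point is hashed, so everything
        # under the same top-level folder follows that folder.
        return pick(subclusters, relative.split("/")[0])
    if order == "HASH_ALL":
        # The whole relative path is hashed, so sibling files can land in
        # different subclusters.
        return pick(subclusters, relative)
    raise ValueError("unsupported order in this sketch: " + order)


subclusters = ["ns1", "ns2"]
files = ["folder0/file%d" % i for i in range(100)]

hash_targets = {
    resolve("HASH", "/data/hash", subclusters, "/data/hash/" + f) for f in files
}
hash_all_targets = {
    resolve("HASH_ALL", "/data/hash_all", subclusters, "/data/hash_all/" + f)
    for f in files
}

print(sorted(hash_targets))      # one subcluster for all of folder0
print(sorted(hash_all_targets))  # files of folder0 spread over the subclusters
```

In this model, every file under `folder0` of the HASH mount point resolves to one subcluster, while the same files under the HASH_ALL mount point are distributed across the subclusters.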


---------------------------------------------------------------------
To unsubscribe, e-mail: common-commits-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-commits-help@hadoop.apache.org


[2/2] hadoop git commit: HDFS-13237. [Documentation] RBF: Mount points across multiple subclusters. Contributed Íñigo Goiri

Posted by br...@apache.org.
HDFS-13237. [Documentation] RBF: Mount points across multiple subclusters. Contributed Íñigo Goiri

(cherry picked from commit 96892c469b16c5aaff1b7c42f66f820344256bc2)


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/77dd4564
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/77dd4564
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/77dd4564

Branch: refs/heads/branch-3.1
Commit: 77dd4564612a38d47600c883318985511b83f65d
Parents: b6bc0f4
Author: Brahma Reddy Battula <br...@apache.org>
Authored: Tue Sep 11 10:12:34 2018 +0530
Committer: Brahma Reddy Battula <br...@apache.org>
Committed: Tue Sep 11 10:13:35 2018 +0530

----------------------------------------------------------------------
 .../src/site/markdown/HDFSRouterFederation.md   | 26 ++++++++++++++++++++
 1 file changed, 26 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/77dd4564/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
index c5bf5e1..62b1107 100644
--- a/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
+++ b/hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
@@ -214,6 +214,7 @@ Mount table permission can be set by following command:
 
 The option mode is UNIX-style permissions for the mount table. Permissions are specified in octal, e.g. 0755. By default, this is set to 0755.
 
+#### Quotas
Router-based federation supports global quotas at the mount table level. Mount table entries may span multiple subclusters, and the global quota is
accounted across these subclusters.
 
@@ -229,6 +230,31 @@ Ls command will show below information for each mount table entry:
     Source                    Destinations              Owner                     Group                     Mode                      Quota/Usage
     /path                     ns0->/path                root                      supergroup                rwxr-xr-x                 [NsQuota: 50/0, SsQuota: 100 B/0 B]
 
+#### Multiple subclusters
+A mount point can also map to multiple subclusters.
+For example, the following command creates a mount point that stores files in subclusters `ns1` and `ns2`:
+
+    [hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -add /data ns1,ns2 /data -order SPACE
+
+Listing `/data` shows all the folders and files in both subclusters.
+The order parameter determines where a new file or folder is created. It currently supports the following methods:
+
+* HASH: Follow consistent hashing in the first level. Deeper levels are placed in the same subcluster as their parent.
+* LOCAL: Try to write data in the local subcluster.
+* RANDOM: Pick a random subcluster. This is usually useful for balancing the load across subclusters. Folders are created in all subclusters.
+* HASH_ALL: Follow consistent hashing at all the levels. This approach tries to balance the reads and writes evenly across subclusters. Folders are created in all subclusters.
+* SPACE: Try to write data in the subcluster with the most available space. Folders are created in all subclusters.
+
+For the hash-based approaches, the difference is that HASH keeps all the files and folders within a folder in the same subcluster, while HASH_ALL spreads all files under a mount point.
+For example, given a HASH mount point for `/data/hash`, files and folders under `/data/hash/folder0` will all be in the same subcluster.
+On the other hand, a HASH_ALL mount point for `/data/hash_all` will spread files under `/data/hash_all/folder0` across all the subclusters for that mount point (subfolders will be created in all subclusters).
+
+RANDOM can be used for reading and writing data from/into different subclusters.
+The common use for this approach is to have the same data in multiple subclusters and balance the reads across subclusters.
+For example, if thousands of containers need to read the same data (e.g., a library), one can use RANDOM to read the data from any of the subclusters.
+
+Note that consistency of the data across subclusters is not guaranteed by the Router.
+
 ### Disabling nameservices
 
 To prevent access to a nameservice (subcluster), disable it from the federation.

