Posted to mapreduce-user@hadoop.apache.org by max scalf <or...@gmail.com> on 2016/02/29 15:26:46 UTC

move specific data set to specific nodes..

hello all,

I have a general question about Hadoop.  We have a 20-node
cluster; each node has about 8TB of local storage, and our replication factor
is 3.  We plan to add 4 more nodes to the cluster as we are running low on
space.  I am sure I can add 4 more nodes with local storage of 20TB each,
instead of our regular 8TB.

My question is: is there any way to move a specific data set (let's call this
COLD data that is NOT being accessed anymore) to these specific nodes, as
these are storage-dense nodes?

If that is possible, how can we accomplish it? If not, how do I fully
utilize the 4 nodes that will have 20TB each?  When I talk about
the storage on each node, I mean RAW storage.

any ideas/thoughts ??
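
Since the figures above are RAW storage, a rough sketch of the usable capacity may help frame the question. It only divides raw capacity by the replication factor of 3 given in the post; the helper name usable_tb is made up for illustration, and the result ignores space reserved for non-DFS use, so treat it as an upper bound rather than a sizing guide.

```python
# Rough capacity arithmetic for the cluster described in the post.
# Assumption: usable space is simply raw space divided by the replication
# factor; reserved non-DFS space and temporary data are ignored.

def usable_tb(nodes, raw_tb_per_node, replication):
    """Return approximate usable capacity in TB for a group of nodes."""
    return nodes * raw_tb_per_node / replication

REPLICATION = 3

existing = usable_tb(20, 8, REPLICATION)   # the current 20 nodes with 8TB each
dense = usable_tb(4, 20, REPLICATION)      # the 4 planned nodes with 20TB each
total = existing + dense

if __name__ == "__main__":
    print(f"existing nodes: {existing:.1f} TB usable")
    print(f"dense nodes:    {dense:.1f} TB usable")
    print(f"whole cluster:  {total:.1f} TB usable")
```

If every replica of the cold data had to live on the 4 dense nodes alone, the "dense nodes" figure is roughly the most cold data they could hold.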

RE: move specific data set to specific nodes..

Posted by Matieu Bachant-Lagace <ma...@ubisoft.com>.
Hi Max,

You might want to look into this:

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html

Related JIRA: Heterogeneous Storage (HDFS-2832) <https://issues.apache.org/jira/browse/HDFS-2832>

I have not used it myself yet, but I plan to, as we have data that can be moved to low-performance machines for “archive” purposes.

Matieu

From: max scalf [mailto:oracle.blog3@gmail.com]
Sent: February 29, 2016 09:27
To: HDP mailing list <us...@hadoop.apache.org>
Subject: move specific data set to specific nodes..

hello all,

I have a general question about Hadoop.  We have a 20-node cluster; each node has about 8TB of local storage, and our replication factor is 3.  We plan to add 4 more nodes to the cluster as we are running low on space.  I am sure I can add 4 more nodes with local storage of 20TB each, instead of our regular 8TB.

My question is: is there any way to move a specific data set (let's call this COLD data that is NOT being accessed anymore) to these specific nodes, as these are storage-dense nodes?

If that is possible, how can we accomplish it? If not, how do I fully utilize the 4 nodes that will have 20TB each?  When I talk about the storage on each node, I mean RAW storage.

any ideas/thoughts ??
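
For anyone following the Archival Storage link above, here is a minimal sketch of the commands involved, not something tested on the cluster in this thread. It assumes the data directories on the dense nodes are tagged as ARCHIVE storage in dfs.datanode.data.dir, and that the Hadoop release in use provides the "hdfs storagepolicies" and "hdfs mover" commands described on that page (check your version). The path /data/cold and the helper name archival_commands are hypothetical. The script only prints the commands so they can be reviewed before anything is run.

```python
import shlex

def archival_commands(path, policy="COLD"):
    """Build the HDFS commands that tag a directory with a storage policy
    and then migrate its existing blocks. Nothing is executed here."""
    return [
        # Tag the directory; with COLD, all replicas are meant to sit on ARCHIVE storage.
        ["hdfs", "storagepolicies", "-setStoragePolicy", "-path", path, "-policy", policy],
        # Read the policy back to confirm it was applied.
        ["hdfs", "storagepolicies", "-getStoragePolicy", "-path", path],
        # Existing blocks are not moved by the policy alone; the mover relocates them.
        ["hdfs", "mover", "-p", path],
    ]

if __name__ == "__main__":
    # /data/cold is a made-up example path, not one from the thread.
    for cmd in archival_commands("/data/cold"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Note that this steers data by storage type rather than by host name: the cold data ends up on the dense nodes only because they are the ones exposing ARCHIVE storage.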
