Posted to user@hadoop.apache.org by Brahma Reddy Battula <br...@hotmail.com> on 2017/07/01 00:05:33 UTC

Re: Ensure High Availability of Datanodes in a HDFS cluster

1. Yes, those properties will ensure that the file is written to the nodes that are still available.


2.

BlockManager: defaultReplication         = 2

This is the default block replication that you configured on the server (NameNode). The actual number of replicas can be specified when the file is created; the default is used only when no replication factor is specified at create time.
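
For example, with the standard hdfs CLI (the path here is illustrative):

hdfs dfs -Ddfs.replication=2 -put localfile /tmp/datafile    # request 2 replicas at create time
hdfs dfs -setrep -w 2 /tmp/datafile                          # change the factor of an existing file
hdfs dfs -stat %r /tmp/datafile                              # print the file's current replication factor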



3. "dfs.replication" is client(in your case confluent kafka) side property.May be,you can cross check this configuration in kafka.



-Brahma Reddy Battula
________________________________
From: Nishant Verma <ni...@gmail.com>
Sent: Friday, June 30, 2017 7:50 PM
To: common-user@hadoop.apache.org
Subject: Ensure High Availability of Datanodes in a HDFS cluster


Hi

I have a two master and three datanode HDFS cluster setup. They are AWS EC2 instances.

I have to test High Availability of Datanodes, i.e., if a datanode dies during a load run while data is being written to HDFS, there should be no data loss. The two remaining datanodes that are still alive should take care of the data writes.

I have set the below properties in hdfs-site.xml. dfs.replication = 2 (so that if any one datanode dies, there is no issue of being unable to meet the replication factor)

dfs.client.block.write.replace-datanode-on-failure.policy = ALWAYS
dfs.client.block.write.replace-datanode-on-failure.enable = true
dfs.client.block.write.replace-datanode-on-failure.best-effort = true
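
A quick way to confirm which value a host's client configuration actually resolves to is the standard getconf command (run it on the host whose configuration you want to check):

hdfs getconf -confKey dfs.replication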



1 - Does setting up above properties suffice my Datanode High Availability? Or something else is needed? 2 - On dfs service startup, I do see below INFO on namenode logs:

2017-06-27 10:51:52,546 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication         = 2
2017-06-27 10:51:52,546 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication             = 512
2017-06-27 10:51:52,546 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication             = 1
2017-06-27 10:51:52,546 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams      = 2


But I still see that the files being created on HDFS have a replication factor of 3. Why is that so? This would hurt my Datanode High Availability.

-rw-r--r--   3 hadoopuser supergroup     247373 2017-06-29 09:36 /topics/testTopic/year=2017/month=06/day=29/hour=14/testTopic+210+0001557358+0001557452
-rw-r--r--   3 hadoopuser supergroup       1344 2017-06-29 08:33 /topics/testTopic/year=2017/month=06/day=29/hour=14/testTopic+228+0001432839+0001432850
-rw-r--r--   3 hadoopuser supergroup       3472 2017-06-29 09:03 /topics/testTopic/year=2017/month=06/day=29/hour=14/testTopic+228+0001432851+0001432881
-rw-r--r--   3 hadoopuser supergroup       2576 2017-06-29 08:33 /topics/testTopic/year=2017/month=06/day=29/hour=14/testTopic+23+0001236477+0001236499
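
For reference, the replication factor of files that were already written can be lowered afterwards with the standard setrep command (it recurses when given a directory; -w waits for the change to complete):

hdfs dfs -setrep -w 2 /topics/testTopic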


P.S. - My records are written to HDFS by the Confluent Kafka Connect HDFS Sink Connector.


Thanks

Nishant

Re: Ensure High Availability of Datanodes in a HDFS cluster

Posted by Philippe Kernévez <pk...@octo.com>.
Hi,

As you have a minimum of only 1 (minReplication=1, meaning no replication is enforced), your
infrastructure will not guarantee against losing data in case of failure.
If a user runs a command like this:

hdfs dfs -Ddfs.replication=1 -put localfile hdfsfile

the file will not be replicated, and that block will be permanently lost
when the datanode goes down.

You should set a minReplication of 2
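
For example, in hdfs-site.xml on the NameNode (the property is dfs.namenode.replication.min in current Hadoop releases, dfs.replication.min in older ones; verify against your version):

dfs.namenode.replication.min = 2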

Regards,
Philippe





-- 
Philippe Kernévez



Technical Director (Switzerland),
pkernevez@octo.com
+41 79 888 33 32

Find OCTO on OCTO Talk: http://blog.octo.com
OCTO Technology http://www.octo.ch