Posted to hdfs-user@hadoop.apache.org by Raghavendra Chandra <ra...@gmail.com> on 2014/09/19 14:16:26 UTC

Query regarding the replication factor in hadoop

Hi All,

I have a very basic query regarding the replication factor in HDFS.

Scenario:

I have a 4-node cluster: 3 data nodes and 1 master node.

The replication factor is 3, so ideally each data node gets one replica.

Assume that one of the data nodes then goes down, leaving only 2 data
nodes up.

Queries:

1. How will Hadoop balance the replicas when the required replica count
is 3 but only 2 data nodes are up and running?

2. What happens when we try to write new data into HDFS at this point?
Will the write succeed with only 2 data nodes and a replication factor of
3, or will it return an error?

These queries might be simple, but it would be really helpful if someone
could answer them.

Thanks and regards,
Raghav Chandra

Re: Query regarding the replication factor in hadoop

Posted by Shahab Yunus <sh...@gmail.com>.
Your write will not succeed. You will get an exception like "xxxx could
only be replicated to 0 nodes, instead of 1".

More details here:
http://www.bigdataplanet.info/2013/10/Hadoop-Tutorial-Part-4-Write-Operations-in-HDFS.html
http://cloudcelebrity.wordpress.com/2013/11/25/handling-hadoop-error-could-only-be-replicated-to-0-nodes-instead-of-1-during-copying-data-to-hdfs-or-with-mapreduce-jobs/
http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo
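
For illustration, here is a minimal sketch of a client-side write where
that exception would surface as an IOException (the NameNode URI and the
path below are placeholders, not values from any particular cluster):

import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; use your cluster's fs.defaultFS.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        try (FSDataOutputStream out = fs.create(new Path("/tmp/replication-test.txt"))) {
            out.writeUTF("hello hdfs");
        } catch (IOException e) {
            // If no DataNode can accept the block, the failure shows up here,
            // e.g. "... could only be replicated to 0 nodes, instead of 1".
            System.err.println("Write failed: " + e.getMessage());
        }
    }
}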


Regards,
Shahab

On Fri, Sep 19, 2014 at 8:16 AM, Raghavendra Chandra <
raghavchandra.learning@gmail.com> wrote:

> Hi All,
>
> I have a very basic query regarding the replication factor in HDFS.
>
> Scenario:
>
> I have a 4-node cluster: 3 data nodes and 1 master node.
>
> The replication factor is 3, so ideally each data node gets one replica.
>
> Assume that one of the data nodes then goes down, leaving only 2 data
> nodes up.
>
> Queries:
>
> 1. How will Hadoop balance the replicas when the required replica count
> is 3 but only 2 data nodes are up and running?
>
> 2. What happens when we try to write new data into HDFS at this point?
> Will the write succeed with only 2 data nodes and a replication factor
> of 3, or will it return an error?
>
> These queries might be simple, but it would be really helpful if someone
> could answer them.
>
> Thanks and regards,
> Raghav Chandra
>
>

Re: Query regarding the replication factor in hadoop

Posted by Shahab Yunus <sh...@gmail.com>.
Interesting. I thought that the write would fail if the number of data
nodes down meant the min-replication requirement could no longer be met.
So in reality we only get a warning while writing (and an info message
through fsck).
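
For reference, a quick way to inspect those settings from client code
might look like the sketch below (the property names are assumptions
based on the usual 1.x/2.x names and may differ on your cluster, and it
assumes hdfs-site.xml is on the classpath):

import org.apache.hadoop.conf.Configuration;

public class MinReplicationCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Pick up the HDFS settings; assumes hdfs-site.xml is on the classpath.
        conf.addResource("hdfs-site.xml");
        // The property name differs across versions: dfs.replication.min (1.x)
        // vs dfs.namenode.replication.min (2.x); both default to 1.
        int minRep = conf.getInt("dfs.namenode.replication.min",
                conf.getInt("dfs.replication.min", 1));
        int targetRep = conf.getInt("dfs.replication", 3);
        System.out.println("min replication = " + minRep
                + ", target replication = " + targetRep);
    }
}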

Regards,
Shahab

On Fri, Sep 19, 2014 at 9:26 AM, Abirami V <ab...@gmail.com> wrote:

> You will get under-replicated blocks and missing replicas reported when
> you run hdfs fsck /
>
> You may see info like the following:
>
> Under replicated blk_-4791859336845413240_1544. Target Replicas
> is 3 but found 2 replica(s).
>
>
>
> On Fri, Sep 19, 2014 at 5:36 AM, adarsh deshratnam <
> adarsh.deshratnam@gmail.com> wrote:
>
>> 1. *How will Hadoop balance the replicas when the required replica
>> count is 3 but only 2 data nodes are up and running?*
>>
>> *Ans:* The replication factor here is 3, but only 2 data nodes are
>> live, so each block is written with just 2 replicas (HDFS places at
>> most one replica of a block on any single data node) and the NameNode
>> marks the block as under-replicated. Once a third data node is
>> available again, the NameNode schedules re-replication to bring the
>> block back up to 3 replicas.
>>
>> *2. What happens when we try to write new data into HDFS at this
>> point? Will the write succeed with only 2 data nodes and a replication
>> factor of 3, or will it return an error?*
>> *Ans:* The write will succeed; the new blocks will simply remain
>> under-replicated until a third data node is available.
>>
>>
>> For further info, please refer to the link below:
>> http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
>>
>>
>> Thanks,
>> Adarsh D
>>
>> On Fri, Sep 19, 2014 at 5:46 PM, Raghavendra Chandra <
>> raghavchandra.learning@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I have a very basic query regarding the replication factor in HDFS.
>>>
>>> Scenario:
>>>
>>> I have a 4-node cluster: 3 data nodes and 1 master node.
>>>
>>> The replication factor is 3, so ideally each data node gets one
>>> replica.
>>>
>>> Assume that one of the data nodes then goes down, leaving only 2 data
>>> nodes up.
>>>
>>> Queries:
>>>
>>> 1. How will Hadoop balance the replicas when the required replica
>>> count is 3 but only 2 data nodes are up and running?
>>>
>>> 2. What happens when we try to write new data into HDFS at this
>>> point? Will the write succeed with only 2 data nodes and a
>>> replication factor of 3, or will it return an error?
>>>
>>> These queries might be simple, but it would be really helpful if
>>> someone could answer them.
>>>
>>> Thanks and regards,
>>> Raghav Chandra
>>>
>>>
>>
>

Re: Query regarding the replication factor in hadoop

Posted by Abirami V <ab...@gmail.com>.
You will get under-replicated blocks and missing replicas reported when
you run hdfs fsck /

You may see info like the following:

Under replicated blk_-4791859336845413240_1544. Target Replicas
is 3 but found 2 replica(s).
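
If you prefer to check from code rather than fsck, a rough sketch along
these lines (the NameNode URI and the path are placeholders) prints the
target replication next to the DataNodes actually holding each block:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicaCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode URI; use your cluster's fs.defaultFS.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        FileStatus status = fs.getFileStatus(new Path("/tmp/replication-test.txt"));
        System.out.println("Target replication: " + status.getReplication());
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            // With only 2 live DataNodes the hosts array will have 2 entries
            // even though the target replication is 3.
            System.out.println("Block at offset " + block.getOffset()
                    + " is stored on " + block.getHosts().length + " node(s)");
        }
    }
}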



On Fri, Sep 19, 2014 at 5:36 AM, adarsh deshratnam <
adarsh.deshratnam@gmail.com> wrote:

> 1. *How will Hadoop balance the replicas when the required replica
> count is 3 but only 2 data nodes are up and running?*
>
> *Ans:* The replication factor here is 3, but only 2 data nodes are
> live, so each block is written with just 2 replicas (HDFS places at
> most one replica of a block on any single data node) and the NameNode
> marks the block as under-replicated. Once a third data node is
> available again, the NameNode schedules re-replication to bring the
> block back up to 3 replicas.
>
> *2. What happens when we try to write new data into HDFS at this point?
> Will the write succeed with only 2 data nodes and a replication factor
> of 3, or will it return an error?*
> *Ans:* The write will succeed; the new blocks will simply remain
> under-replicated until a third data node is available.
>
>
> For further info, please refer to the link below:
> http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
>
>
> Thanks,
> Adarsh D
>
> On Fri, Sep 19, 2014 at 5:46 PM, Raghavendra Chandra <
> raghavchandra.learning@gmail.com> wrote:
>
>> Hi All,
>>
>> I have a very basic query regarding the replication factor in HDFS.
>>
>> Scenario:
>>
>> I have a 4-node cluster: 3 data nodes and 1 master node.
>>
>> The replication factor is 3, so ideally each data node gets one
>> replica.
>>
>> Assume that one of the data nodes then goes down, leaving only 2 data
>> nodes up.
>>
>> Queries:
>>
>> 1. How will Hadoop balance the replicas when the required replica
>> count is 3 but only 2 data nodes are up and running?
>>
>> 2. What happens when we try to write new data into HDFS at this point?
>> Will the write succeed with only 2 data nodes and a replication factor
>> of 3, or will it return an error?
>>
>> These queries might be simple, but it would be really helpful if
>> someone could answer them.
>>
>> Thanks and regards,
>> Raghav Chandra
>>
>>
>

Re: Query regarding the replication factor in hadoop

Posted by adarsh deshratnam <ad...@gmail.com>.
1. *How will Hadoop balance the replicas when the required replica count
is 3 but only 2 data nodes are up and running?*

*Ans:* The replication factor here is 3, but only 2 data nodes are live,
so each block is written with just 2 replicas (HDFS places at most one
replica of a block on any single data node) and the NameNode marks the
block as under-replicated. Once a third data node is available again,
the NameNode schedules re-replication to bring the block back up to 3
replicas.

*2. What happens when we try to write new data into HDFS at this point?
Will the write succeed with only 2 data nodes and a replication factor of
3, or will it return an error?*
*Ans:* The write will succeed; the new blocks will simply remain
under-replicated until a third data node is available.
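
If you want to manage this explicitly, a rough sketch (the NameNode URI
and path are placeholders) is to lower or restore a file's target
replication with setReplication and let the NameNode handle the rest,
much like hdfs dfs -setrep does:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetReplication {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode URI; use your cluster's fs.defaultFS.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        Path path = new Path("/tmp/replication-test.txt");
        // Drop the target to 2 while only 2 DataNodes are alive...
        fs.setReplication(path, (short) 2);
        // ...and raise it back to 3 once the third DataNode returns; the
        // NameNode then schedules re-replication in the background.
        fs.setReplication(path, (short) 3);
    }
}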


For further info, please refer to the link below:
http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html


Thanks,
Adarsh D

On Fri, Sep 19, 2014 at 5:46 PM, Raghavendra Chandra <
raghavchandra.learning@gmail.com> wrote:

> Hi All,
>
> I have a very basic query regarding the replication factor in HDFS.
>
> Scenario:
>
> I have a 4-node cluster: 3 data nodes and 1 master node.
>
> The replication factor is 3, so ideally each data node gets one replica.
>
> Assume that one of the data nodes then goes down, leaving only 2 data
> nodes up.
>
> Queries:
>
> 1. How will Hadoop balance the replicas when the required replica count
> is 3 but only 2 data nodes are up and running?
>
> 2. What happens when we try to write new data into HDFS at this point?
> Will the write succeed with only 2 data nodes and a replication factor
> of 3, or will it return an error?
>
> These queries might be simple, but it would be really helpful if someone
> could answer them.
>
> Thanks and regards,
> Raghav Chandra
>
>
