Posted to common-user@hadoop.apache.org by jianan hu <hu...@gmail.com> on 2014/05/11 04:55:33 UTC

Rack awareness and pipeline write

Hi everyone,

The HDFS documentation says: "For the common case, when the replication
factor is three, HDFS’s placement policy is to put one replica on one node
in the local rack, another on a node in a different (remote) rack, and the
last on a different node in the same remote rack."

Assume there are two racks, A and B. According to rack awareness, the first
replica is placed in rack A, and the other two replicas are placed in
rack B.

However, why not store the first and second replicas in the local rack (A),
and the last in a remote rack (B)? Both scenarios generate the same
network traffic. What is the disadvantage of doing that?
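
To make the traffic claim concrete, here is a minimal sketch that counts
cross-rack hops in a write pipeline for both layouts. The node names and the
helper cross_rack_hops are hypothetical and only model the two-rack example
above; this is not HDFS code, and it assumes the writer sits on the first
node of the pipeline.

```python
def cross_rack_hops(pipeline):
    """Count hops between consecutive pipeline nodes that cross a rack boundary.

    pipeline: list of (node, rack) tuples in write order. The writer is
    assumed to be co-located with the first node (an assumption of this sketch).
    """
    racks = [rack for _, rack in pipeline]
    return sum(1 for a, b in zip(racks, racks[1:]) if a != b)


# Layout described in the documentation: one local replica, two on the remote rack.
default_layout = [("a1", "A"), ("b1", "B"), ("b2", "B")]

# Layout proposed in the question: two local replicas, one on the remote rack.
alternative_layout = [("a1", "A"), ("a2", "A"), ("b1", "B")]

print(cross_rack_hops(default_layout))
print(cross_rack_hops(alternative_layout))
```

Under these assumptions both pipelines cross the rack boundary exactly once,
which matches the statement that the write traffic is the same.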

Thanks.

Best Regards,
Jianan

Re: Rack awareness and pipeline write

Posted by Stanley Shi <ss...@gopivotal.com>.
In some cases you may not find a third node to place the replica on.

Regards,
*Stanley Shi,*




Re: Rack awareness and pipeline write

Posted by Raj K Singh <ra...@gmail.com>.
It's fine to place two replicas on nodes in the local rack first and then the
third replica on a node in a different rack if the replica count is 3.
Now consider the scenario for replication factor 2: if it places these
two replicas on the same rack, you can lose all of your replicas when
that rack goes down.
Hope this clears up your doubt.
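
The rack-failure argument can be checked with a small sketch. The helper
names below are hypothetical and the model is deliberately simple: a layout
is just the list of racks holding each replica, and a rack failure removes
every replica on that rack. It is an illustration, not HDFS's placement code.

```python
def survives_rack_failure(replica_racks, failed_rack):
    """True if at least one replica lives outside the failed rack."""
    return any(rack != failed_rack for rack in replica_racks)


def survives_any_single_rack_failure(replica_racks):
    """True if the block stays available whichever single rack fails."""
    return all(
        survives_rack_failure(replica_racks, rack)
        for rack in set(replica_racks)
    )


# Replication factor 2: both replicas on one rack vs. spread over two racks.
print(survives_any_single_rack_failure(["A", "A"]))
print(survives_any_single_rack_failure(["A", "B"]))

# Replication factor 3: the documented layout and the one from the question.
print(survives_any_single_rack_failure(["A", "B", "B"]))
print(survives_any_single_rack_failure(["A", "A", "B"]))
```

In this model only the layout with both replicas on the same rack loses the
block when a single rack goes down; both three-replica layouts survive.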

::::::::::::::::::::::::::::::::::::::::
Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile  Tel: +91 (0)9899821370


