Posted to hdfs-user@hadoop.apache.org by Norah Jones <nh...@gmail.com> on 2015/10/27 13:02:38 UTC

How do I customize data placement on DataNodes (DN) of Hadoop cluster?

Hi,

Suppose we change the default block size to 32 MB and the replication factor
to 1, and the Hadoop cluster consists of 4 DNs. The input data size is 192 MB.
I want to place the data on the DNs as follows: DN1 and DN2 hold 2 blocks
(32 + 32 = 64 MB) each, and DN3 and DN4 hold 1 block (32 MB) each.

Is this possible? If so, how can I accomplish it?

Thanks,
Salil
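
As a quick check of the arithmetic in the question: a 192 MB input with 32 MB blocks yields six blocks, which matches the requested 2 + 2 + 1 + 1 layout. The short Python sketch below only verifies these numbers; the dictionary and the names DN1 to DN4 are illustrative and are not something HDFS reads.

```python
import math

BLOCK_SIZE_MB = 32
INPUT_SIZE_MB = 192

# Requested number of blocks per DataNode, as described in the question.
desired_layout = {"DN1": 2, "DN2": 2, "DN3": 1, "DN4": 1}

# A file is split into ceil(file size / block size) blocks.
num_blocks = math.ceil(INPUT_SIZE_MB / BLOCK_SIZE_MB)

# The requested layout is only feasible if it accounts for every block.
layout_total = sum(desired_layout.values())
print(num_blocks, layout_total)
```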

Re: How do I customize data placement on DataNodes (DN) of Hadoop cluster?

Posted by Norah Jones <nh...@gmail.com>.
Thanks,

Can you please explain in more detail?


On Tue, Oct 27, 2015 at 5:44 PM, praveen S <my...@gmail.com> wrote:

> May be Using rack concept might work
> On 27 Oct 2015 17:32, "Norah Jones" <nh...@gmail.com> wrote:
>
>> Hi,
>>
>> Let we change the default block size to 32 MB and replication factor to
>> 1. Let Hadoop cluster consists of 4 DNs. Let input data size is 192 MB. Now
>> I want to place data on DNs as following. DN1 and DN2 contain 2 blocks
>> (32+32 = 64 MB) each and DN3 and DN4 contain 1 block (32 MB) each.
>>
>> Can it be possible? How to accomplish it?
>>
>> Thanks,
>> Salil
>>
>>

RE: How do I customize data placement on DataNodes (DN) of Hadoop cluster?

Posted by "Naganarasimha G R (Naga)" <ga...@huawei.com>.
Hi Praveen and Salil,

If the data is written from one of the cluster nodes, preference is given to the local node regardless of the configured rack.
If it is written remotely (not from one of the cluster nodes), the blocks may get distributed across the DataNodes.
Further, you could implement a custom BlockPlacementPolicy by extending BlockPlacementPolicyDefault and setting "dfs.block.replicator.classname" to it, if required.
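
The decision such a custom policy would have to make can be sketched outside Hadoop. What follows is a plain Python simulation, not Hadoop code: `assign_blocks` and the `quotas` dictionary are hypothetical names chosen for illustration. A real policy would be a Java class extending BlockPlacementPolicyDefault, and the methods it overrides depend on the Hadoop version in use.

```python
from collections import Counter


def assign_blocks(num_blocks, quotas):
    """Return a list of node names, one per block, honouring per-node quotas.

    Nodes are visited round-robin; a node is skipped once its quota is used up.
    """
    if num_blocks > sum(quotas.values()):
        raise ValueError("quotas cannot hold all blocks")
    remaining = dict(quotas)
    nodes = list(quotas)
    placement = []
    i = 0
    while len(placement) < num_blocks:
        node = nodes[i % len(nodes)]
        if remaining[node] > 0:
            remaining[node] -= 1
            placement.append(node)
        i += 1
    return placement


# Layout from the original question: 6 blocks of 32 MB over 4 DataNodes.
quotas = {"DN1": 2, "DN2": 2, "DN3": 1, "DN4": 1}
placement = assign_blocks(6, quotas)
print(placement)
print(Counter(placement))
```

With replication factor 1, each block has exactly one target, so a per-node quota is enough to describe the desired layout in this sketch.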

+ Naga

________________________________
From: praveen S [mylogin13@gmail.com]
Sent: Tuesday, October 27, 2015 17:44
To: user@hadoop.apache.org
Subject: Re: How do I customize data placement on DataNodes (DN) of Hadoop cluster?


May be Using rack concept might work

On 27 Oct 2015 17:32, "Norah Jones" <nh...@gmail.com> wrote:
Hi,

Let we change the default block size to 32 MB and replication factor to 1. Let Hadoop cluster consists of 4 DNs. Let input data size is 192 MB. Now I want to place data on DNs as following. DN1 and DN2 contain 2 blocks (32+32 = 64 MB) each and DN3 and DN4 contain 1 block (32 MB) each.

Can it be possible? How to accomplish it?

Thanks,
Salil


Re: How do I customize data placement on DataNodes (DN) of Hadoop cluster?

Posted by praveen S <my...@gmail.com>.
Maybe using the rack concept would work.
On 27 Oct 2015 17:32, "Norah Jones" <nh...@gmail.com> wrote:

> Hi,
>
> Let we change the default block size to 32 MB and replication factor to 1.
> Let Hadoop cluster consists of 4 DNs. Let input data size is 192 MB. Now I
> want to place data on DNs as following. DN1 and DN2 contain 2 blocks (32+32
> = 64 MB) each and DN3 and DN4 contain 1 block (32 MB) each.
>
> Can it be possible? How to accomplish it?
>
> Thanks,
> Salil
>
>
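
For reference, rack awareness in Hadoop is usually configured by pointing the net.topology.script.file.name property at an executable that maps host names or IP addresses to rack paths. A minimal Python sketch of such a script follows; the host names and rack assignments are made up for illustration. Racks influence which rack a replica lands on rather than how many blocks a particular DataNode receives, so on its own this is unlikely to produce the 2/2/1/1 layout asked for.

```python
#!/usr/bin/env python3
import sys

# Hypothetical host-to-rack table; replace with the real cluster hosts.
RACKS = {
    "dn1.example.com": "/rack1",
    "dn2.example.com": "/rack1",
    "dn3.example.com": "/rack2",
    "dn4.example.com": "/rack2",
}
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Map each host to its rack path, falling back to the default rack."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hosts arrive as command-line arguments; one rack path is printed per
    # host, in the same order, separated by spaces.
    print(" ".join(resolve(sys.argv[1:])))
```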
