Posted to mapreduce-user@hadoop.apache.org by "Brian C. Huffman" <bh...@etinternational.com> on 2014/10/07 15:13:48 UTC

Re: Datanode disk considerations

What about setting dfs.datanode.fsdataset.volume.choosing.policy to
org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?

Would that help?
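For reference, the change would look roughly like the snippet below in hdfs-site.xml. The two tuning properties are the ones I believe accompany this policy; double-check the names and defaults against the hdfs-default.xml for your Hadoop version:

```xml
<!-- Prefer volumes with more available space when choosing where to write new blocks -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>

<!-- Volumes whose free space differs by less than this many bytes are treated as balanced -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value> <!-- 10 GB -->
</property>

<!-- Fraction of new blocks directed to the volumes with more free space -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>
```

Note this only affects placement of new blocks; it won't move existing blocks off the fuller disks.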

Regards,
Brian

On 08/06/2014 05:23 PM, Adam Faris wrote:
> The Hadoop balancer doesn’t balance data across the local drives; it balances data between datanodes on the grid, so running the balancer won’t rebalance the disks within a single datanode.
>
> The datanode process round-robins between data directories on local disk, so it’s not unexpected to see the smaller drive fill faster.  Typically people run the same size drives within each compute node to prevent this from happening.
>
> You could partition the 2TB drive into four 500GB partitions.  This isn’t optimal as you’ll have 4 write threads pointing at a single disk but is fairly simple to implement.  Otherwise you’ll want to physically rebuild your 4 nodes so each node has equal amounts of storage.
>
> I’d also like to suggest, while restructuring your local filesystem, that the tasktracker/nodemanager be given its own partition for writes.  If the tasktracker/nodemanager and datanode processes share a partition, mapper spills to disk will cause the reported HDFS space to shrink and grow, since the datanode reports back how much free space it has on its partitions.
>
> Good luck.
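If you do go the route Adam describes and split the 2TB drive into four partitions, the end state would be six data directories listed in hdfs-site.xml, something like the sketch below (the mount points are made-up examples; substitute your actual paths, and restart the datanode after changing this):

```xml
<!-- Hypothetical layout: two 500GB drives plus four ~500GB partitions
     carved from the 2TB drive, each mounted at its own path -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/disk1/hdfs,/data/disk2/hdfs,/data/disk3/hdfs,/data/disk4/hdfs,/data/disk5/hdfs,/data/disk6/hdfs</value>
</property>
```

With six same-sized directories, the datanode's round-robin placement would fill them at roughly the same rate, at the cost of four write threads contending for one physical spindle.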
>
> On Aug 6, 2014, at 1:51 PM, Felix Chern <id...@gmail.com> wrote:
>
>> Run the “hadoop balancer” command on the namenode. It’s used for balancing skewed data.
>> http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#balancer
>>
>>
>> On Aug 6, 2014, at 1:45 PM, Brian C. Huffman <bh...@etinternational.com> wrote:
>>
>>> All,
>>>
>>> We currently have a Hadoop 2.2.0 cluster with the following characteristics:
>>> - 4 nodes
>>> - Each node is a datanode
>>> - Each node has 3 physical disks for data: 2 x 500GB and 1 x 2TB disk.
>>> - HDFS replication factor of 3
>>>
>>> It appears that our 500GB disks are filling up first (rather than the 2TB disk on each node receiving four times the number of blocks).  I'm concerned that once the 500GB disks fill, our performance will slow down (fewer spindles being read / written at the same time per node).  Is this correct?  Is there anything we can do to change this behavior?
>>>
>>> Thanks,
>>> Brian
>>>
>>>



Re: Datanode disk considerations

Posted by Azuryy Yu <az...@gmail.com>.
I think Brian gave the answer.

On Tue, Oct 7, 2014 at 9:13 PM, Brian C. Huffman <
bhuffman@etinternational.com> wrote:

> What about setting dfs.datanode.fsdataset.volume.choosing.policy to
> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
>
> Would that help?
>
> Regards,
> Brian
>
>
> On 08/06/2014 05:23 PM, Adam Faris wrote:
>
>> The Hadoop balancer doesn’t balance data across the local drives; it
>> balances data between datanodes on the grid, so running the balancer won’t
>> rebalance the disks within a single datanode.
>>
>> The datanode process round-robins between data directories on local disk,
>> so it’s not unexpected to see the smaller drive fill faster.  Typically
>> people run the same size drives within each compute node to prevent this
>> from happening.
>>
>> You could partition the 2TB drive into four 500GB partitions.  This isn’t
>> optimal as you’ll have 4 write threads pointing at a single disk but is
>> fairly simple to implement.  Otherwise you’ll want to physically rebuild
>> your 4 nodes so each node has equal amounts of storage.
>>
>> I’d also like to suggest, while restructuring your local filesystem, that
>> the tasktracker/nodemanager be given its own partition for writes.  If the
>> tasktracker/nodemanager and datanode processes share a partition, mapper
>> spills to disk will cause the reported HDFS space to shrink and grow, since
>> the datanode reports back how much free space it has on its partitions.
>>
>> Good luck.
>>
>> On Aug 6, 2014, at 1:51 PM, Felix Chern <id...@gmail.com> wrote:
>>
>>> Run the “hadoop balancer” command on the namenode. It’s used for
>>> balancing skewed data.
>>> http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#balancer
>>>
>>>
>>> On Aug 6, 2014, at 1:45 PM, Brian C. Huffman <
>>> bhuffman@etinternational.com> wrote:
>>>
>>>> All,
>>>>
>>>> We currently have a Hadoop 2.2.0 cluster with the following characteristics:
>>>> - 4 nodes
>>>> - Each node is a datanode
>>>> - Each node has 3 physical disks for data: 2 x 500GB and 1 x 2TB disk.
>>>> - HDFS replication factor of 3
>>>>
>>>> It appears that our 500GB disks are filling up first (rather than the 2TB
>>>> disk on each node receiving four times the number of blocks).  I'm
>>>> concerned that once the 500GB disks fill, our performance will slow down
>>>> (fewer spindles being read / written at the same time per node).  Is this
>>>> correct?  Is there anything we can do to change this behavior?
>>>>
>>>> Thanks,
>>>> Brian
>>>>
>>>>
>>>>
>
>
