Posted to common-user@hadoop.apache.org by Chen Song <ch...@gmail.com> on 2015/02/11 16:44:00 UTC

hadoop cluster with non-uniform disk spec

We have a Hadoop cluster of 500 nodes, but the nodes are not uniform in
terms of disk space. Half of the racks are newer, with 11 volumes of 1.1 TB
on each node, while the other half have 5 volumes of 900 GB on each node.

dfs.datanode.fsdataset.volume.choosing.policy is set to
org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
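
For completeness, the related tuning settings in our hdfs-site.xml are:

dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f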

We wind up in a state where half of the nodes are full while the other half
are underutilized. I am wondering if there is a known solution to this problem.

Thank you for any suggestions.

-- 
Chen Song

Re: commodity hardware

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Typically the term means standard hardware that should be present by default in an enterprise, without extras like RAID, high-speed NICs, dual power supplies, and so on.
But that is changing more and more as new independent frameworks and tools such as Spark, Kafka, and Storm enter the market.

Conclusion: for me, the term "commodity" hardware says very little anymore. Today you need to consider different types of hardware for different use cases, depending on the goals you want to achieve.

BR,
 Alexander 



> On 12 Feb 2015, at 17:45, Adaryl Wakefield <ad...@hotmail.com> wrote:
> 
> Does anybody have a good definition of commodity hardware? I'm having a hard time explaining it to people. I have no idea when a piece of HW is "commodity" or whatever the opposite of commodity is.
>  
> B.


Re: commodity hardware

Posted by William Temperley <wi...@gmail.com>.
I'd say hardware is commodity when it's purchased to maximize the
performance-to-price ratio, as opposed to just going for optimum
performance, which will always cost a boat-load.

E.g. a 15,000 RPM SAS drive is not commodity, but a 7,200 RPM SATA drive is.

On 12 February 2015 at 17:45, Adaryl Wakefield
<ad...@hotmail.com> wrote:
> Does anybody have a good definition of commodity hardware? I'm having a hard
> time explaining it to people. I have no idea when a piece of HW is
> "commodity" or whatever the opposite of commodity is.
>
> B.


commodity hardware

Posted by Adaryl Wakefield <ad...@hotmail.com>.
Does anybody have a good definition of commodity hardware? I'm having a hard time explaining it to people. I have no idea when a piece of HW is "commodity" or whatever the opposite of commodity is.
 
B.


Re: commodity hardware

Posted by Mathew Thomas <ma...@longdivision.com>.
My idea of commodity:

Quad Core i5 or higher
16 GB RAM
SSD drives (non-RAID, JBOD; size may vary)
Fast network


On Thu, Feb 12, 2015 at 11:50 AM, <th...@bentzn.com> wrote:

> If you can buy it in a shop from a shelf somewhere it's 'commodity' :)
>
>
> /th
>
>
> ------------------------------
> -----Original Message-----
> From: "Adaryl Wakefield" <ad...@hotmail.com>
> To: user@hadoop.apache.org
> Date: 12-02-2015 16:45
> Subject: commodity hardware
>
> Does anybody have a good definition of commodity hardware? I'm having a
> hard time explaining it to people. I have no idea when a piece of HW is
> "commodity" or whatever the opposite of commodity is.
>
> B.
>
>

Re: commodity hardware

Posted by Gaurav Sharma <ga...@gmail.com>.
Indeed, another example would be the Dell R620.


On Feb 12, 2015, at 08:51, Don Hilborn <dh...@hortonworks.com> wrote:

Super Micro is a good example of commodity hardware: http://www.supermicro.com/index_home.cfm

From: th@bentzn.com <th...@bentzn.com>
Sent: Thursday, February 12, 2015 10:50 AM
To: user@hadoop.apache.org
Subject: Re: commodity hardware
 
If you can buy it in a shop from a shelf somewhere it's 'commodity' :)


/th


-----Original Message-----
From: "Adaryl Wakefield" <ad...@hotmail.com>
To: user@hadoop.apache.org
Date: 12-02-2015 16:45
Subject: commodity hardware

Does anybody have a good definition of commodity hardware? I'm having a hard time explaining it to people. I have no idea when a piece of HW is "commodity" or whatever the opposite of commodity is.
 
B.

Re: installation with Ambari

Posted by Yusaku Sako <yu...@hortonworks.com>.
When setting up /etc/hosts, you can use whatever domain name you would like (just pick one arbitrarily).
For example, host01.hadoop, host02.hadoop, etc., where "hadoop" is the chosen domain name.
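
For instance, a minimal /etc/hosts sketch along those lines (the IP addresses are placeholders; substitute your machines' actual addresses):

192.168.0.101 host01.hadoop host01
192.168.0.102 host02.hadoop host02
192.168.0.103 host03.hadoop host03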

Yusaku

From: MBA <ad...@hotmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, February 12, 2015 10:30 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: installation with Ambari

This is turning into less about Ambari and more about general computing. I'm trying to set up Hadoop on a home network. Not work, not on EC2; just a simple three-node cluster in my personal computer lab. My machines don't belong to a domain. Everything I read says that in this situation, the computer name is the FQDN. Do I need to make a domain so my cluster will work properly?

B.

From: Yusaku Sako
Sent: Thursday, February 12, 2015 11:38 PM
To: user@hadoop.apache.org
Subject: Re: installation with Ambari

Hi Adaryl,

Ambari expects FQDNs to be set on the hosts.
On your hosts, you want to make sure that "hostname -f" returns the FQDN (with the domain name, like c6401.ambari.apache.org).

Your /etc/hosts should look something like below (note that for each host, there's the FQDN followed by the short name):
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.64.101 c6401.ambari.apache.org c6401
192.168.64.102 c6402.ambari.apache.org c6402
192.168.64.103 c6403.ambari.apache.org c6403

For your future reference, you may want to ask Ambari-specific questions via user@ambari.apache.org.

I hope this helps!
Yusaku


From: MBA <ad...@hotmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, February 12, 2015 9:24 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: installation with Ambari

I'm trying to set up a Hadoop cluster but Ambari is giving me issues. At the screen where it asks me to confirm hosts, I get:
1. A warning that I'm not inputting a fully qualified domain name.
2. The host that the Ambari instance is actually sitting on is not even registering.

When I run hostname --fqdn I get just the name of my machine. I put that into the screen asking for FQDNs and I just get a warning that it's not an FQDN. What exactly is Ambari expecting?
B.

Re: installation with Ambari

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
This is turning into less about Ambari and more about general computing. I'm trying to set up Hadoop on a home network. Not work, not on EC2; just a simple three-node cluster in my personal computer lab. My machines don't belong to a domain. Everything I read says that in this situation, the computer name is the FQDN. Do I need to make a domain so my cluster will work properly?

B.

From: Yusaku Sako 
Sent: Thursday, February 12, 2015 11:38 PM
To: user@hadoop.apache.org 
Subject: Re: installation with Ambari

Hi Adaryl,

Ambari expects FQDNs to be set on the hosts.  
On your hosts, you want to make sure that "hostname -f" returns the FQDN (with the domain name, like c6401.ambari.apache.org).

Your /etc/hosts should look something like below (note that for each host, there's the FQDN followed by the short name):
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.64.101 c6401.ambari.apache.org c6401
192.168.64.102 c6402.ambari.apache.org c6402
192.168.64.103 c6403.ambari.apache.org c6403

For your future reference, you may want to ask Ambari-specific questions via user@ambari.apache.org.

I hope this helps!
Yusaku


From: MBA <ad...@hotmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, February 12, 2015 9:24 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: installation with Ambari


I'm trying to set up a Hadoop cluster but Ambari is giving me issues. At the screen where it asks me to confirm hosts, I get:
1. A warning that I'm not inputting a fully qualified domain name.
2. The host that the Ambari instance is actually sitting on is not even registering.

When I run hostname --fqdn I get just the name of my machine. I put that into the screen asking for FQDNs and I just get a warning that it's not an FQDN. What exactly is Ambari expecting?
B.

Re: installation with Ambari

Posted by Yusaku Sako <yu...@hortonworks.com>.
Hi Adaryl,

Ambari expects FQDNs to be set on the hosts.
On your hosts, you want to make sure that "hostname -f" returns the FQDN (with the domain name, like c6401.ambari.apache.org).

Your /etc/hosts should look something like below (note that for each host, there's the FQDN followed by the short name):
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.64.101 c6401.ambari.apache.org c6401
192.168.64.102 c6402.ambari.apache.org c6402
192.168.64.103 c6403.ambari.apache.org c6403
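
If "hostname -f" still returns just the short name once /etc/hosts is in place, the sketch below shows one way to set the host name persistently. It assumes a systemd-based host and reuses the FQDN from the example above; on older distributions such as CentOS 6 you would set HOSTNAME= in /etc/sysconfig/network instead.

# Persist the fully qualified host name (FQDN reused from the example above)
sudo hostnamectl set-hostname c6401.ambari.apache.org
# Verify: this should now print the FQDN
hostname -f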

For your future reference, you may want to ask Ambari-specific questions via user@ambari.apache.org.

I hope this helps!
Yusaku


From: MBA <ad...@hotmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, February 12, 2015 9:24 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: installation with Ambari

I'm trying to set up a Hadoop cluster but Ambari is giving me issues. At the screen where it asks me to confirm hosts, I get:
1. A warning that I'm not inputting a fully qualified domain name.
2. The host that the Ambari instance is actually sitting on is not even registering.

When I run hostname --fqdn I get just the name of my machine. I put that into the screen asking for FQDNs and I just get a warning that it's not an FQDN. What exactly is Ambari expecting?
B.

Re: installation with Ambari

Posted by Ted Yu <yu...@gmail.com>.
Looks like you may get a good answer from the Ambari mailing list.

http://ambari.apache.org/mail-lists.html

On Thu, Feb 12, 2015 at 9:24 PM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   I'm trying to set up a Hadoop cluster but Ambari is giving me issues.
> At the screen where it asks me to confirm hosts, I get:
> 1. A warning that I'm not inputting a fully qualified domain name.
> 2. The host that the Ambari instance is actually sitting on is not even
> registering.
>
> When I run hostname --fqdn I get just the name of my machine. I put that into
> the screen asking for FQDNs and I just get a warning that it's not an FQDN.
> What exactly is Ambari expecting?
> B.
>

installation with Ambari

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
I'm trying to set up a Hadoop cluster but Ambari is giving me issues. At the screen where it asks me to confirm hosts, I get:
1. A warning that I'm not inputting a fully qualified domain name.
2. The host that the Ambari instance is actually sitting on is not even registering.

When I run hostname --fqdn I get just the name of my machine. I put that into the screen asking for FQDNs and I just get a warning that it's not an FQDN. What exactly is Ambari expecting?
B.

Re: commodity hardware

Posted by Don Hilborn <dh...@hortonworks.com>.
Super Micro is a good example of commodity hardware: http://www.supermicro.com/index_home.cfm


________________________________
From: th@bentzn.com <th...@bentzn.com>
Sent: Thursday, February 12, 2015 10:50 AM
To: user@hadoop.apache.org
Subject: Re: commodity hardware

If you can buy it in a shop from a shelf somewhere it's 'commodity' :)


/th


________________________________
-----Original Message-----
From: "Adaryl Wakefield" <ad...@hotmail.com>
To: user@hadoop.apache.org
Date: 12-02-2015 16:45
Subject: commodity hardware

Does anybody have a good definition of commodity hardware? I'm having a hard time explaining it to people. I have no idea when a piece of HW is "commodity" or whatever the opposite of commodity is.

B.

Re: commodity hardware

Posted by th...@bentzn.com.
If you can buy it in a shop from a shelf somewhere it's 'commodity' :)


/th



-----Original Message-----
From: "Adaryl Wakefield" <ad...@hotmail.com>
To: user@hadoop.apache.org
Date: 12-02-2015 16:45
Subject: commodity hardware

Does anybody have a good definition of commodity hardware? I'm having a hard time explaining it to people. I have no idea when a piece of HW is "commodity" or whatever the opposite of commodity is.
 
B.

Re: hadoop cluster with non-uniform disk spec

Posted by Chen Song <ch...@gmail.com>.
@Leo Leung
Yes, dfs.datanode.data.dir is set correctly.

@Brahma Reddy Battula

Initially all the nodes we had were 5-disk nodes. Then we added a few racks
of 11-disk nodes. We are using CDH distribution and we set these settings
when we upgraded from CDH4 to CDH5.

To make it clearer: at this moment, all nodes (regardless of whether they
have 5 or 11 disks) hold roughly the same number of blocks, and thus the same
amount of data. Blocks seem to be distributed evenly across nodes regardless
of disk count. Is this expected behavior?

The concern is that as more data comes in, the 5-disk nodes are reaching
their configured capacity while the 11-disk nodes stay well below theirs,
because the latter collectively have more space per node.

I don't know whether this is expected, or whether my concern is valid.
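
As for the balancer question further down in this thread, a minimal sketch of an invocation (the threshold percentage is illustrative):

# Iterate until every datanode is within 10% of the cluster-average utilization
hdfs balancer -threshold 10

Note that both the balancer and the available-space volume policy work on percentage utilization (and the volume policy only chooses among the disks within a single datanode), so heterogeneous nodes should converge toward similar utilization percentages rather than equal block counts.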

Chen


On Thu, Feb 12, 2015 at 6:49 AM, Brahma Reddy Battula <
brahmareddy.battula@huawei.com> wrote:

>  Hello daemeon reiydelle
>
>
> Is the policy set to
> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
>
> >>Yes, you need to set this policy, which will balance among the disks on each node
>
> @Chen Song
>
> The following settings control what percentage of new block allocations will
> be sent to volumes with more available disk space than others:
>
>  dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
>  dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f
>
>
> Did you set these while starting up the cluster?
>
>
>  Thanks & Regards
>
>  Brahma Reddy Battula
>
>
>
>
>   ------------------------------
> From: daemeon reiydelle [daemeonr@gmail.com]
> Sent: Thursday, February 12, 2015 12:02 PM
> To: user@hadoop.apache.org
> Cc: Ravi Prakash
> Subject: Re: hadoop cluster with non-uniform disk spec
>
>    What have you set dfs.datanode.fsdataset.volume.choosing.policy to
> (assuming you are on a current version of Hadoop)? Is the policy set to
> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
>
>
>
>
> “Life should not be a journey to the grave with the intention of arriving
> safely in a pretty and well preserved body, but rather to skid in broadside
> in a cloud of smoke, thoroughly used up, totally worn out, and loudly
> proclaiming “Wow! What a Ride!” - Hunter Thompson
>
> Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198
> London (+44) (0) 20 8144 9872
>
> On Wed, Feb 11, 2015 at 2:23 PM, Chen Song <ch...@gmail.com> wrote:
>
>> Hey Ravi
>>
>>  Here are my settings:
>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
>>  dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f
>>
>>  Chen
>>
>>
>> On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash <ra...@ymail.com> wrote:
>>
>>>  Hi Chen!
>>>
>>>  Are you running the balancer? What are you setting
>>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold
>>> and
>>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
>>> to?
>>>
>>>
>>>
>>>
>>>   On Wednesday, February 11, 2015 7:44 AM, Chen Song <
>>> chen.song.82@gmail.com> wrote:
>>>
>>>
>>>  We have a Hadoop cluster of 500 nodes, but the nodes are not uniform in
>>> terms of disk space. Half of the racks are newer, with 11 volumes of 1.1 TB
>>> on each node, while the other half have 5 volumes of 900 GB on each node.
>>>
>>>  dfs.datanode.fsdataset.volume.choosing.policy is set to
>>> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
>>>
>>>  We wind up in a state where half of the nodes are full while the other
>>> half are underutilized. I am wondering if there is a known solution to this
>>> problem.
>>>
>>>  Thank you for any suggestions.
>>>
>>>  --
>>> Chen Song
>>>
>>>
>>>
>>>
>>
>>
>>   --
>> Chen Song
>>
>>
>


-- 
Chen Song

Re: hadoop cluster with non-uniform disk spec

Posted by Chen Song <ch...@gmail.com>.
*@Leo Leung*
Yes, dfs.datanode.data.dir is set correctly.

@Brahma Reddy Battula

Initially all the nodes we had were 5-disk nodes. Then we added a few racks
of 11-disk nodes. We are using CDH distribution and we set these settings
when we upgraded from CDH4 to CDH5.

To make it more clear, at this moment, all nodes (regardless of 5 disks or
11 disks) have roughly the same number of blocks, thus the same amount of
data stored. It seems data blocks are evenly distributed to the nodes
regardless of whether it is a 5-disk or 11-disk node. Is this expected
behavior?

The concern is that as more data coming in, the 5-disk nodes are reaching
to its configured capacity, while 11-disk nodes why below its capacity,
because the latter have more space collectively on each node.

I don't know if it is expected or my concern is valid?

Chen


On Thu, Feb 12, 2015 at 6:49 AM, Brahma Reddy Battula <
brahmareddy.battula@huawei.com> wrote:

>  Hello daemeon reiydelle
>
>
> Is the policy set to
> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
>
> >>Yes, you need to set this policy which will balance among the disks
>
> *@Chen Song*
>
> following settings controls what percentage of new block allocations will
> be sent to volumes with more available disk space than others
>
>  dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480
> (20G)
>  dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
> = 0.85f
>
>
> Did you set while startup the cluster..?
>
>
>  Thanks & Regards
>
>  Brahma Reddy Battula
>
>
>
>
>   ------------------------------
> *From:* daemeon reiydelle [daemeonr@gmail.com]
> *Sent:* Thursday, February 12, 2015 12:02 PM
> *To:* user@hadoop.apache.org
> *Cc:* Ravi Prakash
> *Subject:* Re: hadoop cluster with non-uniform disk spec
>
>    What have you set dfs.datanode.fsdataset.volume.choosing.policy to
> (assuming you are on a current version of Hadoop)? Is the policy set to
> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
>
>
>
>
> * ....... *
>
>
>
>
>
>
> *“Life should not be a journey to the grave with the intention of arriving
> safely in a pretty and well preserved body, but rather to skid in broadside
> in a cloud of smoke, thoroughly used up, totally worn out, and loudly
> proclaiming “Wow! What a Ride!” - Hunter Thompson Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198 <%28%2B1%29%20415.501.0198> London (+44) (0) 20 8144
> 9872 <%28%2B44%29%20%280%29%2020%208144%209872>*
>
> On Wed, Feb 11, 2015 at 2:23 PM, Chen Song <ch...@gmail.com> wrote:
>
>> Hey Ravi
>>
>>  Here are my settings:
>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480
>> (20G)
>>  dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
>> = 0.85f
>>
>>  Chen
>>
>>
>> On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash <ra...@ymail.com> wrote:
>>
>>>  Hi Chen!
>>>
>>>  Are you running the balancer? What are you setting dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold
>>>
>>>
>>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
>>> to?
>>>
>>>
>>>
>>>
>>>   On Wednesday, February 11, 2015 7:44 AM, Chen Song <
>>> chen.song.82@gmail.com> wrote:
>>>
>>>
>>>  We have a hadoop cluster consisting of 500 nodes. But the nodes are
>>> not uniform in term of disk spaces. Half of the racks are newer with 11
>>> volumes of 1.1T on each node, while the other half have 5 volume of 900GB
>>> on each node.
>>>
>>>  dfs.datanode.fsdataset.volume.choosing.policy is set to
>>> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
>>>
>>>  It winds up with the state of half of nodes are full while the other
>>> half underutilized. I am wondering if there is a known solution for this
>>> problem.
>>>
>>>  Thank you for any suggestions.
>>>
>>>  --
>>> Chen Song
>>>
>>>
>>>
>>>
>>
>>
>>   --
>> Chen Song
>>
>>
>


-- 
Chen Song
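
A note for readers hitting the same symptom: what Chen describes is, as far as
I can tell, expected. AvailableSpaceVolumeChoosingPolicy only balances writes
among the disks inside a single datanode; the NameNode's default block
placement picks target nodes without weighting them by total capacity, so
5-disk and 11-disk nodes accumulate roughly the same number of blocks. The
per-node skew shows up in the dfsadmin report. A minimal sketch (assuming the
hdfs CLI is on the PATH and you run it with HDFS superuser credentials; the
grep pattern just trims the report down):

    # Print per-datanode capacity and utilization; on a skewed cluster the
    # 5-disk nodes show a much higher "DFS Used%" than the 11-disk nodes.
    hdfs dfsadmin -report | grep -E 'Name:|Configured Capacity:|DFS Used%:'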

RE: hadoop cluster with non-uniform disk spec

Posted by Brahma Reddy Battula <br...@huawei.com>.
Hello daemeon reiydelle


Is the policy set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?

>>Yes, you need to set this policy, which will balance new block writes among the disks

@Chen Song

The following settings control what percentage of new block allocations will be sent to volumes with more available disk space than others:

dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f


Did you set these when you started up the cluster?



Thanks & Regards

 Brahma Reddy Battula




________________________________
From: daemeon reiydelle [daemeonr@gmail.com]
Sent: Thursday, February 12, 2015 12:02 PM
To: user@hadoop.apache.org
Cc: Ravi Prakash
Subject: Re: hadoop cluster with non-uniform disk spec

What have you set dfs.datanode.fsdataset.volume.choosing.policy to (assuming you are on a current version of Hadoop)? Is the policy set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?



.......
“Life should not be a journey to the grave with the intention of arriving safely in a
pretty and well preserved body, but rather to skid in broadside in a cloud of smoke,
thoroughly used up, totally worn out, and loudly proclaiming “Wow! What a Ride!”
- Hunter Thompson

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Wed, Feb 11, 2015 at 2:23 PM, Chen Song <ch...@gmail.com> wrote:
Hey Ravi

Here are my settings:
dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f

Chen


On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash <ra...@ymail.com> wrote:
Hi Chen!

Are you running the balancer? What are you setting
dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold and
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction to?




On Wednesday, February 11, 2015 7:44 AM, Chen Song <ch...@gmail.com> wrote:


We have a hadoop cluster consisting of 500 nodes. But the nodes are not uniform in term of disk spaces. Half of the racks are newer with 11 volumes of 1.1T on each node, while the other half have 5 volume of 900GB on each node.

dfs.datanode.fsdataset.volume.choosing.policy is set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.

It winds up with the state of half of nodes are full while the other half underutilized. I am wondering if there is a known solution for this problem.

Thank you for any suggestions.

--
Chen Song






--
Chen Song
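
For completeness, all three properties go into hdfs-site.xml on each datanode,
and to the best of my knowledge they need a datanode restart to take effect. A
minimal sketch of the relevant block, using the values quoted in this thread
(illustrative, not recommendations):

    <property>
      <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
      <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
    </property>
    <property>
      <!-- volumes whose free space differs by less than this many bytes are
           considered balanced; 21474836480 bytes = 20 GB -->
      <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
      <value>21474836480</value>
    </property>
    <property>
      <!-- share of new block allocations sent to the volumes with more free space -->
      <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
      <value>0.85f</value>
    </property>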



Re: hadoop cluster with non-uniform disk spec

Posted by daemeon reiydelle <da...@gmail.com>.
What have you set dfs.datanode.fsdataset.volume.choosing.policy to
(assuming you are on a current version of Hadoop)? Is the policy set to
org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?




.......

“Life should not be a journey to the grave with the intention of arriving
safely in a pretty and well preserved body, but rather to skid in broadside
in a cloud of smoke, thoroughly used up, totally worn out, and loudly
proclaiming “Wow! What a Ride!” - Hunter Thompson

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Wed, Feb 11, 2015 at 2:23 PM, Chen Song <ch...@gmail.com> wrote:

> Hey Ravi
>
> Here are my settings:
> dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480
> (20G)
> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
> = 0.85f
>
> Chen
>
>
> On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash <ra...@ymail.com> wrote:
>
>> Hi Chen!
>>
>> Are you running the balancer? What are you setting dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold
>>
>>
>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
>> to?
>>
>>
>>
>>
>>   On Wednesday, February 11, 2015 7:44 AM, Chen Song <
>> chen.song.82@gmail.com> wrote:
>>
>>
>> We have a hadoop cluster consisting of 500 nodes. But the nodes are not
>> uniform in term of disk spaces. Half of the racks are newer with 11 volumes
>> of 1.1T on each node, while the other half have 5 volume of 900GB on each
>> node.
>>
>> dfs.datanode.fsdataset.volume.choosing.policy is set to
>> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
>>
>> It winds up with the state of half of nodes are full while the other half
>> underutilized. I am wondering if there is a known solution for this problem.
>>
>> Thank you for any suggestions.
>>
>> --
>> Chen Song
>>
>>
>>
>>
>
>
> --
> Chen Song
>
>
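
To double-check what a node actually has configured, you can query the key
directly on that host; a sketch, assuming the local hdfs-site.xml is the one
the datanode daemon loaded:

    # Prints the configured policy class; datanodes fall back to the
    # round-robin volume-choosing policy when this key is unset.
    hdfs getconf -confKey dfs.datanode.fsdataset.volume.choosing.policy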

Re: hadoop cluster with non-uniform disk spec

Posted by Chen Song <ch...@gmail.com>.
Hey Ravi

Here are my settings:
dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f

Chen


On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash <ra...@ymail.com> wrote:

> Hi Chen!
>
> Are you running the balancer? What are you setting dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold
>
>
> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
> to?
>
>
>
>
>   On Wednesday, February 11, 2015 7:44 AM, Chen Song <
> chen.song.82@gmail.com> wrote:
>
>
> We have a hadoop cluster consisting of 500 nodes. But the nodes are not
> uniform in term of disk spaces. Half of the racks are newer with 11 volumes
> of 1.1T on each node, while the other half have 5 volume of 900GB on each
> node.
>
> dfs.datanode.fsdataset.volume.choosing.policy is set to
> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
>
> It winds up with the state of half of nodes are full while the other half
> underutilized. I am wondering if there is a known solution for this problem.
>
> Thank you for any suggestions.
>
> --
> Chen Song
>
>
>
>


-- 
Chen Song
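
Reading those two values against the hdfs-default.xml descriptions (so treat
the arithmetic as illustrative): volumes whose available space differs by less
than 21474836480 bytes (20 GB) are treated as balanced and written round-robin;
once the spread exceeds that, roughly 85% of new block allocations (0.85f) go
to the volumes with more free space and the other 15% to the fuller ones. For
example, a node writing 1,000 new blocks while one disk sits well over 20 GB
emptier than the rest would land roughly 850 of them on that disk. Both
settings act entirely within one datanode, so they cannot fix the imbalance
between the 5-disk and 11-disk machines.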

Re: hadoop cluster with non-uniform disk spec

Posted by Ravi Prakash <ra...@ymail.com>.
Hi Chen!
Are you running the balancer? What are you setting
dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold and
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction to?

 

     On Wednesday, February 11, 2015 7:44 AM, Chen Song <ch...@gmail.com> wrote:
   

 We have a hadoop cluster consisting of 500 nodes. But the nodes are not uniform in term of disk spaces. Half of the racks are newer with 11 volumes of 1.1T on each node, while the other half have 5 volume of 900GB on each node.
dfs.datanode.fsdataset.volume.choosing.policy is set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.

It winds up with the state of half of nodes are full while the other half underutilized. I am wondering if there is a known solution for this problem.
Thank you for any suggestions.

-- 
Chen Song



    

Re: hadoop cluster with non-uniform disk spec

Posted by Manoj Venkatesh <ma...@xoom.com>.
I had a similar question recently.
Please check out the balancer (http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer); it will balance the data across the nodes.

- Manoj

From: Chen Song <ch...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Wednesday, February 11, 2015 at 7:44 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: hadoop cluster with non-uniform disk spec

We have a hadoop cluster consisting of 500 nodes. But the nodes are not uniform in term of disk spaces. Half of the racks are newer with 11 volumes of 1.1T on each node, while the other half have 5 volume of 900GB on each node.

dfs.datanode.fsdataset.volume.choosing.policy is set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.

It winds up with the state of half of nodes are full while the other half underutilized. I am wondering if there is a known solution for this problem.

Thank you for any suggestions.

--
Chen Song
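
Worth spelling out on the Balancer pointer above: the balancer equalizes
utilization percentages across datanodes rather than absolute bytes, which is
exactly what a mixed 5-disk/11-disk fleet needs. A sketch of a foreground run
(the threshold and bandwidth values are illustrative, not recommendations):

    # Let each datanode spend up to ~50 MB/s on balancing traffic.
    hdfs dfsadmin -setBalancerBandwidth 52428800
    # Iterate until every datanode's utilization is within 10 percentage
    # points of the cluster-wide average.
    hdfs balancer -threshold 10

There is also a start-balancer.sh/stop-balancer.sh pair under sbin/ for
running it in the background.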

