Posted to mapreduce-user@hadoop.apache.org by Amit Anand <aa...@aquratesolutions.com> on 2013/07/06 16:18:01 UTC

hadoop datanode capacity issue

Hi All,

I have configured a three-node cluster.

For each data node, the configured capacity shows as double the size of the
actual storage. Below are screenshots of the configuration files, the
"dfsadmin -report" output, and "df -h" from each node. Any idea why it would
show the configured capacity as double the actual storage?

After looking at the configuration files, I assume each directory listed
under "dfs.data.dir" is being treated as a separate storage device, which
doubles the configured capacity. Am I correct? Is this a bug, or is something
wrong with my configuration?

 

CORE-SITE.XML, HDFS-SITE.XML, MAPRED-SITE.XML (From all nodes)

<image003.png>

R1NN1 (NAMENODE, DATANODE, JOBTRACKER)

<image004.png>

R1SN1 (SECONDARY NAMENODE, DATANODE, TASKTRACKER)

<image005.png>

R1DN1 (DATANODE, TASKTRACKER)

<image006.png>

DFSADMIN -REPORT

<image002.png>

Thank You,

Amit Anand



Re: hadoop datanode capacity issue

Posted by Ian Wrigley <ia...@cloudera.com>.
You're correct: each directory is assumed to be a different storage device. There's really no reason to specify two directories on the same physical disk in dfs.data.dir -- just use one directory.

Ian.
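
To illustrate the double counting, here is a minimal Python sketch. It is not Hadoop's actual code; the function names and the directory layout are hypothetical. It assumes capacity is computed by asking the filesystem for its total size once per configured directory, which matches the behaviour described above, and compares that with counting each underlying device only once.

```python
import os
import shutil
import tempfile


def naive_capacity(dirs):
    """Add up the filesystem size once per configured directory.

    Two directories on the same disk are counted twice, which is the
    effect described in this thread.
    """
    return sum(shutil.disk_usage(d).total for d in dirs)


def per_device_capacity(dirs):
    """Add up the filesystem size once per underlying device.

    Directories are grouped by the device ID reported by os.stat, so a
    disk that hosts several directories is counted only once.
    """
    totals = {}
    for d in dirs:
        totals[os.stat(d).st_dev] = shutil.disk_usage(d).total
    return sum(totals.values())


if __name__ == "__main__":
    # Hypothetical stand-ins for two dfs.data.dir entries on one disk.
    with tempfile.TemporaryDirectory() as root:
        data_dirs = [os.path.join(root, "data1"), os.path.join(root, "data2")]
        for d in data_dirs:
            os.mkdir(d)
        print("per directory:", naive_capacity(data_dirs))
        print("per device:   ", per_device_capacity(data_dirs))
```

With both directories on the same filesystem, the per-directory total comes out at twice the per-device total, which is the same pattern as the doubled "Configured Capacity" in the report. Listing a single directory per physical disk avoids the discrepancy.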


On Jul 6, 2013, at 9:18 AM, "Amit Anand" <aa...@aquratesolutions.com> wrote:

> Hi All,
>  
> I have configured a three node cluster:
>  
> For each data node the configured capacity is showing double the size of actual storage. Below is the screen shot of configuration files, “dfsadmin -report” and “df -h” from each node. Any idea why would it show configured capacity as double the size of actual storage?
>  
> After looking at configuration files, I am assuming each directory mentioned under “dfs.data.dir” is being treated as a separate storage device and hence doubling the configured capacity size. Am I correct? Is this a bug or something wrong with my configuration?
>  
> CORE-SITE.XML, HDFS-SITE.XML, MAPRED-SITE.XML (From all nodes)
>  
> <image003.png>
>  
> R1NN1 (NAMENODE, DATANODE, JOBTRACKER)
>  
> <image004.png>
>  
> R1SN1 (SECONDARY NAMENODE, DATANODE, TASKTRACKER)
>  
> <image005.png>
>  
> R1DN1(DATANODE, TASKTRACKER)
>  
> <image006.png>
>  
> DFSADMIN -REPORT
>  
> <image002.png>
>  
> Thank You,
> Amit Anand
>  


---
Ian Wrigley
Sr. Curriculum Manager
Cloudera, Inc
Cell: (323) 819 4075

