Posted to mapreduce-user@hadoop.apache.org by Amit Anand <aa...@aquratesolutions.com> on 2013/07/06 16:18:01 UTC
hadoop datanode capacity issue
Hi All,
I have configured a three-node cluster.
For each data node, the configured capacity shows as double the size of the
actual storage. Below are screenshots of the configuration files, "dfsadmin
-report" and "df -h" from each node. Any idea why it would show the
configured capacity as double the size of the actual storage?
After looking at the configuration files, I assume each directory listed
under "dfs.data.dir" is being treated as a separate storage device, hence
doubling the configured capacity. Am I correct? Is this a bug, or is
something wrong with my configuration?
CORE-SITE.XML, HDFS-SITE.XML, MAPRED-SITE.XML (From all nodes)
R1NN1 (NAMENODE, DATANODE, JOBTRACKER)
R1SN1 (SECONDARY NAMENODE, DATANODE, TASKTRACKER)
R1DN1 (DATANODE, TASKTRACKER)
DFSADMIN -REPORT
Thank You,
Amit Anand
(Mob) 484.682.3065 , 215-995-1058
(Fax) 215.359.9674
(Desk). 215-774-9959
aanand@aquratesolutions.com
Re: hadoop datanode capacity issue
Posted by Ian Wrigley <ia...@cloudera.com>.
You're correct: each directory is assumed to be a different storage device. There's really no reason to specify two directories on the same physical disk in dfs.data.dir -- just use one directory.
Ian.
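The double counting described above can be illustrated with a small sketch. This is not Hadoop's code; it only mimics the idea that each configured directory is measured as if it were its own disk. The function names (naive_capacity, capacity_by_device, demo) are hypothetical, and the temporary directories stand in for two dfs.data.dir entries on the same partition.

```python
import os
import shutil
import tempfile
from typing import Iterable


def naive_capacity(dirs: Iterable[str]) -> int:
    """Sum the filesystem size behind every directory, one entry per directory.

    Two directories on the same partition are each reported at the full
    partition size, so the partition is counted twice.
    """
    return sum(shutil.disk_usage(d).total for d in dirs)


def capacity_by_device(dirs: Iterable[str]) -> int:
    """Sum filesystem sizes, counting each underlying device only once."""
    totals = {}
    for d in dirs:
        # st_dev identifies the device that holds the directory.
        totals.setdefault(os.stat(d).st_dev, shutil.disk_usage(d).total)
    return sum(totals.values())


def demo() -> tuple:
    """Measure two sibling directories that share one filesystem."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical stand-ins for two data directories on one partition.
        dirs = [os.path.join(root, "data1"), os.path.join(root, "data2")]
        for d in dirs:
            os.mkdir(d)
        return naive_capacity(dirs), capacity_by_device(dirs)


if __name__ == "__main__":
    naive, deduplicated = demo()
    print("per-directory sum:", naive)
    print("per-device sum:   ", deduplicated)
```

With both directories on one partition, the per-directory sum is twice the per-device sum, which matches the doubled "Configured Capacity" in the report. Listing a single directory per physical disk avoids the mismatch.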
---
Ian Wrigley
Sr. Curriculum Manager
Cloudera, Inc
Cell: (323) 819 4075