Posted to hdfs-user@hadoop.apache.org by Hamed Ghavamnia <gh...@gmail.com> on 2012/01/01 07:03:48 UTC

HDFS Datanode Capacity

Hi,
I've been searching for how to configure the maximum capacity of a datanode.
I've added big volumes to one of my datanodes, but the configured capacity
doesn't get bigger than the default 5GB. If I want a datanode with 100GB of
capacity, I have to add 20 directories, each having 5GB, so the maximum
capacity reaches 100GB. Is there anywhere this can be set? Can different
datanodes have different capacities?

Also, it seems like dfs.datanode.du.reserved doesn't work either, because
I've set it to zero, but it still leaves 50% of the free space for non-DFS
usage.
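
For reference, the relevant bits of my hdfs-site.xml on the datanode look
roughly like this (paths are just illustrative of my setup):

    <property>
      <name>dfs.data.dir</name>
      <value>/media/newhard/hdfs/data</value>
    </property>
    <property>
      <name>dfs.datanode.du.reserved</name>
      <value>0</value>
    </property>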

Thanks,
Hamed

P.S. This is my first message in the mailing list, so if I have to follow
any rules for sending emails, I'll be thankful if you let me know. :)

Re: HDFS Datanode Capacity

Posted by Anirudh <te...@gmail.com>.
You may want to look at
http://hadoop.apache.org/hdfs/docs/current/hdfs_quota_admin_guide.html
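
Note that the quotas described there are set on HDFS directories rather than
on datanodes; e.g., something along these lines (path and size illustrative):

    hadoop dfsadmin -setSpaceQuota 100g /user/hamed
    hadoop fs -count -q /user/hamed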

Thanks,
Anirudh

On Sat, Dec 31, 2011 at 10:03 PM, Hamed Ghavamnia <gh...@gmail.com> wrote:

> Hi,
> I've been searching on how to configure the maximum capacity of a
> datanode. I've added big volumes to one of my datanodes, but the configured
> capacity doesn't get bigger than the default 5GB. If I want a datanode with
> 100GB of capacity, I have to add 20 directories, each having 5GB so the
> maximum capacity reaches 100. Is there anywhere this can be set? Can
> different datanodes have different capacities?
>
> Also it seems like the dfs.datanode.du.reserved doesn't work either,
> because I've set it to zero, but it still leaves 50% of the free space for
> non-dfs usage.
>
> Thanks,
> Hamed
>
> P.S. This is my first message in the mailing list, so if I have to follow
> any rules for sending emails, I'll be thankful if you let me know. :)
>

Re: HDFS Datanode Capacity

Posted by Eric <er...@gmail.com>.
nonDFS used simply means the amount of data that is on the disk but does
not belong to HDFS. E.g., if your disk has 20G of files that belong to,
say, MySQL or MongoDB, that will show up as "nonDFS used".

2012/1/1 Hamed Ghavamnia <gh...@gmail.com>

> I found out what was wrong. I had made a really stupid mistake in the
> directory name, and it wasn't pointing to the mount point of the new
> volume, so the capacity wouldn't change.
> But still it's using too much for nonDFS usage, I've set the
> dfs.datanode.du.reserved to 0, 1000, 1000000 bytes but in all cases it
> keeps 5 GBs for nonDFS usage.
> BTW, the web interface for my datanode isn't working, where do I have to
> configure it?
>
>
> On Sun, Jan 1, 2012 at 4:20 PM, Rajiv Chittajallu <ra...@yahoo-inc.com>wrote:
>
>> what does the mbean  Hadoop:service=DataNode,name=DataNodeInfo on the
>> datanode show? You should see something like this
>>
>>
>> http://dn1.hadoop.apache.org:1006/jmx?qry=Hadoop:service=DataNode,name=DataNodeInfo
>>
>> {
>>
>>    "beans": [
>>        {
>>            "name": "Hadoop:service=DataNode,name=DataNodeInfo",
>>            "modelerType":
>> "org.apache.hadoop.hdfs.server.datanode.DataNode",
>>            "HostName": "dn1.hadoop.apache.org",
>>            "Version": "0.20.205.0.1.1110280215",
>>            "RpcPort": "8020",
>>            "HttpPort": null,
>>            "NamenodeAddress": "phanpy-nn1.hadoop.apache.org",
>>            "VolumeInfo" :
>> "{\"/d2/hadoop/var/hdfs/data/current\":{\"freeSpace\":1080583248089,\"usedSpace\":703127873319,\"reservedSpace\":107374182400},\"/d1/hadoop/var/hdfs/data/current\":{\"freeSpace\":1018521960448,\"usedSpace\":709080305664,\"reservedSpace\":107374182400},\"/d4/hadoop/var/hdfs/data/current\":{\"freeSpace\":1062440529920,\"usedSpace\":721270591488,\"reservedSpace\":107374182400},\"/d5/hadoop/var/hdfs/data/current\":{\"freeSpace\":1073051838417,\"usedSpace\":710659282991,\"reservedSpace\":107374182400},\"/d3/hadoop/var/hdfs/data/current\":{\"freeSpace\":1072318734816,\"usedSpace\":711392386592,\"reservedSpace\":107374182400},\"/d0/hadoop/var/hdfs/data/current\":{\"freeSpace\":1018448723968,\"usedSpace\":709153542144,\"reservedSpace\":107374182400}}"
>>        }
>>    ]
>>
>> }
>>
>> here /d[0-5] are different mount points (from different physical drives).
>> This is
>> the preferred way to do. Having different volumes within the same mount
>> point will not help much. If they are, it will be reported incorrectly.
>> You would see 6x more space than what actually is available (its a bug).
>>
>> -rajive
>>
>>
>> Hamed Ghavamnia wrote on 01/01/12 at 03:55:34 -0800:
>> >   I've already added the new volume with dfs.data.dir, and it adds
>> without
>> >   any problem. My problem is that the volume I'm adding has 150 GBs of
>> free
>> >   space, but when I check the namenode:50070 it only adds 5GB to the
>> total
>> >   capacity, of which has reserved 50% for non-dfs usage. I've set the
>> >   dfs.datanode.du.reserved to zero as well, but it doesn't make any
>> >   difference.
>> >   How am I supposed to tell hadoop to use the whole 150 GB for the
>> datanode.
>> >
>> >   On Sun, Jan 1, 2012 at 2:59 PM, Rajiv Chittajallu
>> >   <[1...@yahoo-inc.com> wrote:
>> >
>> >     dfsadmin -setSpaceQuota applies to hdfs filesystem. This doesn't
>> apply
>> >     to datanode volumes.
>> >
>> >     to add a volume, update dfs.data.dir (hdfs-site.xml on datanode) ,
>> and
>> >     restart the datanode.
>> >
>> >     check the datanode log to see if the new volume as activated. You
>> should
>> >     see additional space in
>> >     namenode:50070/dfsnodelist.jsp?whatNodes=LIVE
>> >
>> >     >________________________________
>> >     > From: Hamed Ghavamnia <[2...@gmail.com>
>> >     >To: [3]hdfs-user@hadoop.apache.org; Rajiv Chittajallu
>> >     <[4...@yahoo-inc.com>
>> >     >Sent: Sunday, January 1, 2012 4:06 PM
>> >     >Subject: Re: HDFS Datanode Capacity
>> >     >
>> >     >
>> >     >Thanks for the help.
>> >     >I checked the quotas, it seems they're used for setting the maximum
>> >     size on the files inside the hdfs, and not the datanode itself. For
>> >     example, if I set my dfs.data.dir to /media/newhard (which I've
>> mounted
>> >     my new hard disk to), I can't use dfsadmin -setSpaceQuota n
>> >     /media/newhard to set the size of this directory, I can change the
>> sizes
>> >     of the directories inside hdfs (tmp, user, ...), which don't have
>> any
>> >     effect on the capacity of the datanode.
>> >     >I can set the my new mounted volume as the datanode directory and
>> it
>> >     runs without a problem, but the capacity is the default 5 GB.
>> >     >
>> >     >
>> >     >On Sun, Jan 1, 2012 at 10:41 AM, Rajiv Chittajallu
>> >     <[5...@yahoo-inc.com> wrote:
>> >     >
>> >     >Once you updated the configuration is the datanode, restarted?
>> Check if
>> >     the datanode log indicated that it was able to setup the new volume.
>> >     >>
>> >     >>
>> >     >>
>> >     >>>________________________________
>> >     >>> From: Hamed Ghavamnia <[6...@gmail.com>
>> >     >>>To: [7]hdfs-user@hadoop.apache.org
>> >     >>>Sent: Sunday, January 1, 2012 11:33 AM
>> >     >>>Subject: HDFS Datanode Capacity
>> >     >>
>> >     >>>
>> >     >>>
>> >     >>>Hi,
>> >     >>>I've been searching on how to configure the maximum capacity of a
>> >     datanode. I've added big volumes to one of my datanodes, but the
>> >     configured capacity doesn't get bigger than the default 5GB. If I
>> want a
>> >     datanode with 100GB of capacity, I have to add 20 directories, each
>> >     having 5GB so the maximum capacity reaches 100. Is there anywhere
>> this
>> >     can be set? Can different datanodes have different capacities?
>> >     >>>
>> >     >>>Also it seems like the dfs.datanode.du.reserved doesn't work
>> either,
>> >     because I've set it to zero, but it still leaves 50% of the free
>> space
>> >     for non-dfs usage.
>> >     >>>
>> >     >>>Thanks,
>> >     >>>Hamed
>> >     >>>
>> >     >>>P.S. This is my first message in the mailing list, so if I have
>> to
>> >     follow any rules for sending emails, I'll be thankful if you let me
>> >     know. :)
>> >     >>>
>> >     >>>
>> >     >>>
>> >     >>
>> >     >
>> >     >
>> >     >
>> >
>>
>
>

Re: HDFS Datanode Capacity

Posted by Rajiv Chittajallu <ra...@yahoo-inc.com>.
Hamed Ghavamnia wrote on 01/01/12 at 05:36:53 -0800:
>   I found out what was wrong. I had made a really stupid mistake in the
>   directory name, and it wasn't pointing to the mount point of the new
>   volume, so the capacity wouldn't change.
>   But still it's using too much for nonDFS usage, I've set the
>   dfs.datanode.du.reserved to 0, 1000, 1000000 bytes but in all cases it
>   keeps 5 GBs for nonDFS usage.

nonDFS used is (current usage of the mount - usage of the hdfs data dir). This is
not the same as dfs.datanode.du.reserved. The DN will stop writing data to
that volume if the free space drops below dfs.datanode.du.reserved.
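
A rough illustration with made-up numbers for a 150GB mount:

    used on the mount (df)           : 12 GB
    used by the hdfs data dir (du)   :  7 GB
    nonDFS used                      : 12 - 7 = 5 GB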

>   BTW, the web interface for my datanode isn't working, where do I have to
>   configure it?

It might be running on a different port. See dfs.datanode.http.address.
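
For example, in hdfs-site.xml on the datanode (0.0.0.0:50075 is the usual
default; adjust to whatever your setup uses):

    <property>
      <name>dfs.datanode.http.address</name>
      <value>0.0.0.0:50075</value>
    </property>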

>
>   On Sun, Jan 1, 2012 at 4:20 PM, Rajiv Chittajallu
>   <[1...@yahoo-inc.com> wrote:
>
>     what does the mbean  Hadoop:service=DataNode,name=DataNodeInfo on the
>     datanode show? You should see something like this
>
>     [2]http://dn1.hadoop.apache.org:1006/jmx?qry=Hadoop:service=DataNode,name=DataNodeInfo
>
>     {
>
>        "beans": [
>            {
>                "name": "Hadoop:service=DataNode,name=DataNodeInfo",
>                "modelerType":
>     "org.apache.hadoop.hdfs.server.datanode.DataNode",
>                "HostName": "[3]dn1.hadoop.apache.org",
>                "Version": "0.20.205.0.1.1110280215",
>                "RpcPort": "8020",
>                "HttpPort": null,
>                "NamenodeAddress": "[4]phanpy-nn1.hadoop.apache.org",
>                "VolumeInfo" :
>     "{\"/d2/hadoop/var/hdfs/data/current\":{\"freeSpace\":1080583248089,\"usedSpace\":703127873319,\"reservedSpace\":107374182400},\"/d1/hadoop/var/hdfs/data/current\":{\"freeSpace\":1018521960448,\"usedSpace\":709080305664,\"reservedSpace\":107374182400},\"/d4/hadoop/var/hdfs/data/current\":{\"freeSpace\":1062440529920,\"usedSpace\":721270591488,\"reservedSpace\":107374182400},\"/d5/hadoop/var/hdfs/data/current\":{\"freeSpace\":1073051838417,\"usedSpace\":710659282991,\"reservedSpace\":107374182400},\"/d3/hadoop/var/hdfs/data/current\":{\"freeSpace\":1072318734816,\"usedSpace\":711392386592,\"reservedSpace\":107374182400},\"/d0/hadoop/var/hdfs/data/current\":{\"freeSpace\":1018448723968,\"usedSpace\":709153542144,\"reservedSpace\":107374182400}}"
>            }
>        ]
>
>     }
>
>     here /d[0-5] are different mount points (from different physical
>     drives). This is
>     the preferred way to do. Having different volumes within the same mount
>     point will not help much. If they are, it will be reported incorrectly.
>     You would see 6x more space than what actually is available (its a bug).
>
>     -rajive
>
>     Hamed Ghavamnia wrote on 01/01/12 at 03:55:34 -0800:
>     >   I've already added the new volume with dfs.data.dir, and it adds
>     without
>     >   any problem. My problem is that the volume I'm adding has 150 GBs
>     of free
>     >   space, but when I check the namenode:50070 it only adds 5GB to the
>     total
>     >   capacity, of which has reserved 50% for non-dfs usage. I've set the
>     >   dfs.datanode.du.reserved to zero as well, but it doesn't make any
>     >   difference.
>     >   How am I supposed to tell hadoop to use the whole 150 GB for the
>     datanode.
>     >
>     >   On Sun, Jan 1, 2012 at 2:59 PM, Rajiv Chittajallu
>     >   <[1...@yahoo-inc.com> wrote:
>     >
>     >     dfsadmin -setSpaceQuota applies to hdfs filesystem. This doesn't
>     apply
>     >     to datanode volumes.
>     >
>     >     to add a volume, update dfs.data.dir (hdfs-site.xml on datanode)
>     , and
>     >     restart the datanode.
>     >
>     >     check the datanode log to see if the new volume as activated.
>     You should
>     >     see additional space in
>     >     namenode:50070/dfsnodelist.jsp?whatNodes=LIVE
>     >
>     >     >________________________________
>     >     > From: Hamed Ghavamnia <[2...@gmail.com>
>     >     >To: [3][7]hdfs-user@hadoop.apache.org; Rajiv Chittajallu
>     >     <[4...@yahoo-inc.com>
>     >     >Sent: Sunday, January 1, 2012 4:06 PM
>     >     >Subject: Re: HDFS Datanode Capacity
>     >     >
>     >     >
>     >     >Thanks for the help.
>     >     >I checked the quotas, it seems they're used for setting the
>     maximum
>     >     size on the files inside the hdfs, and not the datanode itself.
>     For
>     >     example, if I set my dfs.data.dir to /media/newhard (which I've
>     mounted
>     >     my new hard disk to), I can't use dfsadmin -setSpaceQuota n
>     >     /media/newhard to set the size of this directory, I can change
>     the sizes
>     >     of the directories inside hdfs (tmp, user, ...), which don't
>     have any
>     >     effect on the capacity of the datanode.
>     >     >I can set the my new mounted volume as the datanode directory
>     and it
>     >     runs without a problem, but the capacity is the default 5 GB.
>     >     >
>     >     >
>     >     >On Sun, Jan 1, 2012 at 10:41 AM, Rajiv Chittajallu
>     >     <[5...@yahoo-inc.com> wrote:
>     >     >
>     >     >Once you updated the configuration is the datanode, restarted?
>     Check if
>     >     the datanode log indicated that it was able to setup the new
>     volume.
>     >     >>
>     >     >>
>     >     >>
>     >     >>>________________________________
>     >     >>> From: Hamed Ghavamnia <[6...@gmail.com>
>     >     >>>To: [7][11]hdfs-user@hadoop.apache.org
>     >     >>>Sent: Sunday, January 1, 2012 11:33 AM
>     >     >>>Subject: HDFS Datanode Capacity
>     >     >>
>     >     >>>
>     >     >>>
>     >     >>>Hi,
>     >     >>>I've been searching on how to configure the maximum capacity
>     of a
>     >     datanode. I've added big volumes to one of my datanodes, but the
>     >     configured capacity doesn't get bigger than the default 5GB. If
>     I want a
>     >     datanode with 100GB of capacity, I have to add 20 directories,
>     each
>     >     having 5GB so the maximum capacity reaches 100. Is there
>     anywhere this
>     >     can be set? Can different datanodes have different capacities?
>     >     >>>
>     >     >>>Also it seems like the dfs.datanode.du.reserved doesn't work
>     either,
>     >     because I've set it to zero, but it still leaves 50% of the free
>     space
>     >     for non-dfs usage.
>     >     >>>
>     >     >>>Thanks,
>     >     >>>Hamed
>     >     >>>
>     >     >>>P.S. This is my first message in the mailing list, so if I
>     have to
>     >     follow any rules for sending emails, I'll be thankful if you let
>     me
>     >     know. :)
>     >     >>>
>     >     >>>
>     >     >>>
>     >     >>
>     >     >
>     >     >
>     >     >
>     >
>

Re: HDFS Datanode Capacity

Posted by Hamed Ghavamnia <gh...@gmail.com>.
I found out what was wrong. I had made a really stupid mistake in the
directory name, and it wasn't pointing to the mount point of the new
volume, so the capacity wouldn't change.
But it's still using too much for nonDFS usage: I've set
dfs.datanode.du.reserved to 0, 1000, and 1000000 bytes, but in all cases it
keeps 5 GB for nonDFS usage.
BTW, the web interface for my datanode isn't working; where do I have to
configure it?

On Sun, Jan 1, 2012 at 4:20 PM, Rajiv Chittajallu <ra...@yahoo-inc.com> wrote:

> what does the mbean  Hadoop:service=DataNode,name=DataNodeInfo on the
> datanode show? You should see something like this
>
>
> http://dn1.hadoop.apache.org:1006/jmx?qry=Hadoop:service=DataNode,name=DataNodeInfo
>
> {
>
>    "beans": [
>        {
>            "name": "Hadoop:service=DataNode,name=DataNodeInfo",
>            "modelerType":
> "org.apache.hadoop.hdfs.server.datanode.DataNode",
>            "HostName": "dn1.hadoop.apache.org",
>            "Version": "0.20.205.0.1.1110280215",
>            "RpcPort": "8020",
>            "HttpPort": null,
>            "NamenodeAddress": "phanpy-nn1.hadoop.apache.org",
>            "VolumeInfo" :
> "{\"/d2/hadoop/var/hdfs/data/current\":{\"freeSpace\":1080583248089,\"usedSpace\":703127873319,\"reservedSpace\":107374182400},\"/d1/hadoop/var/hdfs/data/current\":{\"freeSpace\":1018521960448,\"usedSpace\":709080305664,\"reservedSpace\":107374182400},\"/d4/hadoop/var/hdfs/data/current\":{\"freeSpace\":1062440529920,\"usedSpace\":721270591488,\"reservedSpace\":107374182400},\"/d5/hadoop/var/hdfs/data/current\":{\"freeSpace\":1073051838417,\"usedSpace\":710659282991,\"reservedSpace\":107374182400},\"/d3/hadoop/var/hdfs/data/current\":{\"freeSpace\":1072318734816,\"usedSpace\":711392386592,\"reservedSpace\":107374182400},\"/d0/hadoop/var/hdfs/data/current\":{\"freeSpace\":1018448723968,\"usedSpace\":709153542144,\"reservedSpace\":107374182400}}"
>        }
>    ]
>
> }
>
> here /d[0-5] are different mount points (from different physical drives).
> This is
> the preferred way to do. Having different volumes within the same mount
> point will not help much. If they are, it will be reported incorrectly.
> You would see 6x more space than what actually is available (its a bug).
>
> -rajive
>
>
> Hamed Ghavamnia wrote on 01/01/12 at 03:55:34 -0800:
> >   I've already added the new volume with dfs.data.dir, and it adds
> without
> >   any problem. My problem is that the volume I'm adding has 150 GBs of
> free
> >   space, but when I check the namenode:50070 it only adds 5GB to the
> total
> >   capacity, of which has reserved 50% for non-dfs usage. I've set the
> >   dfs.datanode.du.reserved to zero as well, but it doesn't make any
> >   difference.
> >   How am I supposed to tell hadoop to use the whole 150 GB for the
> datanode.
> >
> >   On Sun, Jan 1, 2012 at 2:59 PM, Rajiv Chittajallu
> >   <[1...@yahoo-inc.com> wrote:
> >
> >     dfsadmin -setSpaceQuota applies to hdfs filesystem. This doesn't
> apply
> >     to datanode volumes.
> >
> >     to add a volume, update dfs.data.dir (hdfs-site.xml on datanode) ,
> and
> >     restart the datanode.
> >
> >     check the datanode log to see if the new volume as activated. You
> should
> >     see additional space in
> >     namenode:50070/dfsnodelist.jsp?whatNodes=LIVE
> >
> >     >________________________________
> >     > From: Hamed Ghavamnia <[2...@gmail.com>
> >     >To: [3]hdfs-user@hadoop.apache.org; Rajiv Chittajallu
> >     <[4...@yahoo-inc.com>
> >     >Sent: Sunday, January 1, 2012 4:06 PM
> >     >Subject: Re: HDFS Datanode Capacity
> >     >
> >     >
> >     >Thanks for the help.
> >     >I checked the quotas, it seems they're used for setting the maximum
> >     size on the files inside the hdfs, and not the datanode itself. For
> >     example, if I set my dfs.data.dir to /media/newhard (which I've
> mounted
> >     my new hard disk to), I can't use dfsadmin -setSpaceQuota n
> >     /media/newhard to set the size of this directory, I can change the
> sizes
> >     of the directories inside hdfs (tmp, user, ...), which don't have any
> >     effect on the capacity of the datanode.
> >     >I can set the my new mounted volume as the datanode directory and it
> >     runs without a problem, but the capacity is the default 5 GB.
> >     >
> >     >
> >     >On Sun, Jan 1, 2012 at 10:41 AM, Rajiv Chittajallu
> >     <[5...@yahoo-inc.com> wrote:
> >     >
> >     >Once you updated the configuration is the datanode, restarted?
> Check if
> >     the datanode log indicated that it was able to setup the new volume.
> >     >>
> >     >>
> >     >>
> >     >>>________________________________
> >     >>> From: Hamed Ghavamnia <[6...@gmail.com>
> >     >>>To: [7]hdfs-user@hadoop.apache.org
> >     >>>Sent: Sunday, January 1, 2012 11:33 AM
> >     >>>Subject: HDFS Datanode Capacity
> >     >>
> >     >>>
> >     >>>
> >     >>>Hi,
> >     >>>I've been searching on how to configure the maximum capacity of a
> >     datanode. I've added big volumes to one of my datanodes, but the
> >     configured capacity doesn't get bigger than the default 5GB. If I
> want a
> >     datanode with 100GB of capacity, I have to add 20 directories, each
> >     having 5GB so the maximum capacity reaches 100. Is there anywhere
> this
> >     can be set? Can different datanodes have different capacities?
> >     >>>
> >     >>>Also it seems like the dfs.datanode.du.reserved doesn't work
> either,
> >     because I've set it to zero, but it still leaves 50% of the free
> space
> >     for non-dfs usage.
> >     >>>
> >     >>>Thanks,
> >     >>>Hamed
> >     >>>
> >     >>>P.S. This is my first message in the mailing list, so if I have to
> >     follow any rules for sending emails, I'll be thankful if you let me
> >     know. :)
> >     >>>
> >     >>>
> >     >>>
> >     >>
> >     >
> >     >
> >     >
> >
>

Re: HDFS Datanode Capacity

Posted by Rajiv Chittajallu <ra...@yahoo-inc.com>.
What does the mbean Hadoop:service=DataNode,name=DataNodeInfo on the
datanode show? You should see something like this:

http://dn1.hadoop.apache.org:1006/jmx?qry=Hadoop:service=DataNode,name=DataNodeInfo

{

    "beans": [
        {
            "name": "Hadoop:service=DataNode,name=DataNodeInfo",
            "modelerType": "org.apache.hadoop.hdfs.server.datanode.DataNode",
            "HostName": "dn1.hadoop.apache.org",
            "Version": "0.20.205.0.1.1110280215",
            "RpcPort": "8020",
            "HttpPort": null,
            "NamenodeAddress": "phanpy-nn1.hadoop.apache.org",
            "VolumeInfo" : "{\"/d2/hadoop/var/hdfs/data/current\":{\"freeSpace\":1080583248089,\"usedSpace\":703127873319,\"reservedSpace\":107374182400},\"/d1/hadoop/var/hdfs/data/current\":{\"freeSpace\":1018521960448,\"usedSpace\":709080305664,\"reservedSpace\":107374182400},\"/d4/hadoop/var/hdfs/data/current\":{\"freeSpace\":1062440529920,\"usedSpace\":721270591488,\"reservedSpace\":107374182400},\"/d5/hadoop/var/hdfs/data/current\":{\"freeSpace\":1073051838417,\"usedSpace\":710659282991,\"reservedSpace\":107374182400},\"/d3/hadoop/var/hdfs/data/current\":{\"freeSpace\":1072318734816,\"usedSpace\":711392386592,\"reservedSpace\":107374182400},\"/d0/hadoop/var/hdfs/data/current\":{\"freeSpace\":1018448723968,\"usedSpace\":709153542144,\"reservedSpace\":107374182400}}"
        }
    ]

}

Here /d[0-5] are different mount points (from different physical drives). This is
the preferred way to do it. Having different volumes within the same mount
point will not help much; if they are on the same mount, the space will be
reported incorrectly: you would see 6x more space than is actually available
(it's a bug).
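
You can also pull the same info from a shell, e.g. (host and port are from
the example above; use whatever dfs.datanode.http.address points at on your
node):

    curl -s 'http://dn1.hadoop.apache.org:1006/jmx?qry=Hadoop:service=DataNode,name=DataNodeInfo'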

-rajive


Hamed Ghavamnia wrote on 01/01/12 at 03:55:34 -0800:
>   I've already added the new volume with dfs.data.dir, and it adds without
>   any problem. My problem is that the volume I'm adding has 150 GBs of free
>   space, but when I check the namenode:50070 it only adds 5GB to the total
>   capacity, of which has reserved 50% for non-dfs usage. I've set the
>   dfs.datanode.du.reserved to zero as well, but it doesn't make any
>   difference.
>   How am I supposed to tell hadoop to use the whole 150 GB for the datanode.
>
>   On Sun, Jan 1, 2012 at 2:59 PM, Rajiv Chittajallu
>   <[1...@yahoo-inc.com> wrote:
>
>     dfsadmin -setSpaceQuota applies to hdfs filesystem. This doesn't apply
>     to datanode volumes.
>
>     to add a volume, update dfs.data.dir (hdfs-site.xml on datanode) , and
>     restart the datanode.
>
>     check the datanode log to see if the new volume as activated. You should
>     see additional space in
>     namenode:50070/dfsnodelist.jsp?whatNodes=LIVE
>
>     >________________________________
>     > From: Hamed Ghavamnia <[2...@gmail.com>
>     >To: [3]hdfs-user@hadoop.apache.org; Rajiv Chittajallu
>     <[4...@yahoo-inc.com>
>     >Sent: Sunday, January 1, 2012 4:06 PM
>     >Subject: Re: HDFS Datanode Capacity
>     >
>     >
>     >Thanks for the help.
>     >I checked the quotas, it seems they're used for setting the maximum
>     size on the files inside the hdfs, and not the datanode itself. For
>     example, if I set my dfs.data.dir to /media/newhard (which I've mounted
>     my new hard disk to), I can't use dfsadmin -setSpaceQuota n
>     /media/newhard to set the size of this directory, I can change the sizes
>     of the directories inside hdfs (tmp, user, ...), which don't have any
>     effect on the capacity of the datanode.
>     >I can set the my new mounted volume as the datanode directory and it
>     runs without a problem, but the capacity is the default 5 GB.
>     >
>     >
>     >On Sun, Jan 1, 2012 at 10:41 AM, Rajiv Chittajallu
>     <[5...@yahoo-inc.com> wrote:
>     >
>     >Once you updated the configuration is the datanode, restarted? Check if
>     the datanode log indicated that it was able to setup the new volume.
>     >>
>     >>
>     >>
>     >>>________________________________
>     >>> From: Hamed Ghavamnia <[6...@gmail.com>
>     >>>To: [7]hdfs-user@hadoop.apache.org
>     >>>Sent: Sunday, January 1, 2012 11:33 AM
>     >>>Subject: HDFS Datanode Capacity
>     >>
>     >>>
>     >>>
>     >>>Hi,
>     >>>I've been searching on how to configure the maximum capacity of a
>     datanode. I've added big volumes to one of my datanodes, but the
>     configured capacity doesn't get bigger than the default 5GB. If I want a
>     datanode with 100GB of capacity, I have to add 20 directories, each
>     having 5GB so the maximum capacity reaches 100. Is there anywhere this
>     can be set? Can different datanodes have different capacities?
>     >>>
>     >>>Also it seems like the dfs.datanode.du.reserved doesn't work either,
>     because I've set it to zero, but it still leaves 50% of the free space
>     for non-dfs usage.
>     >>>
>     >>>Thanks,
>     >>>Hamed
>     >>>
>     >>>P.S. This is my first message in the mailing list, so if I have to
>     follow any rules for sending emails, I'll be thankful if you let me
>     know. :)
>     >>>
>     >>>
>     >>>
>     >>
>     >
>     >
>     >
>

Re: HDFS Datanode Capacity

Posted by Hamed Ghavamnia <gh...@gmail.com>.
I've already added the new volume with dfs.data.dir, and it adds without
any problem. My problem is that the volume I'm adding has 150 GB of free
space, but when I check namenode:50070 it only adds 5GB to the total
capacity, of which 50% is reserved for non-DFS usage. I've set
dfs.datanode.du.reserved to zero as well, but it doesn't make any
difference.
How am I supposed to tell Hadoop to use the whole 150 GB for the datanode?
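
For what it's worth, the same per-node numbers should also show up on the
command line (Configured Capacity, DFS Used, Non DFS Used, DFS Remaining per
datanode):

    hadoop dfsadmin -report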

On Sun, Jan 1, 2012 at 2:59 PM, Rajiv Chittajallu <ra...@yahoo-inc.com> wrote:

> dfsadmin -setSpaceQuota applies to hdfs filesystem. This doesn't apply to
> datanode volumes.
>
>
> to add a volume, update dfs.data.dir (hdfs-site.xml on datanode) , and
> restart the datanode.
>
>
> check the datanode log to see if the new volume as activated. You should
> see additional space in
> namenode:50070/dfsnodelist.jsp?whatNodes=LIVE
>
>
> >________________________________
> > From: Hamed Ghavamnia <gh...@gmail.com>
> >To: hdfs-user@hadoop.apache.org; Rajiv Chittajallu <ra...@yahoo-inc.com>
> >Sent: Sunday, January 1, 2012 4:06 PM
> >Subject: Re: HDFS Datanode Capacity
> >
> >
> >Thanks for the help.
> >I checked the quotas, it seems they're used for setting the maximum size
> on the files inside the hdfs, and not the datanode itself. For example, if
> I set my dfs.data.dir to /media/newhard (which I've mounted my new hard
> disk to), I can't use dfsadmin -setSpaceQuota n /media/newhard to set the
> size of this directory, I can change the sizes of the directories inside
> hdfs (tmp, user, ...), which don't have any effect on the capacity of the
> datanode.
> >I can set the my new mounted volume as the datanode directory and it runs
> without a problem, but the capacity is the default 5 GB.
> >
> >
> >On Sun, Jan 1, 2012 at 10:41 AM, Rajiv Chittajallu <ra...@yahoo-inc.com>
> wrote:
> >
> >Once you updated the configuration is the datanode, restarted? Check if
> the datanode log indicated that it was able to setup the new volume.
> >>
> >>
> >>
> >>>________________________________
> >>> From: Hamed Ghavamnia <gh...@gmail.com>
> >>>To: hdfs-user@hadoop.apache.org
> >>>Sent: Sunday, January 1, 2012 11:33 AM
> >>>Subject: HDFS Datanode Capacity
> >>
> >>>
> >>>
> >>>Hi,
> >>>I've been searching on how to configure the maximum capacity of a
> datanode. I've added big volumes to one of my datanodes, but the configured
> capacity doesn't get bigger than the default 5GB. If I want a datanode with
> 100GB of capacity, I have to add 20 directories, each having 5GB so the
> maximum capacity reaches 100. Is there anywhere this can be set? Can
> different datanodes have different capacities?
> >>>
> >>>Also it seems like the dfs.datanode.du.reserved doesn't work either,
> because I've set it to zero, but it still leaves 50% of the free space for
> non-dfs usage.
> >>>
> >>>Thanks,
> >>>Hamed
> >>>
> >>>P.S. This is my first message in the mailing list, so if I have to
> follow any rules for sending emails, I'll be thankful if you let me know. :)
> >>>
> >>>
> >>>
> >>
> >
> >
> >
>

Re: HDFS Datanode Capacity

Posted by Rajiv Chittajallu <ra...@yahoo-inc.com>.
dfsadmin -setSpaceQuota applies to the HDFS filesystem; it doesn't apply to datanode volumes.


To add a volume, update dfs.data.dir (hdfs-site.xml on the datanode) and restart the datanode.


Check the datanode log to see if the new volume was activated. You should see additional space in
namenode:50070/dfsnodelist.jsp?whatNodes=LIVE
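
A minimal sketch of what that looks like (mount points are illustrative):

    <!-- hdfs-site.xml on the datanode -->
    <property>
      <name>dfs.data.dir</name>
      <value>/d0/hadoop/var/hdfs/data,/d1/hadoop/var/hdfs/data</value>
    </property>

then restart the datanode, e.g.:

    hadoop-daemon.sh stop datanode
    hadoop-daemon.sh start datanode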


>________________________________
> From: Hamed Ghavamnia <gh...@gmail.com>
>To: hdfs-user@hadoop.apache.org; Rajiv Chittajallu <ra...@yahoo-inc.com> 
>Sent: Sunday, January 1, 2012 4:06 PM
>Subject: Re: HDFS Datanode Capacity
> 
>
>Thanks for the help.
>I checked the quotas, it seems they're used for setting the maximum size on the files inside the hdfs, and not the datanode itself. For example, if I set my dfs.data.dir to /media/newhard (which I've mounted my new hard disk to), I can't use dfsadmin -setSpaceQuota n /media/newhard to set the size of this directory, I can change the sizes of the directories inside hdfs (tmp, user, ...), which don't have any effect on the capacity of the datanode.
>I can set the my new mounted volume as the datanode directory and it runs without a problem, but the capacity is the default 5 GB.
>
>
>On Sun, Jan 1, 2012 at 10:41 AM, Rajiv Chittajallu <ra...@yahoo-inc.com> wrote:
>
>Once you updated the configuration is the datanode, restarted? Check if the datanode log indicated that it was able to setup the new volume.
>>
>>
>>
>>>________________________________
>>> From: Hamed Ghavamnia <gh...@gmail.com>
>>>To: hdfs-user@hadoop.apache.org
>>>Sent: Sunday, January 1, 2012 11:33 AM
>>>Subject: HDFS Datanode Capacity
>>
>>>
>>>
>>>Hi,
>>>I've been searching on how to configure the maximum capacity of a datanode. I've added big volumes to one of my datanodes, but the configured capacity doesn't get bigger than the default 5GB. If I want a datanode with 100GB of capacity, I have to add 20 directories, each having 5GB so the maximum capacity reaches 100. Is there anywhere this can be set? Can different datanodes have different capacities?
>>>
>>>Also it seems like the dfs.datanode.du.reserved doesn't work either, because I've set it to zero, but it still leaves 50% of the free space for non-dfs usage.
>>>
>>>Thanks,
>>>Hamed
>>>
>>>P.S. This is my first message in the mailing list, so if I have to follow any rules for sending emails, I'll be thankful if you let me know. :)
>>>
>>>
>>>
>>
>
>
>

Re: HDFS Datanode Capacity

Posted by Hamed Ghavamnia <gh...@gmail.com>.
Thanks for the help.
I checked the quotas; it seems they're used for setting the maximum size of
the files inside HDFS, and not of the datanode itself. For example, if I
set my dfs.data.dir to /media/newhard (which I've mounted my new hard disk
to), I can't use dfsadmin -setSpaceQuota n /media/newhard to set the size
of this directory. I can change the sizes of the directories inside HDFS
(tmp, user, ...), which don't have any effect on the capacity of the
datanode.
I can set my new mounted volume as the datanode directory and it runs
without a problem, but the capacity is the default 5 GB.

On Sun, Jan 1, 2012 at 10:41 AM, Rajiv Chittajallu <ra...@yahoo-inc.com> wrote:

> Once you updated the configuration is the datanode, restarted? Check if
> the datanode log indicated that it was able to setup the new volume.
>
>
>
> >________________________________
> > From: Hamed Ghavamnia <gh...@gmail.com>
> >To: hdfs-user@hadoop.apache.org
> >Sent: Sunday, January 1, 2012 11:33 AM
> >Subject: HDFS Datanode Capacity
> >
> >
> >Hi,
> >I've been searching on how to configure the maximum capacity of a
> datanode. I've added big volumes to one of my datanodes, but the configured
> capacity doesn't get bigger than the default 5GB. If I want a datanode with
> 100GB of capacity, I have to add 20 directories, each having 5GB so the
> maximum capacity reaches 100. Is there anywhere this can be set? Can
> different datanodes have different capacities?
> >
> >Also it seems like the dfs.datanode.du.reserved doesn't work either,
> because I've set it to zero, but it still leaves 50% of the free space for
> non-dfs usage.
> >
> >Thanks,
> >Hamed
> >
> >P.S. This is my first message in the mailing list, so if I have to follow
> any rules for sending emails, I'll be thankful if you let me know. :)
> >
> >
> >
>

Re: HDFS Datanode Capacity

Posted by Rajiv Chittajallu <ra...@yahoo-inc.com>.
Once you updated the configuration on the datanode, did you restart it? Check if the datanode log indicates that it was able to set up the new volume.
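
A quick way to check is to grep the datanode log for the new mount point,
e.g. (path and log location are illustrative; they depend on your install):

    grep '/path/to/new/volume' $HADOOP_HOME/logs/hadoop-*-datanode-*.log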



>________________________________
> From: Hamed Ghavamnia <gh...@gmail.com>
>To: hdfs-user@hadoop.apache.org 
>Sent: Sunday, January 1, 2012 11:33 AM
>Subject: HDFS Datanode Capacity
> 
>
>Hi,
>I've been searching on how to configure the maximum capacity of a datanode. I've added big volumes to one of my datanodes, but the configured capacity doesn't get bigger than the default 5GB. If I want a datanode with 100GB of capacity, I have to add 20 directories, each having 5GB so the maximum capacity reaches 100. Is there anywhere this can be set? Can different datanodes have different capacities?
>
>Also it seems like the dfs.datanode.du.reserved doesn't work either, because I've set it to zero, but it still leaves 50% of the free space for non-dfs usage.
>
>Thanks,
>Hamed
>
>P.S. This is my first message in the mailing list, so if I have to follow any rules for sending emails, I'll be thankful if you let me know. :)
>
>
>