Posted to user@hadoop.apache.org by Saumitra <sa...@gmail.com> on 2014/04/13 21:54:00 UTC
HDFS file system size issue
Hello,
We are running HDFS on a 9-node Hadoop cluster, version 1.2.1, with the default HDFS block size.
We have noticed that the slaves' disks are almost full. The name node's status page (namenode:50070) shows that disks of live nodes are ~90% full and that DFS Used in the cluster summary is ~1TB.
However, hadoop dfs -dus / reports the file system size as merely 38GB. The 38GB figure looks correct because we keep only a few Hive tables and hadoop's /tmp (distributed cache and job outputs) in HDFS; all other data is cleaned up. I cross-checked this with hadoop dfs -ls. I also believe there is no internal fragmentation, because the files in our Hive tables are well-chopped into ~50MB chunks. Here are the last few lines of hadoop fsck / -files -blocks:
Status: HEALTHY
Total size: 38086441332 B
Total dirs: 232
Total files: 802
Total blocks (validated): 796 (avg. block size 47847288 B)
Minimally replicated blocks: 796 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 6 (0.75376886 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 3.0439699
Corrupt blocks: 0
Missing replicas: 6 (0.24762692 %)
Number of data-nodes: 9
Number of racks: 1
FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
My question is: why are the slaves' disks getting full even though there are only a few files in DFS?
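As a sanity check on the numbers above (an editor's sketch, not part of the original post): raw on-disk usage should be roughly the logical size times the average block replication reported by fsck, and that comes out nowhere near 1TB:

```python
# Sanity check using the fsck summary above:
# raw (on-disk) usage ~= logical size * average block replication.
logical_bytes = 38086441332      # "Total size" from fsck
avg_replication = 3.0439699      # "Average block replication" from fsck

raw_bytes = logical_bytes * avg_replication
print(f"Expected raw usage: {raw_bytes / 1e9:.1f} GB")   # ~115.9 GB

# Replication alone cannot explain ~1TB of DFS Used:
dfs_used_bytes = 1e12
print(f"Unexplained: {(dfs_used_bytes - raw_bytes) / 1e9:.0f} GB")
```

So even at the observed average replication of ~3, only about 116GB of raw disk usage is accounted for, leaving most of the reported ~1TB unexplained.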
Fwd: HDFS file system size issue
Posted by Saumitra Shahapure <sa...@gmail.com>.
Hello,
Thanks for your replies,
Biswajit, it looks like there is confusion in the calculation: 1TB would equal
1024GB, not 114GB.
Sandeep, I checked the log directory size as well. The log directories are
hardly a few GBs; I have already configured log4j so that the logs won't grow
too large.
On our slave machines we have a 450GB disk partition dedicated to hadoop logs
and DFS. There, the logs directory is < 10GB and the rest of the space is
occupied by DFS. A separate 10GB partition holds /.
Let me quote my confusion point once again:
>> Basically I wanted to point out the discrepancy between the name node
>> status page and hadoop dfs -dus. In my case, the former reports DFS usage
>> as 1TB and the latter reports it as 35GB. What factors can cause this
>> difference? And why is just 35GB of data causing DFS to hit its limits?
I am talking about the name node status page on port 50070. Below is the
screenshot of my name node status page.
As I understand it, 'DFS Used' (in the Cluster Summary section) is the space
taken by DFS, and 'Non DFS Used' is the space taken by non-DFS data such as
logs and other local files from users.
In my case, the Namenode shows DFS Used as ~1TB, but hadoop dfs -dus shows
it as ~38GB.
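To put a number on the gap (a back-of-the-envelope sketch, not from the original mail): if replication alone explained the difference, the implied replication factor would be far above the observed average of ~3:

```python
# If the 1TB-vs-38GB gap were caused by replication alone, the implied
# replication factor would have to be absurdly high.
dfs_used_gb = 1024   # name node "DFS Used", ~1TB
dus_gb = 38          # hadoop dfs -dus

implied = dfs_used_gb / dus_gb
print(f"Implied replication factor: {implied:.1f}")   # ~26.9
```

An implied factor near 27 against a measured average of ~3 suggests the extra space is not HDFS block replicas at all.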
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
Please check your logs directory usage.
On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak
<bi...@inmobi.com>wrote:
> What's the replication factor you have? I believe it should be 3. hadoop
> dfs -dus shows disk usage without replication, while the name node UI page
> shows it with replication.
>
> 38GB * 3 = 114GB ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>
>> Hi Biswajeet,
>>
>> Non-DFS usage is ~100GB over the cluster. But still the numbers are
>> nowhere near 1TB.
>>
>> Basically I wanted to point out the discrepancy between the name node
>> status page and hadoop dfs -dus. In my case, the former reports DFS usage
>> as 1TB and the latter reports it as 35GB. What factors can cause this
>> difference? And why is just 35GB of data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>> wrote:
>>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling
>> up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>
>>> Hello,
>>>
>>> We are running HDFS on a 9-node Hadoop cluster, version 1.2.1, with the
>>> default HDFS block size.
>>>
>>> We have noticed that the slaves' disks are almost full. The name node's
>>> status page (namenode:50070) shows that disks of live nodes are ~90%
>>> full and that DFS Used in the cluster summary is ~1TB.
>>>
>>> However, hadoop dfs -dus / reports the file system size as merely 38GB.
>>> The 38GB figure looks correct because we keep only a few Hive tables and
>>> hadoop's /tmp (distributed cache and job outputs) in HDFS; all other data
>>> is cleaned up. I cross-checked this with hadoop dfs -ls. I also believe
>>> there is no internal fragmentation, because the files in our Hive tables
>>> are well-chopped into ~50MB chunks. Here are the last few lines of
>>> hadoop fsck / -files -blocks:
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is: why are the slaves' disks getting full even though
>>> there are only a few files in DFS?
>>>
>>
>>
>> _____________________________________________________________
>> The information contained in this communication is intended solely for
>> the use of the individual or entity to whom it is addressed and others
>> authorized to receive it. It may contain confidential or legally privileged
>> information. If you are not the intended recipient you are hereby notified
>> that any disclosure, copying, distribution or taking any action in reliance
>> on the contents of this information is strictly prohibited and may be
>> unlawful. If you have received this communication in error, please notify
>> us immediately by responding to this email and then delete it from your
>> system. The firm is neither liable for the proper and complete transmission
>> of the information contained in this communication nor for any delay in its
>> receipt.
>>
>>
>>
>
>
--
--Regards
Sandeep Nemuri
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Saumitra,
It looks like over-replicated blocks are not the root cause of the issue the
cluster is experiencing. I can only think of a misconfigured dfs.data.dir
parameter. Can you ensure that each of the data directories uses only one
partition (mount), and that no other data directory shares the same
partition (mount)?
The rule should be one data directory per partition (mount). Also, please
check inside dfs.data.dir for third-party files/directories. Hope this helps.
Thanks
-Rahman
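Rahman's dfs.data.dir check above can be sketched as follows (an editor's sketch, not from the thread; the helper name and example paths are hypothetical). Directories on the same device share a partition, which can make per-directory capacity accounting double-count the same disk:

```python
import os
from collections import defaultdict

def dirs_sharing_partition(data_dirs):
    """Group dfs.data.dir entries by the device (partition) they live on.

    Any group with more than one directory means two data directories
    share a partition, so per-directory space accounting can count the
    same disk more than once.
    """
    by_device = defaultdict(list)
    for d in data_dirs:
        # st_dev identifies the device/partition backing the path
        by_device[os.stat(d).st_dev].append(d)
    return {dev: ds for dev, ds in by_device.items() if len(ds) > 1}

# Hypothetical example; substitute the dfs.data.dir value from
# hdfs-site.xml on each DataNode:
# print(dirs_sharing_partition(["/mnt/hadoop/dfs/data", "/mnt2/dfs/data"]))
```

An empty result means every configured data directory sits on its own partition, as Rahman recommends.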
On Tue, Apr 15, 2014 at 6:54 AM, Saumitra Shahapure <
saumitra.official@gmail.com> wrote:
> Hi Rahman,
>
> These are few lines from hadoop fsck / -blocks -files -locations
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1
> block(s): OK
> 0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010,
> ip2:50010, ip3:50010]
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1
> block(s): OK
> 0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010,
> ip2:50010, ip4:50010]
>
>
> Biswa may have guessed the replication factor from the fsck summary that I
> posted earlier. I am posting it again for today's run:
>
> Status: HEALTHY
> Total size: 58143055251 B
> Total dirs: 307
> Total files: 5093
> Total blocks (validated): 3903 (avg. block size 14897016 B)
> Minimally replicated blocks: 3903 (100.0 %)
>
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 92 (2.357161 %)
>
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 2
> Average block replication: 3.1401486
> Corrupt blocks: 0
> Missing replicas: 92 (0.75065273 %)
>
> Number of data-nodes: 9
> Number of racks: 1
> FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
>
>
> The filesystem under path '/' is HEALTHY
>
> I have not overridden dfs.datanode.du.reserved. It defaults to 0.
>
> $ less $HADOOP_HOME/conf/hdfs-site.xml |grep -A3 'dfs.datanode.du.reserved'
> $ less $HADOOP_HOME/src/hdfs/hdfs-default.xml |grep -A3
> 'dfs.datanode.du.reserved'
> <name>dfs.datanode.du.reserved</name>
> <value>0</value>
> <description>Reserved space in bytes per volume. Always leave this much
> space free for non dfs use.
> </description>
>
> Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
> and all hadoop/hive logs are dumped in /mnt/logs in various directories.
> All machines have 400GB for /mnt.
>
> $ for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh
> /mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
>
>
> 225G /mnt/hadoop
> 224G /mnt/hadoop/dfs/data
> 61M /mnt/logs
>
> 281G /mnt/hadoop
> 281G /mnt/hadoop/dfs/data
> 63M /mnt/logs
>
> 139G /mnt/hadoop
> 139G /mnt/hadoop/dfs/data
> 68M /mnt/logs
>
> 135G /mnt/hadoop
> 134G /mnt/hadoop/dfs/data
> 92M /mnt/logs
>
> 165G /mnt/hadoop
> 164G /mnt/hadoop/dfs/data
> 75M /mnt/logs
>
> 137G /mnt/hadoop
> 137G /mnt/hadoop/dfs/data
> 95M /mnt/logs
>
> 160G /mnt/hadoop
> 160G /mnt/hadoop/dfs/data
> 74M /mnt/logs
>
> 180G /mnt/hadoop
> 122G /mnt/hadoop/dfs/data
> 23M /mnt/logs
>
> 139G /mnt/hadoop
> 138G /mnt/hadoop/dfs/data
> 76M /mnt/logs
>
>
>
> All these numbers are for today, and may differ a bit from yesterday's.
>
> Today hadoop dfs -dus is 58GB and namenode is reporting DFS Used as 1.46TB.
>
> Pardon me for cluttering the mail with lots of copy-pastes; I hope it's
> still readable.
>
> -- Saumitra S. Shahapure
>
>
> On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
> ashettia@hortonworks.com> wrote:
>
>> Hi Biswa,
>>
>> Are you sure that the replication factor of the files is three? Please
>> run 'hadoop fsck / -blocks -files -locations' and check the replication
>> factor for each file. Also, post the configuration of
>> <name>dfs.datanode.du.reserved</name>, and please check the real space
>> reported by each DataNode by running 'du -h'.
>>
>> Thanks,
>> Rahman
>>
>> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> Biswajit, it looks like there is confusion in the calculation: 1TB would
>> equal 1024GB, not 114GB.
>>
>>
>> Sandeep, I checked log directory size as well. Log directories are hardly
>> in few GBs, I have configured log4j properties so that logs won't be too
>> large.
>>
>> In our slave machines, we have 450GB disk partition for hadoop logs and
>> DFS. Over there logs directory is < 10GBs and rest space is occupied by
>> DFS. 10GB partition is for /.
>>
>> Let me quote my confusion point once again:
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop
>>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>>> one reports it to be 35GB. What are the factors that can cause this
>>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>>
>>>
>>
>> I am talking about name node status page on 50070 port. Here is the
>> screenshot of my name node status page
>>
>> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>>
>> As I understand, 'DFS used' is the space taken by DFS, non-DFS used is
>> spaces taken by non-DFS data like logs or other local files from users.
>> Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be
>> ~38GB.
>>
>>
>>
>> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
>> biswajit.nayak@inmobi.com> wrote:
>>
>>> What's the replication factor you have? I believe it should be 3. hadoop
>>> dfs -dus shows disk usage without replication, while the name node UI
>>> page shows it with replication.
>>>
>>> 38GB * 3 = 114GB ~ 1TB
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hi Biswajeet,
>>>>
>>>> Non-DFS usage is ~100GB over the cluster. But still the numbers are
>>>> nowhere near 1TB.
>>>>
>>>> Basically I wanted to point out discrepancy in name node status page
>>>> and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB
>>>> and later one reports it to be 35GB. What are the factors that can cause
>>>> this difference? And why is just 35GB data causing DFS to hit its limits?
>>>>
>>>>
>>>>
>>>>
>>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>>> wrote:
>>>>
>>>> Hi Saumitra,
>>>>
>>>> Could you please check the non-dfs usage. They also contribute to
>>>> filling up the disk space.
>>>>
>>>>
>>>>
>>>> ~Biswa
>>>> -----oThe important thing is not to stop questioning o-----
>>>>
>>>>
>>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are running HDFS on a 9-node Hadoop cluster, version 1.2.1, with
>>>>> the default HDFS block size.
>>>>>
>>>>> We have noticed that the slaves' disks are almost full. The name node's
>>>>> status page (namenode:50070) shows that disks of live nodes are ~90%
>>>>> full and that DFS Used in the cluster summary is ~1TB.
>>>>>
>>>>> However, hadoop dfs -dus / reports the file system size as merely 38GB.
>>>>> The 38GB figure looks correct because we keep only a few Hive tables
>>>>> and hadoop's /tmp (distributed cache and job outputs) in HDFS; all
>>>>> other data is cleaned up. I cross-checked this with hadoop dfs -ls. I
>>>>> also believe there is no internal fragmentation, because the files in
>>>>> our Hive tables are well-chopped into ~50MB chunks. Here are the last
>>>>> few lines of hadoop fsck / -files -blocks:
>>>>>
>>>>> Status: HEALTHY
>>>>> Total size: 38086441332 B
>>>>> Total dirs: 232
>>>>> Total files: 802
>>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>>> Minimally replicated blocks: 796 (100.0 %)
>>>>> Over-replicated blocks: 0 (0.0 %)
>>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>>> Mis-replicated blocks: 0 (0.0 %)
>>>>> Default replication factor: 2
>>>>> Average block replication: 3.0439699
>>>>> Corrupt blocks: 0
>>>>> Missing replicas: 6 (0.24762692 %)
>>>>> Number of data-nodes: 9
>>>>> Number of racks: 1
>>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>>
>>>>>
>>>>> My question is: why are the slaves' disks getting full even though
>>>>> there are only a few files in DFS?
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>>
>>
>>
>>
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity
>> to which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>
>
>
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Saumitra,
It looks like the over replicated blocks root cause is not the issue that
the cluster is experiencing. I can only think of miss configuring the
dfs.data.dir parameter. Can you ensure that each one of the data
directories is using only one partition(mount) and there is no other data
directory sharing the same partition(mount)?
The role should be one data directory per partition(mount). Also, please
check inside the dfs.data.dir for a third party files/directories. Hope
this helps.
Thanks
-Rahman
On Tue, Apr 15, 2014 at 6:54 AM, Saumitra Shahapure <
saumitra.official@gmail.com> wrote:
> Hi Rahman,
>
> These are few lines from hadoop fsck / -blocks -files -locations
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1
> block(s): OK
> 0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010,
> ip2:50010, ip3:50010]
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1
> block(s): OK
> 0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010,
> ip2:50010, ip4:50010]
>
>
> Biswa may have guessed replication factor from fsck summary that I posted
> earlier. I am posting it again for today's run:
>
> Status: HEALTHY
> Total size: 58143055251 B
> Total dirs: 307
> Total files: 5093
> Total blocks (validated): 3903 (avg. block size 14897016 B)
> Minimally replicated blocks: 3903 (100.0 %)
>
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 92 (2.357161 %)
>
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 2
> Average block replication: 3.1401486
> Corrupt blocks: 0
> Missing replicas: 92 (0.75065273 %)
>
> Number of data-nodes: 9
> Number of racks: 1
> FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
>
>
> The filesystem under path '/' is HEALTHY
>
> I have not overridden dfs.datanode.du.reserved. It defaults to 0.
>
> $ less $HADOOP_HOME/conf/hdfs-site.xml |grep -A3 'dfs.datanode.du.reserved'
> $ less $HADOOP_HOME/src/hdfs/hdfs-default.xml |grep -A3
> 'dfs.datanode.du.reserved'
> <name>dfs.datanode.du.reserved</name>
> <value>0</value>
> <description>Reserved space in bytes per volume. Always leave this much
> space free for non dfs use.
> </description>
>
> Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
> and all hadoop/hive logs are dumped in /mnt/logs in various directories.
> All machines have 400GB for /mnt.
>
> $for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh
> /mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
>
>
> 225G /mnt/hadoop
> 224G /mnt/hadoop/dfs/data
> 61M /mnt/logs
>
> 281G /mnt/hadoop
> 281G /mnt/hadoop/dfs/data
> 63M /mnt/logs
>
> 139G /mnt/hadoop
> 139G /mnt/hadoop/dfs/data
> 68M /mnt/logs
>
> 135G /mnt/hadoop
> 134G /mnt/hadoop/dfs/data
> 92M /mnt/logs
>
> 165G /mnt/hadoop
> 164G /mnt/hadoop/dfs/data
> 75M /mnt/logs
>
> 137G /mnt/hadoop
> 137G /mnt/hadoop/dfs/data
> 95M /mnt/logs
>
> 160G /mnt/hadoop
> 160G /mnt/hadoop/dfs/data
> 74M /mnt/logs
>
> 180G /mnt/hadoop
> 122G /mnt/hadoop/dfs/data
> 23M /mnt/logs
>
> 139G /mnt/hadoop
> 138G /mnt/hadoop/dfs/data
> 76M /mnt/logs
>
>
>
> All these numbers are for today, and may differ bit from yesterday.
>
> Today hadoop dfs -dus is 58GB and namenode is reporting DFS Used as 1.46TB.
>
> Pardon me for making the mail dirty by lot of copy-pastes, hope it's still
> readable,
>
> -- Saumitra S. Shahapure
>
>
> On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
> ashettia@hortonworks.com> wrote:
>
>> Hi Biswa,
>>
>> Are you sure that the replication factor of the files are three? Please
>> run a 'hadoop fsck / -blocks -files -locations' and see the replication
>> factor for each file. Also, Post the configuration of <name>dfs.datanode.
>> du.reserved</name> and please check the real space presented by a
>> DataNode by running 'du -h'
>>
>> Thanks,
>> Rahman
>>
>> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> Biswanath, looks like we have confusion in calculation, 1TB would be
>> equal to 1024GB, not 114GB.
>>
>>
>> Sandeep, I checked log directory size as well. Log directories are hardly
>> in few GBs, I have configured log4j properties so that logs won't be too
>> large.
>>
>> In our slave machines, we have 450GB disk partition for hadoop logs and
>> DFS. Over there logs directory is < 10GBs and rest space is occupied by
>> DFS. 10GB partition is for /.
>>
>> Let me quote my confusion point once again:
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop
>>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>>> one reports it to be 35GB. What are the factors that can cause this
>>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>>
>>>
>>
>> I am talking about name node status page on 50070 port. Here is the
>> screenshot of my name node status page
>>
>> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>>
>> As I understand, 'DFS used' is the space taken by DFS, non-DFS used is
>> spaces taken by non-DFS data like logs or other local files from users.
>> Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be
>> ~38GB.
>>
>>
>>
>> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
>> biswajit.nayak@inmobi.com> wrote:
>>
>>> Whats the replication factor you have? I believe it should be 3. hadoop
>>> dus shows that disk usage without replication. While name node ui page
>>> gives with replication.
>>>
>>> 38gb * 3 =114gb ~ 1TB
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hi Biswajeet,
>>>>
>>>> Non-dfs usage is ~100GB over the cluster. But still the number are
>>>> nowhere near 1TB.
>>>>
>>>> Basically I wanted to point out discrepancy in name node status page
>>>> and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB
>>>> and later one reports it to be 35GB. What are the factors that can cause
>>>> this difference? And why is just 35GB data causing DFS to hit its limits?
>>>>
>>>>
>>>>
>>>>
>>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>>> wrote:
>>>>
>>>> Hi Saumitra,
>>>>
>>>> Could you please check the non-dfs usage. They also contribute to
>>>> filling up the disk space.
>>>>
>>>>
>>>>
>>>> ~Biswa
>>>> -----oThe important thing is not to stop questioning o-----
>>>>
>>>>
>>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>>>> We are using default HDFS block size.
>>>>>
>>>>> We have noticed that disks of slaves are almost full. From name node's
>>>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>>>> full and DFS Used% in cluster summary page is ~1TB.
>>>>>
>>>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>>>> 38GB number looks to be correct because we keep only few Hive tables and
>>>>> hadoop's /tmp (distributed cache and job outputs) in HDFS. All other data
>>>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>>>> that there is no internal fragmentation because the files in our Hive
>>>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>>>> hadoop fsck / -files -blocks
>>>>>
>>>>> Status: HEALTHY
>>>>> Total size: 38086441332 B
>>>>> Total dirs: 232
>>>>> Total files: 802
>>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>>> Minimally replicated blocks: 796 (100.0 %)
>>>>> Over-replicated blocks: 0 (0.0 %)
>>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>>> Mis-replicated blocks: 0 (0.0 %)
>>>>> Default replication factor: 2
>>>>> Average block replication: 3.0439699
>>>>> Corrupt blocks: 0
>>>>> Missing replicas: 6 (0.24762692 %)
>>>>> Number of data-nodes: 9
>>>>> Number of racks: 1
>>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>>
>>>>>
>>>>> My question is: why are the disks of the slaves getting full even though
>>>>> there are only a few files in DFS?
>>>>>
>>>>
>>>>
>>>> _____________________________________________________________
>>>> The information contained in this communication is intended solely for
>>>> the use of the individual or entity to whom it is addressed and others
>>>> authorized to receive it. It may contain confidential or legally privileged
>>>> information. If you are not the intended recipient you are hereby notified
>>>> that any disclosure, copying, distribution or taking any action in reliance
>>>> on the contents of this information is strictly prohibited and may be
>>>> unlawful. If you have received this communication in error, please notify
>>>> us immediately by responding to this email and then delete it from your
>>>> system. The firm is neither liable for the proper and complete transmission
>>>> of the information contained in this communication nor for any delay in its
>>>> receipt.
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>>
>>
>>
>>
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity
>> to which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>
>
>
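[Editor's note] The replication arithmetic debated above can be checked directly from the fsck summary. A minimal sketch, using only figures quoted verbatim in the thread (nothing is measured here):

```python
# Multiply the un-replicated usage reported by `hadoop dfs -dus /` by
# fsck's "Average block replication" to estimate the namenode's
# "DFS Used" figure. Both numbers are copied from the fsck output above.
dus_bytes = 38086441332        # "Total size" from fsck / dfs -dus
avg_replication = 3.0439699    # "Average block replication" from fsck
estimated_dfs_used_gib = dus_bytes * avg_replication / 1024**3
print(round(estimated_dfs_used_gib))  # ~108 GiB
```

Even at triple replication the expected on-disk footprint is on the order of 100GB, which supports the point that replication alone cannot explain a ~1TB "DFS Used" figure.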
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Saumitra,
It looks like over-replicated blocks are not the root cause of the issue the
cluster is experiencing. The only remaining suspect I can think of is a
misconfigured dfs.data.dir parameter. Can you ensure that each of the data
directories uses only one partition (mount) and that no other data
directory shares the same partition (mount)?
The rule should be one data directory per partition (mount). Also, please
check inside the dfs.data.dir for third-party files/directories. Hope
this helps.
Thanks
-Rahman
On Tue, Apr 15, 2014 at 6:54 AM, Saumitra Shahapure <
saumitra.official@gmail.com> wrote:
> Hi Rahman,
>
> These are few lines from hadoop fsck / -blocks -files -locations
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1
> block(s): OK
> 0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010,
> ip2:50010, ip3:50010]
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1
> block(s): OK
> 0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010,
> ip2:50010, ip4:50010]
>
>
> Biswa may have guessed replication factor from fsck summary that I posted
> earlier. I am posting it again for today's run:
>
> Status: HEALTHY
> Total size: 58143055251 B
> Total dirs: 307
> Total files: 5093
> Total blocks (validated): 3903 (avg. block size 14897016 B)
> Minimally replicated blocks: 3903 (100.0 %)
>
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 92 (2.357161 %)
>
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 2
> Average block replication: 3.1401486
> Corrupt blocks: 0
> Missing replicas: 92 (0.75065273 %)
>
> Number of data-nodes: 9
> Number of racks: 1
> FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
>
>
> The filesystem under path '/' is HEALTHY
>
> I have not overridden dfs.datanode.du.reserved. It defaults to 0.
>
> $ less $HADOOP_HOME/conf/hdfs-site.xml |grep -A3 'dfs.datanode.du.reserved'
> $ less $HADOOP_HOME/src/hdfs/hdfs-default.xml |grep -A3
> 'dfs.datanode.du.reserved'
> <name>dfs.datanode.du.reserved</name>
> <value>0</value>
> <description>Reserved space in bytes per volume. Always leave this much
> space free for non dfs use.
> </description>
>
> Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
> and all hadoop/hive logs are dumped in /mnt/logs in various directories.
> All machines have 400GB for /mnt.
>
> $for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh
> /mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
>
>
> 225G /mnt/hadoop
> 224G /mnt/hadoop/dfs/data
> 61M /mnt/logs
>
> 281G /mnt/hadoop
> 281G /mnt/hadoop/dfs/data
> 63M /mnt/logs
>
> 139G /mnt/hadoop
> 139G /mnt/hadoop/dfs/data
> 68M /mnt/logs
>
> 135G /mnt/hadoop
> 134G /mnt/hadoop/dfs/data
> 92M /mnt/logs
>
> 165G /mnt/hadoop
> 164G /mnt/hadoop/dfs/data
> 75M /mnt/logs
>
> 137G /mnt/hadoop
> 137G /mnt/hadoop/dfs/data
> 95M /mnt/logs
>
> 160G /mnt/hadoop
> 160G /mnt/hadoop/dfs/data
> 74M /mnt/logs
>
> 180G /mnt/hadoop
> 122G /mnt/hadoop/dfs/data
> 23M /mnt/logs
>
> 139G /mnt/hadoop
> 138G /mnt/hadoop/dfs/data
> 76M /mnt/logs
>
>
>
> All these numbers are for today, and may differ a bit from yesterday's.
>
> Today hadoop dfs -dus is 58GB and namenode is reporting DFS Used as 1.46TB.
>
> Pardon me for making the mail messy with a lot of copy-pastes; hope it's
> still readable.
>
> -- Saumitra S. Shahapure
>
>
> On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
> ashettia@hortonworks.com> wrote:
>
>> Hi Biswa,
>>
>> Are you sure that the replication factor of the files is three? Please
>> run a 'hadoop fsck / -blocks -files -locations' and see the replication
>> factor for each file. Also, Post the configuration of <name>dfs.datanode.
>> du.reserved</name> and please check the real space presented by a
>> DataNode by running 'du -h'
>>
>> Thanks,
>> Rahman
>>
>> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> Biswanath, looks like we have confusion in calculation, 1TB would be
>> equal to 1024GB, not 114GB.
>>
>>
>> Sandeep, I checked log directory size as well. Log directories are hardly
>> in few GBs, I have configured log4j properties so that logs won't be too
>> large.
>>
>> In our slave machines, we have 450GB disk partition for hadoop logs and
>> DFS. Over there logs directory is < 10GBs and rest space is occupied by
>> DFS. 10GB partition is for /.
>>
>> Let me quote my confusion point once again:
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop
>>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>>> one reports it to be 35GB. What are the factors that can cause this
>>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>>
>>>
>>
>> I am talking about name node status page on 50070 port. Here is the
>> screenshot of my name node status page
>>
>> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>>
>> As I understand, 'DFS used' is the space taken by DFS, non-DFS used is
>> spaces taken by non-DFS data like logs or other local files from users.
>> Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be
>> ~38GB.
>>
>>
>>
>> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
>> biswajit.nayak@inmobi.com> wrote:
>>
>>> What's the replication factor you have? I believe it should be 3. hadoop
>>> dus shows the disk usage without replication, while the name node UI page
>>> gives it with replication.
>>>
>>> 38gb * 3 =114gb ~ 1TB
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hi Biswajeet,
>>>>
>>>> Non-dfs usage is ~100GB over the cluster. But still the numbers are
>>>> nowhere near 1TB.
>>>>
>>>> Basically I wanted to point out a discrepancy between the name node status
>>>> page and hadoop dfs -dus. In my case, the former reports DFS usage as 1TB
>>>> and the latter reports it to be 35GB. What are the factors that can cause
>>>> this difference? And why is just 35GB of data causing DFS to hit its limits?
>>>>
>>>>
>>>>
>>>>
>>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>>> wrote:
>>>>
>>>> Hi Saumitra,
>>>>
>>>> Could you please check the non-dfs usage. They also contribute to
>>>> filling up the disk space.
>>>>
>>>>
>>>>
>>>> ~Biswa
>>>> -----oThe important thing is not to stop questioning o-----
>>>>
>>>>
>>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are running HDFS on a 9-node hadoop cluster; the hadoop version is
>>>>> 1.2.1. We are using the default HDFS block size.
>>>>>
>>>>> We have noticed that disks of slaves are almost full. From name node's
>>>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>>>> full and DFS Used% in cluster summary page is ~1TB.
>>>>>
>>>>> However hadoop dfs -dus / shows that the file system size is merely 38GB.
>>>>> The 38GB number looks correct because we keep only a few Hive tables and
>>>>> hadoop's /tmp (distributed cache and job outputs) in HDFS. All other data
>>>>> is cleaned up. I cross-checked this with hadoop dfs -ls. Also I think
>>>>> that there is no internal fragmentation because the files in our Hive
>>>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>>>> hadoop fsck / -files -blocks
>>>>>
>>>>> Status: HEALTHY
>>>>> Total size: 38086441332 B
>>>>> Total dirs: 232
>>>>> Total files: 802
>>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>>> Minimally replicated blocks: 796 (100.0 %)
>>>>> Over-replicated blocks: 0 (0.0 %)
>>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>>> Mis-replicated blocks: 0 (0.0 %)
>>>>> Default replication factor: 2
>>>>> Average block replication: 3.0439699
>>>>> Corrupt blocks: 0
>>>>> Missing replicas: 6 (0.24762692 %)
>>>>> Number of data-nodes: 9
>>>>> Number of racks: 1
>>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>>
>>>>>
>>>>> My question is: why are the disks of the slaves getting full even though
>>>>> there are only a few files in DFS?
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>>
>>
>>
>>
>
>
>
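[Editor's note] Shettia's suggestion above — one data directory per partition — can be sanity-checked on each datanode with a short shell sketch. DATA_DIRS below is a placeholder for illustration; substitute the actual dfs.data.dir entries from hdfs-site.xml (e.g. /mnt/hadoop/dfs/data):

```shell
# Print the mount point backing each configured data directory, then
# flag any mount that backs more than one directory. A duplicate mount
# means two data dirs share a partition, which Shettia warns against
# because the datanode reports each configured volume separately.
# DATA_DIRS is an assumed example list -- replace with real entries.
DATA_DIRS="/tmp /var/tmp"
for d in $DATA_DIRS; do
  df -P "$d" | awk 'NR==2 {print $6}'   # column 6 is the mount point
done | sort | uniq -d
```

If the pipeline prints a mount point, at least two data directories share that partition and should be split onto separate mounts.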
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Saumitra,
It looks like the over replicated blocks root cause is not the issue that
the cluster is experiencing. I can only think of miss configuring the
dfs.data.dir parameter. Can you ensure that each one of the data
directories is using only one partition(mount) and there is no other data
directory sharing the same partition(mount)?
The role should be one data directory per partition(mount). Also, please
check inside the dfs.data.dir for a third party files/directories. Hope
this helps.
Thanks
-Rahman
On Tue, Apr 15, 2014 at 6:54 AM, Saumitra Shahapure <
saumitra.official@gmail.com> wrote:
> Hi Rahman,
>
> These are few lines from hadoop fsck / -blocks -files -locations
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1
> block(s): OK
> 0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010,
> ip2:50010, ip3:50010]
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1
> block(s): OK
> 0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010,
> ip2:50010, ip4:50010]
>
>
> Biswa may have guessed replication factor from fsck summary that I posted
> earlier. I am posting it again for today's run:
>
> Status: HEALTHY
> Total size: 58143055251 B
> Total dirs: 307
> Total files: 5093
> Total blocks (validated): 3903 (avg. block size 14897016 B)
> Minimally replicated blocks: 3903 (100.0 %)
>
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 92 (2.357161 %)
>
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 2
> Average block replication: 3.1401486
> Corrupt blocks: 0
> Missing replicas: 92 (0.75065273 %)
>
> Number of data-nodes: 9
> Number of racks: 1
> FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
>
>
> The filesystem under path '/' is HEALTHY
>
> I have not overridden dfs.datanode.du.reserved. It defaults to 0.
>
> $ less $HADOOP_HOME/conf/hdfs-site.xml |grep -A3 'dfs.datanode.du.reserved'
> $ less $HADOOP_HOME/src/hdfs/hdfs-default.xml |grep -A3
> 'dfs.datanode.du.reserved'
> <name>dfs.datanode.du.reserved</name>
> <value>0</value>
> <description>Reserved space in bytes per volume. Always leave this much
> space free for non dfs use.
> </description>
>
> Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
> and all hadoop/hive logs are dumped in /mnt/logs in various directories.
> All machines have 400GB for /mnt.
>
> $for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh
> /mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
>
>
> 225G /mnt/hadoop
> 224G /mnt/hadoop/dfs/data
> 61M /mnt/logs
>
> 281G /mnt/hadoop
> 281G /mnt/hadoop/dfs/data
> 63M /mnt/logs
>
> 139G /mnt/hadoop
> 139G /mnt/hadoop/dfs/data
> 68M /mnt/logs
>
> 135G /mnt/hadoop
> 134G /mnt/hadoop/dfs/data
> 92M /mnt/logs
>
> 165G /mnt/hadoop
> 164G /mnt/hadoop/dfs/data
> 75M /mnt/logs
>
> 137G /mnt/hadoop
> 137G /mnt/hadoop/dfs/data
> 95M /mnt/logs
>
> 160G /mnt/hadoop
> 160G /mnt/hadoop/dfs/data
> 74M /mnt/logs
>
> 180G /mnt/hadoop
> 122G /mnt/hadoop/dfs/data
> 23M /mnt/logs
>
> 139G /mnt/hadoop
> 138G /mnt/hadoop/dfs/data
> 76M /mnt/logs
>
>
>
> All these numbers are for today, and may differ bit from yesterday.
>
> Today hadoop dfs -dus is 58GB and namenode is reporting DFS Used as 1.46TB.
>
> Pardon me for making the mail dirty by lot of copy-pastes, hope it's still
> readable,
>
> -- Saumitra S. Shahapure
>
>
> On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
> ashettia@hortonworks.com> wrote:
>
>> Hi Biswa,
>>
>> Are you sure that the replication factor of the files are three? Please
>> run a 'hadoop fsck / -blocks -files -locations' and see the replication
>> factor for each file. Also, Post the configuration of <name>dfs.datanode.
>> du.reserved</name> and please check the real space presented by a
>> DataNode by running 'du -h'
>>
>> Thanks,
>> Rahman
>>
>> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> Biswanath, looks like we have confusion in calculation, 1TB would be
>> equal to 1024GB, not 114GB.
>>
>>
>> Sandeep, I checked log directory size as well. Log directories are hardly
>> in few GBs, I have configured log4j properties so that logs won't be too
>> large.
>>
>> In our slave machines, we have 450GB disk partition for hadoop logs and
>> DFS. Over there logs directory is < 10GBs and rest space is occupied by
>> DFS. 10GB partition is for /.
>>
>> Let me quote my confusion point once again:
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop
>>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>>> one reports it to be 35GB. What are the factors that can cause this
>>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>>
>>>
>>
>> I am talking about name node status page on 50070 port. Here is the
>> screenshot of my name node status page
>>
>> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>>
>> As I understand, 'DFS used' is the space taken by DFS, non-DFS used is
>> spaces taken by non-DFS data like logs or other local files from users.
>> Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be
>> ~38GB.
>>
>>
>>
>> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
>> biswajit.nayak@inmobi.com> wrote:
>>
>>> Whats the replication factor you have? I believe it should be 3. hadoop
>>> dus shows that disk usage without replication. While name node ui page
>>> gives with replication.
>>>
>>> 38gb * 3 =114gb ~ 1TB
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hi Biswajeet,
>>>>
>>>> Non-dfs usage is ~100GB over the cluster. But still the number are
>>>> nowhere near 1TB.
>>>>
>>>> Basically I wanted to point out discrepancy in name node status page
>>>> and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB
>>>> and later one reports it to be 35GB. What are the factors that can cause
>>>> this difference? And why is just 35GB data causing DFS to hit its limits?
>>>>
>>>>
>>>>
>>>>
>>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>>> wrote:
>>>>
>>>> Hi Saumitra,
>>>>
>>>> Could you please check the non-dfs usage. They also contribute to
>>>> filling up the disk space.
>>>>
>>>>
>>>>
>>>> ~Biswa
>>>> -----oThe important thing is not to stop questioning o-----
>>>>
>>>>
>>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>>>> We are using default HDFS block size.
>>>>>
>>>>> We have noticed that disks of slaves are almost full. From name node's
>>>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>>>> full and DFS Used% in cluster summary page is ~1TB.
>>>>>
>>>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>>>> 38GB number looks to be correct because we keep only few Hive tables and
>>>>> hadoop's /tmp (distributed cache and job outputs) in HDFS. All other data
>>>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>>>> that there is no internal fragmentation because the files in our Hive
>>>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>>>> hadoop fsck / -files -blocks
>>>>>
>>>>> Status: HEALTHY
>>>>> Total size: 38086441332 B
>>>>> Total dirs: 232
>>>>> Total files: 802
>>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>>> Minimally replicated blocks: 796 (100.0 %)
>>>>> Over-replicated blocks: 0 (0.0 %)
>>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>>> Mis-replicated blocks: 0 (0.0 %)
>>>>> Default replication factor: 2
>>>>> Average block replication: 3.0439699
>>>>> Corrupt blocks: 0
>>>>> Missing replicas: 6 (0.24762692 %)
>>>>> Number of data-nodes: 9
>>>>> Number of racks: 1
>>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>>
>>>>>
>>>>> My question is that why disks of slaves are getting full even though
>>>>> there are only few files in DFS?
>>>>>
>>>>
>>>>
>>>> _____________________________________________________________
>>>> The information contained in this communication is intended solely for
>>>> the use of the individual or entity to whom it is addressed and others
>>>> authorized to receive it. It may contain confidential or legally privileged
>>>> information. If you are not the intended recipient you are hereby notified
>>>> that any disclosure, copying, distribution or taking any action in reliance
>>>> on the contents of this information is strictly prohibited and may be
>>>> unlawful. If you have received this communication in error, please notify
>>>> us immediately by responding to this email and then delete it from your
>>>> system. The firm is neither liable for the proper and complete transmission
>>>> of the information contained in this communication nor for any delay in its
>>>> receipt.
>>>>
>>>>
>>>>
>>>
>>> _____________________________________________________________
>>> The information contained in this communication is intended solely for
>>> the use of the individual or entity to whom it is addressed and others
>>> authorized to receive it. It may contain confidential or legally privileged
>>> information. If you are not the intended recipient you are hereby notified
>>> that any disclosure, copying, distribution or taking any action in reliance
>>> on the contents of this information is strictly prohibited and may be
>>> unlawful. If you have received this communication in error, please notify
>>> us immediately by responding to this email and then delete it from your
>>> system. The firm is neither liable for the proper and complete transmission
>>> of the information contained in this communication nor for any delay in its
>>> receipt.
>>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>>
>>
>>
>>
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity
>> to which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>
>
>
--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Saumitra,
It looks like the over replicated blocks root cause is not the issue that
the cluster is experiencing. I can only think of miss configuring the
dfs.data.dir parameter. Can you ensure that each one of the data
directories is using only one partition(mount) and there is no other data
directory sharing the same partition(mount)?
The role should be one data directory per partition(mount). Also, please
check inside the dfs.data.dir for a third party files/directories. Hope
this helps.
Thanks
-Rahman
On Tue, Apr 15, 2014 at 6:54 AM, Saumitra Shahapure <
saumitra.official@gmail.com> wrote:
> Hi Rahman,
>
> These are few lines from hadoop fsck / -blocks -files -locations
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1
> block(s): OK
> 0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010,
> ip2:50010, ip3:50010]
>
> /mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1
> block(s): OK
> 0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010,
> ip2:50010, ip4:50010]
>
>
> Biswa may have guessed replication factor from fsck summary that I posted
> earlier. I am posting it again for today's run:
>
> Status: HEALTHY
> Total size: 58143055251 B
> Total dirs: 307
> Total files: 5093
> Total blocks (validated): 3903 (avg. block size 14897016 B)
> Minimally replicated blocks: 3903 (100.0 %)
>
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 92 (2.357161 %)
>
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 2
> Average block replication: 3.1401486
> Corrupt blocks: 0
> Missing replicas: 92 (0.75065273 %)
>
> Number of data-nodes: 9
> Number of racks: 1
> FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
>
>
> The filesystem under path '/' is HEALTHY
>
> I have not overridden dfs.datanode.du.reserved. It defaults to 0.
>
> $ less $HADOOP_HOME/conf/hdfs-site.xml |grep -A3 'dfs.datanode.du.reserved'
> $ less $HADOOP_HOME/src/hdfs/hdfs-default.xml |grep -A3
> 'dfs.datanode.du.reserved'
> <name>dfs.datanode.du.reserved</name>
> <value>0</value>
> <description>Reserved space in bytes per volume. Always leave this much
> space free for non dfs use.
> </description>
>
> Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
> and all hadoop/hive logs are dumped in /mnt/logs in various directories.
> All machines have 400GB for /mnt.
>
> $for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh
> /mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
>
>
> 225G /mnt/hadoop
> 224G /mnt/hadoop/dfs/data
> 61M /mnt/logs
>
> 281G /mnt/hadoop
> 281G /mnt/hadoop/dfs/data
> 63M /mnt/logs
>
> 139G /mnt/hadoop
> 139G /mnt/hadoop/dfs/data
> 68M /mnt/logs
>
> 135G /mnt/hadoop
> 134G /mnt/hadoop/dfs/data
> 92M /mnt/logs
>
> 165G /mnt/hadoop
> 164G /mnt/hadoop/dfs/data
> 75M /mnt/logs
>
> 137G /mnt/hadoop
> 137G /mnt/hadoop/dfs/data
> 95M /mnt/logs
>
> 160G /mnt/hadoop
> 160G /mnt/hadoop/dfs/data
> 74M /mnt/logs
>
> 180G /mnt/hadoop
> 122G /mnt/hadoop/dfs/data
> 23M /mnt/logs
>
> 139G /mnt/hadoop
> 138G /mnt/hadoop/dfs/data
> 76M /mnt/logs
>
>
>
> All these numbers are for today, and may differ bit from yesterday.
>
> Today hadoop dfs -dus is 58GB and namenode is reporting DFS Used as 1.46TB.
>
> Pardon me for making the mail dirty by lot of copy-pastes, hope it's still
> readable,
>
> -- Saumitra S. Shahapure
>
>
> On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
> ashettia@hortonworks.com> wrote:
>
>> Hi Biswa,
>>
>> Are you sure that the replication factor of the files are three? Please
>> run a 'hadoop fsck / -blocks -files -locations' and see the replication
>> factor for each file. Also, Post the configuration of <name>dfs.datanode.
>> du.reserved</name> and please check the real space presented by a
>> DataNode by running 'du -h'
>>
>> Thanks,
>> Rahman
>>
>> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> Biswanath, looks like we have confusion in calculation, 1TB would be
>> equal to 1024GB, not 114GB.
>>
>>
>> Sandeep, I checked log directory size as well. Log directories are hardly
>> in few GBs, I have configured log4j properties so that logs won't be too
>> large.
>>
>> In our slave machines, we have 450GB disk partition for hadoop logs and
>> DFS. Over there logs directory is < 10GBs and rest space is occupied by
>> DFS. 10GB partition is for /.
>>
>> Let me quote my confusion point once again:
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop
>>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>>> one reports it to be 35GB. What are the factors that can cause this
>>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>>
>>>
>>
>> I am talking about name node status page on 50070 port. Here is the
>> screenshot of my name node status page
>>
>> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>>
>> As I understand, 'DFS used' is the space taken by DFS, non-DFS used is
>> spaces taken by non-DFS data like logs or other local files from users.
>> Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be
>> ~38GB.
>>
>>
>>
>> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
>> biswajit.nayak@inmobi.com> wrote:
>>
>>> Whats the replication factor you have? I believe it should be 3. hadoop
>>> dus shows that disk usage without replication. While name node ui page
>>> gives with replication.
>>>
>>> 38gb * 3 =114gb ~ 1TB
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hi Biswajeet,
>>>>
>>>> Non-DFS usage is ~100GB over the cluster, but the numbers are still
>>>> nowhere near 1TB.
>>>>
>>>> Basically I wanted to point out the discrepancy between the name node
>>>> status page and hadoop dfs -dus. In my case, the former reports DFS usage
>>>> as 1TB and the latter reports it as 35GB. What are the factors that can
>>>> cause this difference? And why is just 35GB of data causing DFS to hit its limits?
>>>>
>>>>
>>>>
>>>>
>>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>>> wrote:
>>>>
>>>> Hi Saumitra,
>>>>
>>>> Could you please check the non-dfs usage. They also contribute to
>>>> filling up the disk space.
>>>>
>>>>
>>>>
>>>> ~Biswa
>>>> -----oThe important thing is not to stop questioning o-----
>>>>
>>>>
>>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>>>> We are using default HDFS block size.
>>>>>
>>>>> We have noticed that disks of slaves are almost full. From name node's
>>>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>>>> full and DFS Used% in cluster summary page is ~1TB.
>>>>>
>>>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>>>> 38GB number looks to be correct because we keep only few Hive tables and
>>>>> hadoop's /tmp (distributed cache and job outputs) in HDFS. All other data
>>>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>>>> that there is no internal fragmentation because the files in our Hive
>>>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>>>> hadoop fsck / -files -blocks
>>>>>
>>>>> Status: HEALTHY
>>>>> Total size: 38086441332 B
>>>>> Total dirs: 232
>>>>> Total files: 802
>>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>>> Minimally replicated blocks: 796 (100.0 %)
>>>>> Over-replicated blocks: 0 (0.0 %)
>>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>>> Mis-replicated blocks: 0 (0.0 %)
>>>>> Default replication factor: 2
>>>>> Average block replication: 3.0439699
>>>>> Corrupt blocks: 0
>>>>> Missing replicas: 6 (0.24762692 %)
>>>>> Number of data-nodes: 9
>>>>> Number of racks: 1
>>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>>
>>>>>
>>>>> My question is: why are the disks of the slaves getting full even though
>>>>> there are only a few files in DFS?
>>>>>
>>>>
>>>>
>>>> _____________________________________________________________
>>>> The information contained in this communication is intended solely for
>>>> the use of the individual or entity to whom it is addressed and others
>>>> authorized to receive it. It may contain confidential or legally privileged
>>>> information. If you are not the intended recipient you are hereby notified
>>>> that any disclosure, copying, distribution or taking any action in reliance
>>>> on the contents of this information is strictly prohibited and may be
>>>> unlawful. If you have received this communication in error, please notify
>>>> us immediately by responding to this email and then delete it from your
>>>> system. The firm is neither liable for the proper and complete transmission
>>>> of the information contained in this communication nor for any delay in its
>>>> receipt.
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>>
>>
>>
>>
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity
>> to which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>
>
>
Re: HDFS file system size issue
Posted by Saumitra Shahapure <sa...@gmail.com>.
Hi Rahman,
These are a few lines from hadoop fsck / -blocks -files -locations:
/mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1 block(s): OK
0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010, ip2:50010, ip3:50010]
/mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1 block(s): OK
0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010, ip2:50010, ip4:50010]
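Rather than eyeballing individual lines, the repl= field can be counted across the whole fsck output. A small sketch (the two printf lines below are sample input standing in for the real fsck output, which you would pipe in instead):

```shell
# Histogram blocks by replication factor, as reported by
# 'hadoop fsck / -blocks -files -locations'.
histogram_repl() {
  grep -o 'repl=[0-9]*' | sort | uniq -c
}

printf '%s\n' \
  '0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010, ip2:50010, ip3:50010]' \
  '0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010, ip2:50010, ip4:50010]' \
  | histogram_repl
```

In practice you would run `hadoop fsck / -blocks -files -locations | grep -o 'repl=[0-9]*' | sort | uniq -c` to see how many blocks sit at each replication factor.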
Biswa may have guessed the replication factor from the fsck summary that I
posted earlier. Here it is again from today's run:
Status: HEALTHY
Total size: 58143055251 B
Total dirs: 307
Total files: 5093
Total blocks (validated): 3903 (avg. block size 14897016 B)
Minimally replicated blocks: 3903 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 92 (2.357161 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 3.1401486
Corrupt blocks: 0
Missing replicas: 92 (0.75065273 %)
Number of data-nodes: 9
Number of racks: 1
FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
The filesystem under path '/' is HEALTHY
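As a rough cross-check using the totals from the fsck summary above: even multiplying the logical size by the average block replication gives a raw footprint far below what the namenode reports.

```shell
# Raw footprint = logical size x average block replication,
# using the fsck totals above.
total_bytes=58143055251    # "Total size" from fsck
avg_repl=3.1401486         # "Average block replication" from fsck
awk -v b="$total_bytes" -v r="$avg_repl" \
    'BEGIN { printf "%.1f GB raw\n", b * r / 1e9 }'
# ~183 GB -- nowhere near the ~1.46 TB the namenode shows as DFS Used
```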
I have not overridden dfs.datanode.du.reserved. It defaults to 0.
$ grep -A3 'dfs.datanode.du.reserved' $HADOOP_HOME/conf/hdfs-site.xml
$ grep -A3 'dfs.datanode.du.reserved' $HADOOP_HOME/src/hdfs/hdfs-default.xml
<name>dfs.datanode.du.reserved</name>
<value>0</value>
<description>Reserved space in bytes per volume. Always leave this much
space free for non dfs use.
</description>
Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
and all hadoop/hive logs are dumped in /mnt/logs in various directories.
All machines have 400GB for /mnt.
$ for i in $dfs_slaves; do
    ssh $i 'du -sh /mnt/hadoop; du -sh /mnt/hadoop/dfs/data; du -sh /mnt/logs'; done
225G /mnt/hadoop
224G /mnt/hadoop/dfs/data
61M /mnt/logs
281G /mnt/hadoop
281G /mnt/hadoop/dfs/data
63M /mnt/logs
139G /mnt/hadoop
139G /mnt/hadoop/dfs/data
68M /mnt/logs
135G /mnt/hadoop
134G /mnt/hadoop/dfs/data
92M /mnt/logs
165G /mnt/hadoop
164G /mnt/hadoop/dfs/data
75M /mnt/logs
137G /mnt/hadoop
137G /mnt/hadoop/dfs/data
95M /mnt/logs
160G /mnt/hadoop
160G /mnt/hadoop/dfs/data
74M /mnt/logs
180G /mnt/hadoop
122G /mnt/hadoop/dfs/data
23M /mnt/logs
139G /mnt/hadoop
138G /mnt/hadoop/dfs/data
76M /mnt/logs
All these numbers are from today and may differ a bit from yesterday's.
Today hadoop dfs -dus shows 58GB while the namenode reports DFS Used as 1.46TB.
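For what it's worth, the per-node dfs.data.dir sizes above do add up to roughly what the namenode reports, so the space really is inside the DFS data directories. A quick sketch, with the sizes copied from the du output above:

```shell
# Sum the dfs.data.dir sizes (in GB) reported by du -sh on each node.
sizes="224 281 139 134 164 137 160 122 138"
total=0
for s in $sizes; do
  total=$((total + s))
done
echo "${total} GB"   # 1499 GB, i.e. ~1.46 TB -- matching the namenode's DFS Used
```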
Pardon me for cluttering the mail with so many copy-pastes; I hope it's still
readable,
-- Saumitra S. Shahapure
On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
ashettia@hortonworks.com> wrote:
> Hi Biswa,
>
> Are you sure that the replication factor of the files is three? Please
> run 'hadoop fsck / -blocks -files -locations' and check the replication
> factor for each file. Also, post the configuration of
> <name>dfs.datanode.du.reserved</name> and please check the real space
> reported by a DataNode by running 'du -h'.
>
> Thanks,
> Rahman
>
> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
>
> Hello,
>
> Biswajit, looks like we have a confusion in the calculation: 1TB equals
> 1024GB, not 114GB.
>
>
> Sandeep, I checked log directory size as well. Log directories are hardly
> in few GBs, I have configured log4j properties so that logs won’t be too
> large.
>
> In our slave machines, we have 450GB disk partition for hadoop logs and
> DFS. Over there logs directory is < 10GBs and rest space is occupied by
> DFS. 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
> Basically I wanted to point out the discrepancy between the name node status page and hadoop
>>> dfs -dus. In my case, the former reports DFS usage as 1TB and the latter
>>> reports it as 35GB. What are the factors that can cause this difference?
>>> And why is just 35GB of data causing DFS to hit its limits?
>>>
>>
>
> I am talking about name node status page on 50070 port. Here is the
> screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used' is the space taken by DFS, and non-DFS used is
> the space taken by non-DFS data such as logs or other local files from users.
> The namenode shows DFS used as ~1TB, but hadoop dfs -dus shows it as ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
> biswajit.nayak@inmobi.com> wrote:
>
>> Whats the replication factor you have? I believe it should be 3. hadoop
>> dus shows that disk usage without replication. While name node ui page
>> gives with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>
>>> Hi Biswajeet,
>>>
>>> Non-DFS usage is ~100GB over the cluster, but the numbers are still
>>> nowhere near 1TB.
>>>
>>> Basically I wanted to point out the discrepancy between the name node status page and hadoop
>>> dfs -dus. In my case, the former reports DFS usage as 1TB and the latter
>>> reports it as 35GB. What are the factors that can cause this difference?
>>> And why is just 35GB of data causing DFS to hit its limits?
>>>
>>>
>>>
>>>
>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>> wrote:
>>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to
>>> filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hello,
>>>>
>>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>>> We are using default HDFS block size.
>>>>
>>>> We have noticed that disks of slaves are almost full. From name node’s
>>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>>> full and DFS Used% in cluster summary page is ~1TB.
>>>>
>>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>>> 38GB number looks to be correct because we keep only few Hive tables and
>>>> hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data
>>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>>> that there is no internal fragmentation because the files in our Hive
>>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>>> hadoop fsck / -files -blocks
>>>>
>>>> Status: HEALTHY
>>>> Total size: 38086441332 B
>>>> Total dirs: 232
>>>> Total files: 802
>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>> Minimally replicated blocks: 796 (100.0 %)
>>>> Over-replicated blocks: 0 (0.0 %)
>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>> Mis-replicated blocks: 0 (0.0 %)
>>>> Default replication factor: 2
>>>> Average block replication: 3.0439699
>>>> Corrupt blocks: 0
>>>> Missing replicas: 6 (0.24762692 %)
>>>> Number of data-nodes: 9
>>>> Number of racks: 1
>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>
>>>>
>>>> My question is: why are the disks of the slaves getting full even though
>>>> there are only a few files in DFS?
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>
> --
> --Regards
> Sandeep Nemuri
>
>
>
>
Re: HDFS file system size issue
Posted by Saumitra Shahapure <sa...@gmail.com>.
Hi Rahman,
These are few lines from hadoop fsck / -blocks -files -locations
/mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1
block(s): OK
0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010,
ip2:50010, ip3:50010]
/mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1
block(s): OK
0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010,
ip2:50010, ip4:50010]
Biswa may have guessed replication factor from fsck summary that I posted
earlier. I am posting it again for today's run:
Status: HEALTHY
Total size: 58143055251 B
Total dirs: 307
Total files: 5093
Total blocks (validated): 3903 (avg. block size 14897016 B)
Minimally replicated blocks: 3903 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 92 (2.357161 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 3.1401486
Corrupt blocks: 0
Missing replicas: 92 (0.75065273 %)
Number of data-nodes: 9
Number of racks: 1
FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
The filesystem under path '/' is HEALTHY
I have not overridden dfs.datanode.du.reserved. It defaults to 0.
$ less $HADOOP_HOME/conf/hdfs-site.xml |grep -A3 'dfs.datanode.du.reserved'
$ less $HADOOP_HOME/src/hdfs/hdfs-default.xml |grep -A3
'dfs.datanode.du.reserved'
<name>dfs.datanode.du.reserved</name>
<value>0</value>
<description>Reserved space in bytes per volume. Always leave this much
space free for non dfs use.
</description>
Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
and all hadoop/hive logs are dumped in /mnt/logs in various directories.
All machines have 400GB for /mnt.
$for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh
/mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
225G /mnt/hadoop
224G /mnt/hadoop/dfs/data
61M /mnt/logs
281G /mnt/hadoop
281G /mnt/hadoop/dfs/data
63M /mnt/logs
139G /mnt/hadoop
139G /mnt/hadoop/dfs/data
68M /mnt/logs
135G /mnt/hadoop
134G /mnt/hadoop/dfs/data
92M /mnt/logs
165G /mnt/hadoop
164G /mnt/hadoop/dfs/data
75M /mnt/logs
137G /mnt/hadoop
137G /mnt/hadoop/dfs/data
95M /mnt/logs
160G /mnt/hadoop
160G /mnt/hadoop/dfs/data
74M /mnt/logs
180G /mnt/hadoop
122G /mnt/hadoop/dfs/data
23M /mnt/logs
139G /mnt/hadoop
138G /mnt/hadoop/dfs/data
76M /mnt/logs
All these numbers are for today, and may differ bit from yesterday.
Today hadoop dfs -dus is 58GB and namenode is reporting DFS Used as 1.46TB.
Pardon me for making the mail dirty by lot of copy-pastes, hope it's still
readable,
-- Saumitra S. Shahapure
On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
ashettia@hortonworks.com> wrote:
> Hi Biswa,
>
> Are you sure that the replication factor of the files are three? Please
> run a ‘hadoop fsck / -blocks -files -locations’ and see the replication
> factor for each file. Also, Post the configuration of <name>dfs.datanode.
> du.reserved</name> and please check the real space presented by a
> DataNode by running ‘du -h’
>
> Thanks,
> Rahman
>
> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
>
> Hello,
>
> Biswanath, looks like we have confusion in calculation, 1TB would be equal
> to 1024GB, not 114GB.
>
>
> Sandeep, I checked log directory size as well. Log directories are hardly
> in few GBs, I have configured log4j properties so that logs won’t be too
> large.
>
> In our slave machines, we have 450GB disk partition for hadoop logs and
> DFS. Over there logs directory is < 10GBs and rest space is occupied by
> DFS. 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
> Basically I wanted to point out discrepancy in name node status page and hadoop
>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>> one reports it to be 35GB. What are the factors that can cause this
>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>
>>
>
> I am talking about name node status page on 50070 port. Here is the
> screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is
> spaces taken by non-DFS data like logs or other local files from users.
> Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be
> ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
> biswajit.nayak@inmobi.com> wrote:
>
>> Whats the replication factor you have? I believe it should be 3. hadoop
>> dus shows that disk usage without replication. While name node ui page
>> gives with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>
>>> Hi Biswajeet,
>>>
>>> Non-dfs usage is ~100GB over the cluster. But still the number are
>>> nowhere near 1TB.
>>>
>>> Basically I wanted to point out discrepancy in name node status page and hadoop
>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>> one reports it to be 35GB. What are the factors that can cause this
>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>
>>>
>>>
>>>
>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>> wrote:
>>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to
>>> filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hello,
>>>>
>>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>>> We are using default HDFS block size.
>>>>
>>>> We have noticed that disks of slaves are almost full. From name node’s
>>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>>> full and DFS Used% in cluster summary page is ~1TB.
>>>>
>>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>>> 38GB number looks to be correct because we keep only few Hive tables and
>>>> hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data
>>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>>> that there is no internal fragmentation because the files in our Hive
>>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>>> hadoop fsck / -files -blocks
>>>>
>>>> Status: HEALTHY
>>>> Total size: 38086441332 B
>>>> Total dirs: 232
>>>> Total files: 802
>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>> Minimally replicated blocks: 796 (100.0 %)
>>>> Over-replicated blocks: 0 (0.0 %)
>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>> Mis-replicated blocks: 0 (0.0 %)
>>>> Default replication factor: 2
>>>> Average block replication: 3.0439699
>>>> Corrupt blocks: 0
>>>> Missing replicas: 6 (0.24762692 %)
>>>> Number of data-nodes: 9
>>>> Number of racks: 1
>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>
>>>>
>>>> My question is that why disks of slaves are getting full even though
>>>> there are only few files in DFS?
>>>>
>>>
>>>
>>> _____________________________________________________________
>>> The information contained in this communication is intended solely for
>>> the use of the individual or entity to whom it is addressed and others
>>> authorized to receive it. It may contain confidential or legally privileged
>>> information. If you are not the intended recipient you are hereby notified
>>> that any disclosure, copying, distribution or taking any action in reliance
>>> on the contents of this information is strictly prohibited and may be
>>> unlawful. If you have received this communication in error, please notify
>>> us immediately by responding to this email and then delete it from your
>>> system. The firm is neither liable for the proper and complete transmission
>>> of the information contained in this communication nor for any delay in its
>>> receipt.
>>>
>>>
>>>
>>
>> _____________________________________________________________
>> The information contained in this communication is intended solely for
>> the use of the individual or entity to whom it is addressed and others
>> authorized to receive it. It may contain confidential or legally privileged
>> information. If you are not the intended recipient you are hereby notified
>> that any disclosure, copying, distribution or taking any action in reliance
>> on the contents of this information is strictly prohibited and may be
>> unlawful. If you have received this communication in error, please notify
>> us immediately by responding to this email and then delete it from your
>> system. The firm is neither liable for the proper and complete transmission
>> of the information contained in this communication nor for any delay in its
>> receipt.
>>
>
>
>
> --
> --Regards
> Sandeep Nemuri
>
>
>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
Re: HDFS file system size issue
Posted by Saumitra Shahapure <sa...@gmail.com>.
Hi Rahman,
These are few lines from hadoop fsck / -blocks -files -locations
/mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1
block(s): OK
0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010,
ip2:50010, ip3:50010]
/mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1
block(s): OK
0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010,
ip2:50010, ip4:50010]
Biswa may have guessed replication factor from fsck summary that I posted
earlier. I am posting it again for today's run:
Status: HEALTHY
Total size: 58143055251 B
Total dirs: 307
Total files: 5093
Total blocks (validated): 3903 (avg. block size 14897016 B)
Minimally replicated blocks: 3903 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 92 (2.357161 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 3.1401486
Corrupt blocks: 0
Missing replicas: 92 (0.75065273 %)
Number of data-nodes: 9
Number of racks: 1
FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
The filesystem under path '/' is HEALTHY
I have not overridden dfs.datanode.du.reserved. It defaults to 0.
$ less $HADOOP_HOME/conf/hdfs-site.xml |grep -A3 'dfs.datanode.du.reserved'
$ less $HADOOP_HOME/src/hdfs/hdfs-default.xml |grep -A3
'dfs.datanode.du.reserved'
<name>dfs.datanode.du.reserved</name>
<value>0</value>
<description>Reserved space in bytes per volume. Always leave this much
space free for non dfs use.
</description>
Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
and all hadoop/hive logs are dumped in /mnt/logs in various directories.
All machines have 400GB for /mnt.
$for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh
/mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
225G /mnt/hadoop
224G /mnt/hadoop/dfs/data
61M /mnt/logs
281G /mnt/hadoop
281G /mnt/hadoop/dfs/data
63M /mnt/logs
139G /mnt/hadoop
139G /mnt/hadoop/dfs/data
68M /mnt/logs
135G /mnt/hadoop
134G /mnt/hadoop/dfs/data
92M /mnt/logs
165G /mnt/hadoop
164G /mnt/hadoop/dfs/data
75M /mnt/logs
137G /mnt/hadoop
137G /mnt/hadoop/dfs/data
95M /mnt/logs
160G /mnt/hadoop
160G /mnt/hadoop/dfs/data
74M /mnt/logs
180G /mnt/hadoop
122G /mnt/hadoop/dfs/data
23M /mnt/logs
139G /mnt/hadoop
138G /mnt/hadoop/dfs/data
76M /mnt/logs
All these numbers are for today, and may differ bit from yesterday.
Today hadoop dfs -dus is 58GB and namenode is reporting DFS Used as 1.46TB.
Pardon me for making the mail dirty by lot of copy-pastes, hope it's still
readable,
-- Saumitra S. Shahapure
On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
ashettia@hortonworks.com> wrote:
> Hi Biswa,
>
> Are you sure that the replication factor of the files are three? Please
> run a ‘hadoop fsck / -blocks -files -locations’ and see the replication
> factor for each file. Also, Post the configuration of <name>dfs.datanode.
> du.reserved</name> and please check the real space presented by a
> DataNode by running ‘du -h’
>
> Thanks,
> Rahman
>
> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
>
> Hello,
>
> Biswanath, looks like we have confusion in calculation, 1TB would be equal
> to 1024GB, not 114GB.
>
>
> Sandeep, I checked log directory size as well. Log directories are hardly
> in few GBs, I have configured log4j properties so that logs won’t be too
> large.
>
> In our slave machines, we have 450GB disk partition for hadoop logs and
> DFS. Over there logs directory is < 10GBs and rest space is occupied by
> DFS. 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
> Basically I wanted to point out discrepancy in name node status page and hadoop
>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>> one reports it to be 35GB. What are the factors that can cause this
>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>
>>
>
> I am talking about name node status page on 50070 port. Here is the
> screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is
> spaces taken by non-DFS data like logs or other local files from users.
> Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be
> ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
> biswajit.nayak@inmobi.com> wrote:
>
>> Whats the replication factor you have? I believe it should be 3. hadoop
>> dus shows that disk usage without replication. While name node ui page
>> gives with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>
>>> Hi Biswajeet,
>>>
>>> Non-dfs usage is ~100GB over the cluster. But still the number are
>>> nowhere near 1TB.
>>>
>>> Basically I wanted to point out discrepancy in name node status page and hadoop
>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>> one reports it to be 35GB. What are the factors that can cause this
>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>
>>>
>>>
>>>
>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>> wrote:
>>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to
>>> filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hello,
>>>>
>>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>>> We are using default HDFS block size.
>>>>
>>>> We have noticed that disks of slaves are almost full. From name node’s
>>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>>> full and DFS Used% in cluster summary page is ~1TB.
>>>>
>>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>>> 38GB number looks to be correct because we keep only few Hive tables and
>>>> hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data
>>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>>> that there is no internal fragmentation because the files in our Hive
>>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>>> hadoop fsck / -files -blocks
>>>>
>>>> Status: HEALTHY
>>>> Total size: 38086441332 B
>>>> Total dirs: 232
>>>> Total files: 802
>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>> Minimally replicated blocks: 796 (100.0 %)
>>>> Over-replicated blocks: 0 (0.0 %)
>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>> Mis-replicated blocks: 0 (0.0 %)
>>>> Default replication factor: 2
>>>> Average block replication: 3.0439699
>>>> Corrupt blocks: 0
>>>> Missing replicas: 6 (0.24762692 %)
>>>> Number of data-nodes: 9
>>>> Number of racks: 1
>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>
>>>>
>>>> My question is: why are the slaves' disks getting full even though
>>>> there are only a few files in DFS?
>>>>
>>>
>>>
>>> _____________________________________________________________
>>> The information contained in this communication is intended solely for
>>> the use of the individual or entity to whom it is addressed and others
>>> authorized to receive it. It may contain confidential or legally privileged
>>> information. If you are not the intended recipient you are hereby notified
>>> that any disclosure, copying, distribution or taking any action in reliance
>>> on the contents of this information is strictly prohibited and may be
>>> unlawful. If you have received this communication in error, please notify
>>> us immediately by responding to this email and then delete it from your
>>> system. The firm is neither liable for the proper and complete transmission
>>> of the information contained in this communication nor for any delay in its
>>> receipt.
>>>
>>>
>>>
>>
>>
>
>
>
> --
> --Regards
> Sandeep Nemuri
>
>
>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
Re: HDFS file system size issue
Posted by Saumitra Shahapure <sa...@gmail.com>.
Hi Rahman,
These are a few lines from hadoop fsck / -blocks -files -locations:
/mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1
block(s): OK
0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010,
ip2:50010, ip3:50010]
/mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1
block(s): OK
0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010,
ip2:50010, ip4:50010]
Biswa may have guessed the replication factor from the fsck summary that I
posted earlier. I am posting it again for today's run:
Status: HEALTHY
Total size: 58143055251 B
Total dirs: 307
Total files: 5093
Total blocks (validated): 3903 (avg. block size 14897016 B)
Minimally replicated blocks: 3903 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 92 (2.357161 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 3.1401486
Corrupt blocks: 0
Missing replicas: 92 (0.75065273 %)
Number of data-nodes: 9
Number of racks: 1
FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds
The filesystem under path '/' is HEALTHY
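As a quick sanity check (a sketch; the two constants are copied from the fsck summary above), the physical footprint implied by the namespace is the logical size times the average block replication:

```shell
# Physical bytes implied by fsck: logical size x average replication.
# Constants are taken from the fsck summary above.
awk 'BEGIN {
  total = 58143055251        # "Total size" in bytes
  repl  = 3.1401486          # "Average block replication"
  printf "%d GiB\n", total * repl / (1024 ^ 3)
}'
```

That comes to roughly 170 GiB, still nowhere near the ~1.46TB the datanodes report, so replication alone cannot explain the gap.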
I have not overridden dfs.datanode.du.reserved. It defaults to 0.
$ less $HADOOP_HOME/conf/hdfs-site.xml |grep -A3 'dfs.datanode.du.reserved'
$ less $HADOOP_HOME/src/hdfs/hdfs-default.xml |grep -A3
'dfs.datanode.du.reserved'
<name>dfs.datanode.du.reserved</name>
<value>0</value>
<description>Reserved space in bytes per volume. Always leave this much
space free for non dfs use.
</description>
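For reference, if some headroom were wanted, a sketch of overriding this in hdfs-site.xml would look like the following (the 10GiB value is an assumption for illustration, not something from this thread):

```xml
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- hypothetical example: reserve 10 GiB per volume for non-DFS use -->
  <value>10737418240</value>
</property>
```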
Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
and all hadoop/hive logs are dumped in /mnt/logs in various directories.
All machines have 400GB for /mnt.
$for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh
/mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
225G /mnt/hadoop
224G /mnt/hadoop/dfs/data
61M /mnt/logs
281G /mnt/hadoop
281G /mnt/hadoop/dfs/data
63M /mnt/logs
139G /mnt/hadoop
139G /mnt/hadoop/dfs/data
68M /mnt/logs
135G /mnt/hadoop
134G /mnt/hadoop/dfs/data
92M /mnt/logs
165G /mnt/hadoop
164G /mnt/hadoop/dfs/data
75M /mnt/logs
137G /mnt/hadoop
137G /mnt/hadoop/dfs/data
95M /mnt/logs
160G /mnt/hadoop
160G /mnt/hadoop/dfs/data
74M /mnt/logs
180G /mnt/hadoop
122G /mnt/hadoop/dfs/data
23M /mnt/logs
139G /mnt/hadoop
138G /mnt/hadoop/dfs/data
76M /mnt/logs
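Summing the per-node dfs.data.dir lines above (values in GB as printed by du; a quick arithmetic check, nothing more):

```shell
# Sum of the nine /mnt/hadoop/dfs/data sizes listed above, in GB.
awk 'BEGIN {
  n = split("224 281 139 134 164 137 160 122 138", gb)
  for (i = 1; i <= n; i++) sum += gb[i]
  printf "%d GB\n", sum
}'
```

That totals 1499 GB, i.e. ~1.46TB, which matches the DFS Used figure the namenode reports. So the datanodes really are holding that much block data on disk.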
All these numbers are for today and may differ a bit from yesterday's.
Today hadoop dfs -dus is 58GB and the namenode is reporting DFS Used as 1.46TB.
Pardon the many copy-pastes; I hope the mail is still readable,
-- Saumitra S. Shahapure
On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <
ashettia@hortonworks.com> wrote:
> Hi Biswa,
>
> Are you sure that the replication factor of the files is three? Please
> run a ‘hadoop fsck / -blocks -files -locations’ and check the replication
> factor for each file. Also, post the configuration of <name>dfs.datanode.
> du.reserved</name> and please check the real space reported by a
> DataNode by running ‘du -h’
>
> Thanks,
> Rahman
>
> On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
>
> Hello,
>
> Biswajit, it looks like there is some confusion in the calculation: 1TB
> equals 1024GB, not 114GB.
>
>
> Sandeep, I checked the log directory size as well. Log directories are only
> a few GBs; I have configured log4j properties so that logs won’t grow too
> large.
>
> In our slave machines, we have a 450GB disk partition for hadoop logs and
> DFS. There the logs directory is < 10GB and the rest of the space is
> occupied by DFS. A separate 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
> Basically I wanted to point out discrepancy in name node status page and hadoop
>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>> one reports it to be 35GB. What are the factors that can cause this
>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>
>>
>
> I am talking about name node status page on 50070 port. Here is the
> screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand it, ‘DFS Used’ is the space taken by DFS, and non-DFS used
> is the space taken by non-DFS data such as logs or other local files from
> users. The namenode shows that DFS Used is ~1TB but hadoop dfs -dus shows
> it to be ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <
> biswajit.nayak@inmobi.com> wrote:
>
>> What’s your replication factor? I believe it should be 3. hadoop
>> dus shows disk usage without replication, while the name node UI page
>> shows it with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>>
>>> Hi Biswajeet,
>>>
>>> Non-dfs usage is ~100GB over the cluster. But still the number are
>>> nowhere near 1TB.
>>>
>>> Basically I wanted to point out discrepancy in name node status page and hadoop
>>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later
>>> one reports it to be 35GB. What are the factors that can cause this
>>> difference? And why is just 35GB data causing DFS to hit its limits?
>>>
>>>
>>>
>>>
>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>>> wrote:
>>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to
>>> filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>>
>>>> Hello,
>>>>
>>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>>> We are using default HDFS block size.
>>>>
>>>> We have noticed that disks of slaves are almost full. From name node’s
>>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>>> full and DFS Used% in cluster summary page is ~1TB.
>>>>
>>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>>> 38GB number looks to be correct because we keep only few Hive tables and
>>>> hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data
>>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>>> that there is no internal fragmentation because the files in our Hive
>>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>>> hadoop fsck / -files -blocks
>>>>
>>>> Status: HEALTHY
>>>> Total size: 38086441332 B
>>>> Total dirs: 232
>>>> Total files: 802
>>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>>> Minimally replicated blocks: 796 (100.0 %)
>>>> Over-replicated blocks: 0 (0.0 %)
>>>> Under-replicated blocks: 6 (0.75376886 %)
>>>> Mis-replicated blocks: 0 (0.0 %)
>>>> Default replication factor: 2
>>>> Average block replication: 3.0439699
>>>> Corrupt blocks: 0
>>>> Missing replicas: 6 (0.24762692 %)
>>>> Number of data-nodes: 9
>>>> Number of racks: 1
>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>
>>>>
>>>> My question is that why disks of slaves are getting full even though
>>>> there are only few files in DFS?
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>
> --
> --Regards
> Sandeep Nemuri
>
>
>
>
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Biswa,
Are you sure that the replication factor of the files is three? Please run a ‘hadoop fsck / -blocks -files -locations’ and check the replication factor for each file. Also, post the configuration of <name>dfs.datanode.du.reserved</name> and please check the real space reported by a DataNode by running ‘du -h’
Thanks,
Rahman
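A lighter-weight way to see per-file replication than reading full fsck output is to parse a recursive listing. This is a sketch assuming the Hadoop 1.x `hadoop fs -lsr` format, where column 2 is the replication factor ('-' for directories), column 5 the size, and column 8 the path:

```shell
# Print replication, size, and path for every file in HDFS.
# Directories show '-' in the replication column and are filtered out.
hadoop fs -lsr / | awk '$2 != "-" {print $2, $5, $8}'
```

Any file showing a replication other than the intended factor stands out immediately in this listing.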
On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
> Hello,
>
> Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
>
>
> Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
>
> In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
>> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
> I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is spaces taken by non-DFS data like logs or other local files from users. Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
>> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
>> Hi Biswajeet,
>>
>> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>>> Hello,
>>>
>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>>
>>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>>
>>> However hadoop dfs -dus / shows that file system size is merely 38GB. 38GB number looks to be correct because we keep only few Hive tables and hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think that there is no internal fragmentation because the files in our Hive tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop fsck / -files -blocks
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is that why disks of slaves are getting full even though there are only few files in DFS?
>>>
>>>
>>
>>
>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Biswa,
Are you sure that the replication factor of the files are three? Please run a ‘hadoop fsck / -blocks -files -locations’ and see the replication factor for each file. Also, Post the configuration of <name>dfs.datanode.du.reserved</name> and please check the real space presented by a DataNode by running ‘du -h’
Thanks,
Rahman
On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
> Hello,
>
> Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
>
>
> Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
>
> In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
>> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
> I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is spaces taken by non-DFS data like logs or other local files from users. Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
>> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
>> Hi Biswajeet,
>>
>> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>>> Hello,
>>>
>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>>
>>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>>
>>> However, hadoop dfs -dus / shows that the file system size is merely 38GB. The 38GB number looks correct because we keep only a few Hive tables and hadoop's /tmp (distributed cache and job outputs) in HDFS; all other data is cleaned up. I cross-checked this with hadoop dfs -ls. I also think there is no internal fragmentation, because the files in our Hive tables are well-chopped into ~50MB chunks. Here are the last few lines of hadoop fsck / -files -blocks:
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is: why are the slaves' disks getting full even though there are only a few files in DFS?
>>>
>>>
>>> _____________________________________________________________
>>> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>>
>>
>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>
--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.
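The fsck summary quoted above already contains enough to bound what replication can explain. A small awk sketch (the here-doc simply reproduces the two relevant lines from that summary):

```shell
# Derive the implied raw usage from the fsck summary: total logical size
# multiplied by the average block replication.
awk -F': ' '
  /Total size/                { bytes = $2 + 0 }
  /Average block replication/ { repl  = $2 + 0 }
  END { printf "implied raw usage: %.1f GB\n", bytes * repl / 1024 ^ 3 }
' <<'EOF'
Total size: 38086441332 B
Average block replication: 3.0439699
EOF
```

That comes to roughly 108 GB of raw disk across the cluster, consistent with the point made later in the thread that replication alone cannot account for ~1 TB of 'DFS Used'. Note also that the average replication (3.04) is above the default factor of 2 shown in the same summary, which is what the per-file replication check suggested below would confirm.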
Fwd: HDFS file system size issue
Posted by Saumitra Shahapure <sa...@gmail.com>.
Hello,
Thanks for your replies,
Biswajit, it looks like there is some confusion in the calculation: 1TB is equal
to 1024GB, not 114GB.
Sandeep, I checked the log directory size as well. Log directories are only a
few GBs in size; I have already configured the log4j properties so that logs
won't grow too large.
On our slave machines, we have a 450GB disk partition dedicated to hadoop
logs and DFS. There, the logs directory is < 10GB and the rest of the space is
occupied by DFS. A separate 10GB partition holds /.
Let me quote my confusion point once again:
> Basically I wanted to point out the discrepancy between the name node
> status page and hadoop dfs -dus. In my case, the former reports DFS usage
> as 1TB while the latter reports it as 35GB. What factors can cause this
> difference? And why is just 35GB of data causing DFS to hit its limits?
I am talking about the name node status page on port 50070. Below is the
screenshot of my name node status page.
As I understand, 'DFS Used' (in the Cluster Summary section) is the space
taken by DFS, and 'Non DFS Used' is the space taken by non-DFS data such as
logs or other local files from users.
In my case, the Namenode shows that DFS Used is ~1TB, but hadoop dfs -dus
shows it to be ~38GB.
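For anyone reproducing this check, the two views can be compared directly from the command line. A minimal sketch (Hadoop 1.x CLI; the arithmetic uses the numbers reported in this thread):

```shell
# On the cluster, the two views of usage come from (Hadoop 1.x CLI):
#   hadoop dfs -dus /          -> logical size, replication not counted (~38 GB here)
#   hadoop dfsadmin -report    -> 'DFS Used' / 'Non DFS Used' as the namenode sees them
# Cross-check: logical size times the average replication reported by fsck.
expected_gb=$(awk 'BEGIN { printf "%.1f", 38 * 3.0439699 }')
echo "expected raw DFS usage: ${expected_gb} GB"   # ~116 GB, nowhere near 1 TB
```

Even allowing for replication, ~38 GB of logical data should account for only about 116 GB of raw disk, so replication alone cannot explain a reported 1 TB.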
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
Please check your logs directory usage.
On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak
<bi...@inmobi.com> wrote:
> What's the replication factor you have? I believe it is 3. hadoop dfs -dus
> shows the disk usage without replication, while the name node UI page
> shows it with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
>
>> Hi Biswajeet,
>>
>> Non-DFS usage is ~100GB over the cluster, but the numbers are still
>> nowhere near 1TB.
>>
>> Basically I wanted to point out the discrepancy between the name node status page and hadoop
>> dfs -dus. In my case, the former reports DFS usage as 1TB while the latter
>> reports it as 35GB. What factors can cause this difference?
>> And why is just 35GB of data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>> wrote:
>>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling
>> up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>> We are using default HDFS block size.
>>>
>>> We have noticed that disks of slaves are almost full. From name node’s
>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>> full and DFS Used% in cluster summary page is ~1TB.
>>>
>>> However, hadoop dfs -dus / shows that the file system size is merely 38GB.
>>> The 38GB number looks correct because we keep only a few Hive tables and
>>> hadoop's /tmp (distributed cache and job outputs) in HDFS; all other data
>>> is cleaned up. I cross-checked this with hadoop dfs -ls. I also think
>>> there is no internal fragmentation, because the files in our Hive
>>> tables are well-chopped into ~50MB chunks. Here are the last few lines of
>>> hadoop fsck / -files -blocks:
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is: why are the slaves' disks getting full even though
>>> there are only a few files in DFS?
>>>
>>
>>
>>
>>
>>
>
>
--
--Regards
Sandeep Nemuri
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Biswa,
Are you sure that the replication factor of the files is three? Please run a ‘hadoop fsck / -blocks -files -locations’ and check the replication factor of each file. Also, post the configuration of <name>dfs.datanode.du.reserved</name>, and please check the real space used on a DataNode by running ‘du -h’.
Thanks,
Rahman
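As a concrete sketch of those three checks (the data-directory path below is an assumption; substitute the dfs.data.dir value from your own hdfs-site.xml):

```shell
# 1. Per-file replication and block locations (run against the cluster):
#      hadoop fsck / -blocks -files -locations
# 2. Space reserved for non-DFS use per volume, in hdfs-site.xml, e.g.:
#      <property>
#        <name>dfs.datanode.du.reserved</name>
#        <value>10737418240</value>   <!-- illustrative: 10 GB -->
#      </property>
# 3. Real on-disk usage of a datanode's block storage:
DATA_DIR=${DATA_DIR:-/data/hadoop/dfs/data}   # assumed path; use your dfs.data.dir
du -sh "$DATA_DIR" 2>/dev/null || echo "data dir not found: $DATA_DIR"
```

Comparing the du figure against the fsck total shows whether the datanodes really hold more block data than the namenode accounts for.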
On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
> Hello,
>
> Biswajit, it looks like there is some confusion in the calculation: 1TB is equal to 1024GB, not 114GB.
>
>
> Sandeep, I checked the log directory size as well. Log directories are only a few GBs in size; I have configured the log4j properties so that logs won't grow too large.
>
> On our slave machines, we have a 450GB disk partition for hadoop logs and DFS. There, the logs directory is < 10GB and the rest of the space is occupied by DFS. A separate 10GB partition holds /.
>
> Let me quote my confusion point once again:
>
>> Basically I wanted to point out the discrepancy between the name node status page and hadoop dfs -dus. In my case, the former reports DFS usage as 1TB while the latter reports it as 35GB. What factors can cause this difference? And why is just 35GB of data causing DFS to hit its limits?
>
>
>
> I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS Used' is the space taken by DFS, and 'Non DFS Used' is the space taken by non-DFS data such as logs or other local files from users. The Namenode shows that DFS Used is ~1TB, but hadoop dfs -dus shows it to be ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
>> What's the replication factor you have? I believe it is 3. hadoop dfs -dus shows the disk usage without replication, while the name node UI page shows it with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
>> Hi Biswajeet,
>>
>> Non-DFS usage is ~100GB over the cluster, but the numbers are still nowhere near 1TB.
>>
>> Basically I wanted to point out the discrepancy between the name node status page and hadoop dfs -dus. In my case, the former reports DFS usage as 1TB while the latter reports it as 35GB. What factors can cause this difference? And why is just 35GB of data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>>> Hello,
>>>
>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>>
>>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>>
>>> However, hadoop dfs -dus / shows that the file system size is merely 38GB. The 38GB number looks correct because we keep only a few Hive tables and hadoop's /tmp (distributed cache and job outputs) in HDFS; all other data is cleaned up. I cross-checked this with hadoop dfs -ls. I also think there is no internal fragmentation, because the files in our Hive tables are well-chopped into ~50MB chunks. Here are the last few lines of hadoop fsck / -files -blocks:
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is: why are the slaves' disks getting full even though there are only a few files in DFS?
>>>
>>>
>>
>>
>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>
--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.
Fwd: HDFS file system size issue
Posted by Saumitra Shahapure <sa...@gmail.com>.
Hello,
Thanks for your replies,
Biswanath, looks like we have confusion in calculation, 1TB would be equal
to 1024GB, not 114GB.
Sandeep, I checked log directory size as well. Log directories are hardly
in few GBs, I have already configured log4j properties so that logs won’t
be too large.
In our slave machines, we have 450GB disk partition dedicated for hadoop
logs and DFS. Over there, logs directory is < 10GBs and rest space is
occupied by DFS. 10GB partition is for /.
Let me quote my confusion point once again:
Basically I wanted to point out discrepancy in name node status page
and hadoop
>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one
>> reports it to be 35GB. What are the factors that can cause this difference?
>> And why is just 35GB data causing DFS to hit its limits?
>>
>
I am talking about name node status page on 50070 port. Below is the
screenshot of my name node status page.
As I understand, 'DFS used' (in Cluster Summary section) is the space taken
by DFS, 'non-DFS used' is spaces taken by non-DFS data like logs or other
local files from users.
In my case, Namenode is showing that DFS used is ~1TB but hadoop dfs
-dus is showing
it to be ~38GB.
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
Please check your logs directory usage.
On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak
<bi...@inmobi.com>wrote:
> Whats the replication factor you have? I believe it should be 3. hadoop
> dus shows that disk usage without replication. While name node ui page
> gives with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>
>> Hi Biswajeet,
>>
>> Non-dfs usage is ~100GB over the cluster. But still the number are
>> nowhere near 1TB.
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop
>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one
>> reports it to be 35GB. What are the factors that can cause this difference?
>> And why is just 35GB data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>> wrote:
>>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling
>> up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>
>>> Hello,
>>>
>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>> We are using default HDFS block size.
>>>
>>> We have noticed that disks of slaves are almost full. From name node’s
>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>> full and DFS Used% in cluster summary page is ~1TB.
>>>
>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>> 38GB number looks to be correct because we keep only few Hive tables and
>>> hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data
>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>> that there is no internal fragmentation because the files in our Hive
>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>> hadoop fsck / -files -blocks
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is that why disks of slaves are getting full even though
>>> there are only few files in DFS?
>>>
>>
>>
>> _____________________________________________________________
>> The information contained in this communication is intended solely for
>> the use of the individual or entity to whom it is addressed and others
>> authorized to receive it. It may contain confidential or legally privileged
>> information. If you are not the intended recipient you are hereby notified
>> that any disclosure, copying, distribution or taking any action in reliance
>> on the contents of this information is strictly prohibited and may be
>> unlawful. If you have received this communication in error, please notify
>> us immediately by responding to this email and then delete it from your
>> system. The firm is neither liable for the proper and complete transmission
>> of the information contained in this communication nor for any delay in its
>> receipt.
>>
>>
>>
>
> _____________________________________________________________
> The information contained in this communication is intended solely for the
> use of the individual or entity to whom it is addressed and others
> authorized to receive it. It may contain confidential or legally privileged
> information. If you are not the intended recipient you are hereby notified
> that any disclosure, copying, distribution or taking any action in reliance
> on the contents of this information is strictly prohibited and may be
> unlawful. If you have received this communication in error, please notify
> us immediately by responding to this email and then delete it from your
> system. The firm is neither liable for the proper and complete transmission
> of the information contained in this communication nor for any delay in its
> receipt.
>
--
--Regards
Sandeep Nemuri
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Biswa,
Are you sure that the replication factor of the files are three? Please run a ‘hadoop fsck / -blocks -files -locations’ and see the replication factor for each file. Also, Post the configuration of <name>dfs.datanode.du.reserved</name> and please check the real space presented by a DataNode by running ‘du -h’
Thanks,
Rahman
On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
> Hello,
>
> Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
>
>
> Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
>
> In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
>> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
> I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is spaces taken by non-DFS data like logs or other local files from users. Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
>> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
>> Hi Biswajeet,
>>
>> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>>> Hello,
>>>
>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>>
>>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>>
>>> However hadoop dfs -dus / shows that file system size is merely 38GB. 38GB number looks to be correct because we keep only few Hive tables and hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think that there is no internal fragmentation because the files in our Hive tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop fsck / -files -blocks
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is that why disks of slaves are getting full even though there are only few files in DFS?
>>>
>>>
>>> _____________________________________________________________
>>> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>>
>>
>>
>> _____________________________________________________________
>> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>
--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.
Re: HDFS file system size issue
Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi Biswa,
Are you sure that the replication factor of the files are three? Please run a ‘hadoop fsck / -blocks -files -locations’ and see the replication factor for each file. Also, Post the configuration of <name>dfs.datanode.du.reserved</name> and please check the real space presented by a DataNode by running ‘du -h’
Thanks,
Rahman
On Apr 14, 2014, at 2:07 PM, Saumitra <sa...@gmail.com> wrote:
> Hello,
>
> Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
>
>
> Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
>
> In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
>> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
> I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is spaces taken by non-DFS data like logs or other local files from users. Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be ~38GB.
>
>
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
>
>> Please check your logs directory usage.
>>
>>
>>
>> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
>> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>>
>> 38gb * 3 =114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
>> Hi Biswajeet,
>>
>> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>>
>>>
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>>> Hello,
>>>
>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>>
>>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>>
>>> However hadoop dfs -dus / shows that file system size is merely 38GB. 38GB number looks to be correct because we keep only few Hive tables and hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think that there is no internal fragmentation because the files in our Hive tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop fsck / -files -blocks
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is: why are the disks of the slaves filling up even though there are only a few files in DFS?
>>>
>>>
>>> _____________________________________________________________
>>> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>>
>>
>>
>>
>>
>>
>> --
>> --Regards
>> Sandeep Nemuri
>
--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.
Fwd: HDFS file system size issue
Posted by Saumitra Shahapure <sa...@gmail.com>.
Hello,
Thanks for your replies,
Biswajit, looks like there is some confusion in the calculation: 1TB equals
1024GB, not 114GB.
Sandeep, I checked the log directory size as well. The log directories are
hardly a few GBs; I have already configured log4j properties so that logs
won't grow too large.
On our slave machines, we have a 450GB disk partition dedicated to hadoop
logs and DFS. There, the logs directory is < 10GB and the rest of the space
is occupied by DFS. A separate 10GB partition holds /.
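To see where the space on a slave actually goes, a quick per-directory check can break the 450GB partition down. This is only a sketch; the data and log paths below are assumptions — substitute the dfs.data.dir value from your hdfs-site.xml and the log directory from hadoop-env.sh:

```shell
# Break down disk usage on one slave. Paths are ASSUMED examples; use the
# dfs.data.dir value from hdfs-site.xml and HADOOP_LOG_DIR from hadoop-env.sh.
DATA_DIR="${DATA_DIR:-/data/hadoop/dfs/data}"   # dfs.data.dir (assumed)
LOG_DIR="${LOG_DIR:-/var/log/hadoop}"           # hadoop log dir (assumed)

du -sh "$DATA_DIR/current" 2>/dev/null   # finalized HDFS block files
du -sh "$LOG_DIR" 2>/dev/null            # daemon logs
df -h "$DATA_DIR" 2>/dev/null            # partition-level view (DFS + non-DFS)
```

If `du` on the block directory roughly matches the NameNode's per-node "DFS Used" figure, the space really is held by block files rather than logs or other local data.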
Let me quote my confusion point once again:
> Basically I wanted to point out the discrepancy between the name node
> status page and hadoop dfs -dus. In my case, the former reports DFS usage
> as 1TB and the latter reports it to be 35GB. What factors can cause this
> difference? And why is just 35GB of data causing DFS to hit its limits?
I am talking about the name node status page on port 50070. Below is the
screenshot of my name node status page.
As I understand it, 'DFS Used' (in the Cluster Summary section) is the space
taken by DFS, and 'Non DFS Used' is the space taken by non-DFS data such as
logs or other local files from users.
In my case, the Namenode is showing that DFS Used is ~1TB, but hadoop dfs
-dus is showing it to be ~38GB.
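As a sanity check on the replication explanation, the fsck output quoted earlier already gives the expected physical footprint: logical size times average block replication. A minimal sketch using those exact figures:

```shell
# Expected physical DFS usage = logical size x average block replication,
# with both numbers taken from the `hadoop fsck /` output quoted above.
logical_bytes=38086441332    # "Total size" from fsck
avg_replication=3.0439699    # "Average block replication" from fsck

expected_bytes=$(awk -v s="$logical_bytes" -v r="$avg_replication" \
  'BEGIN { printf "%.0f", s * r }')
expected_gib=$(awk -v b="$expected_bytes" 'BEGIN { printf "%.1f", b / (1024^3) }')
echo "Expected DFS used: ~${expected_gib} GiB"   # ~108 GiB
```

So even after replication, ~38GB of logical data should occupy only ~108GiB on disk — which supports the point that replication alone cannot explain ~1TB of "DFS Used".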
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
Please check your logs directory usage.
On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak
<bi...@inmobi.com>wrote:
> What's the replication factor you have? I believe it should be 3. hadoop
> dfs -dus shows disk usage without replication, while the name node UI page
> shows it with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>
>> Hi Biswajeet,
>>
>> Non-dfs usage is ~100GB over the cluster, but the numbers are still
>> nowhere near 1TB.
>>
>> Basically I wanted to point out the discrepancy between the name node
>> status page and hadoop dfs -dus. In my case, the former reports DFS usage
>> as 1TB and the latter reports it to be 35GB. What factors can cause this
>> difference? And why is just 35GB of data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>> wrote:
>>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling
>> up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>
>>> Hello,
>>>
>>> We are running HDFS on a 9-node hadoop cluster; the hadoop version is
>>> 1.2.1. We are using the default HDFS block size.
>>>
>>> We have noticed that the disks of the slaves are almost full. From the
>>> name node's status page (namenode:50070), we could see that the disks of
>>> live nodes are 90% full, and DFS Used in the cluster summary page is ~1TB.
>>>
>>> However hadoop dfs -dus / shows that the file system size is merely
>>> 38GB. The 38GB number looks correct because we keep only a few Hive
>>> tables and hadoop's /tmp (distributed cache and job outputs) in HDFS.
>>> All other data is cleaned up. I cross-checked this with hadoop dfs -ls.
>>> I also think there is no internal fragmentation, because the files in
>>> our Hive tables are well-chopped into ~50MB chunks. Here are the last
>>> few lines of hadoop fsck / -files -blocks:
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is: why are the disks of the slaves filling up even though
>>> there are only a few files in DFS?
>>>
>>
>>
>>
>>
>>
>
>
--
--Regards
Sandeep Nemuri
Re: HDFS file system size issue
Posted by Saumitra <sa...@gmail.com>.
Hello,
Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
Let me quote my confusion point once again:
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is spaces taken by non-DFS data like logs or other local files from users. Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be ~38GB.
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
> Hi Biswajeet,
>
> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
>
> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>> Hello,
>>
>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>
>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>
>> However hadoop dfs -dus / shows that file system size is merely 38GB. 38GB number looks to be correct because we keep only few Hive tables and hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think that there is no internal fragmentation because the files in our Hive tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop fsck / -files -blocks
>>
>> Status: HEALTHY
>> Total size: 38086441332 B
>> Total dirs: 232
>> Total files: 802
>> Total blocks (validated): 796 (avg. block size 47847288 B)
>> Minimally replicated blocks: 796 (100.0 %)
>> Over-replicated blocks: 0 (0.0 %)
>> Under-replicated blocks: 6 (0.75376886 %)
>> Mis-replicated blocks: 0 (0.0 %)
>> Default replication factor: 2
>> Average block replication: 3.0439699
>> Corrupt blocks: 0
>> Missing replicas: 6 (0.24762692 %)
>> Number of data-nodes: 9
>> Number of racks: 1
>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>
>>
>> My question is that why disks of slaves are getting full even though there are only few files in DFS?
>>
>>
>> _____________________________________________________________
>> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>
>
>
> _____________________________________________________________
> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>
>
>
> --
> --Regards
> Sandeep Nemuri
Re: HDFS file system size issue
Posted by Saumitra <sa...@gmail.com>.
Hello,
Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
Let me quote my confusion point once again:
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is spaces taken by non-DFS data like logs or other local files from users. Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be ~38GB.
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
> Hi Biswajeet,
>
> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
>
> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>> Hello,
>>
>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>
>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>
>> However hadoop dfs -dus / shows that file system size is merely 38GB. 38GB number looks to be correct because we keep only few Hive tables and hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think that there is no internal fragmentation because the files in our Hive tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop fsck / -files -blocks
>>
>> Status: HEALTHY
>> Total size: 38086441332 B
>> Total dirs: 232
>> Total files: 802
>> Total blocks (validated): 796 (avg. block size 47847288 B)
>> Minimally replicated blocks: 796 (100.0 %)
>> Over-replicated blocks: 0 (0.0 %)
>> Under-replicated blocks: 6 (0.75376886 %)
>> Mis-replicated blocks: 0 (0.0 %)
>> Default replication factor: 2
>> Average block replication: 3.0439699
>> Corrupt blocks: 0
>> Missing replicas: 6 (0.24762692 %)
>> Number of data-nodes: 9
>> Number of racks: 1
>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>
>>
>> My question is that why disks of slaves are getting full even though there are only few files in DFS?
>>
>>
>> _____________________________________________________________
>> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>
>
>
> _____________________________________________________________
> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>
>
>
> --
> --Regards
> Sandeep Nemuri
Re: HDFS file system size issue
Posted by Saumitra <sa...@gmail.com>.
Hello,
Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
Let me quote my confusion point once again:
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is spaces taken by non-DFS data like logs or other local files from users. Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be ~38GB.
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
> Hi Biswajeet,
>
> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
>
> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>> Hello,
>>
>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>
>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>
>> However hadoop dfs -dus / shows that file system size is merely 38GB. 38GB number looks to be correct because we keep only few Hive tables and hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think that there is no internal fragmentation because the files in our Hive tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop fsck / -files -blocks
>>
>> Status: HEALTHY
>> Total size: 38086441332 B
>> Total dirs: 232
>> Total files: 802
>> Total blocks (validated): 796 (avg. block size 47847288 B)
>> Minimally replicated blocks: 796 (100.0 %)
>> Over-replicated blocks: 0 (0.0 %)
>> Under-replicated blocks: 6 (0.75376886 %)
>> Mis-replicated blocks: 0 (0.0 %)
>> Default replication factor: 2
>> Average block replication: 3.0439699
>> Corrupt blocks: 0
>> Missing replicas: 6 (0.24762692 %)
>> Number of data-nodes: 9
>> Number of racks: 1
>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>
>>
>> My question is that why disks of slaves are getting full even though there are only few files in DFS?
>>
>>
>> _____________________________________________________________
>> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>
>
>
> _____________________________________________________________
> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>
>
>
> --
> --Regards
> Sandeep Nemuri
Re: HDFS file system size issue
Posted by Saumitra <sa...@gmail.com>.
Hello,
Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
Let me quote my confusion point once again:
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
I am talking about name node status page on 50070 port. Here is the screenshot of my name node status page
As I understand, 'DFS used’ is the space taken by DFS, non-DFS used is spaces taken by non-DFS data like logs or other local files from users. Namenode shows that DFS used is ~1TB but hadoop dfs -dus shows it to be ~38GB.
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
> Hi Biswajeet,
>
> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
>
> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>> Hello,
>>
>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>
>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>
>> However hadoop dfs -dus / shows that file system size is merely 38GB. 38GB number looks to be correct because we keep only few Hive tables and hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think that there is no internal fragmentation because the files in our Hive tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop fsck / -files -blocks
>>
>> Status: HEALTHY
>> Total size: 38086441332 B
>> Total dirs: 232
>> Total files: 802
>> Total blocks (validated): 796 (avg. block size 47847288 B)
>> Minimally replicated blocks: 796 (100.0 %)
>> Over-replicated blocks: 0 (0.0 %)
>> Under-replicated blocks: 6 (0.75376886 %)
>> Mis-replicated blocks: 0 (0.0 %)
>> Default replication factor: 2
>> Average block replication: 3.0439699
>> Corrupt blocks: 0
>> Missing replicas: 6 (0.24762692 %)
>> Number of data-nodes: 9
>> Number of racks: 1
>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>
>>
>> My question is that why disks of slaves are getting full even though there are only few files in DFS?
>>
>>
>> _____________________________________________________________
>> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>
>
>
> _____________________________________________________________
> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
>
>
>
> --
> --Regards
> Sandeep Nemuri
Re: HDFS file system size issue
Posted by Saumitra <sa...@gmail.com>.
Hello,
Biswanath, looks like we have confusion in calculation, 1TB would be equal to 1024GB, not 114GB.
Sandeep, I checked log directory size as well. Log directories are hardly in few GBs, I have configured log4j properties so that logs won’t be too large.
In our slave machines, we have 450GB disk partition for hadoop logs and DFS. Over there logs directory is < 10GBs and rest space is occupied by DFS. 10GB partition is for /.
Let me quote my confusion point once again:
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
I am talking about the name node status page on port 50070. Here is a screenshot of my name node status page.
As I understand it, 'DFS Used' is the space taken by DFS, and 'Non DFS Used' is the space taken by non-DFS data such as logs or other local files from users. The name node shows DFS Used as ~1TB, but hadoop dfs -dus shows it as ~38GB.
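The gap is easy to quantify from the fsck output above. A quick back-of-the-envelope check (plain Python; both constants are copied from the fsck report) shows that even after replication the live data cannot explain ~1TB:

```python
# Values copied from the `hadoop fsck /` output above.
logical_bytes = 38086441332      # "Total size"
avg_replication = 3.0439699      # "Average block replication"

# Physical usage expected from live blocks = logical size x replication.
expected_physical = logical_bytes * avg_replication
print("expected on-disk usage: %.1f GB" % (expected_physical / 1e9))
# ~116 GB -- nowhere near the ~1 TB the name node reports as DFS Used,
# so the remainder must be held by data that fsck no longer accounts
# for (e.g. stale or undeleted block files on the datanodes).
```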
On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nh...@gmail.com> wrote:
> Please check your logs directory usage.
>
>
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <bi...@inmobi.com> wrote:
> Whats the replication factor you have? I believe it should be 3. hadoop dus shows that disk usage without replication. While name node ui page gives with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com> wrote:
> Hi Biswajeet,
>
> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere near 1TB.
>
> Basically I wanted to point out discrepancy in name node status page and hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one reports it to be 35GB. What are the factors that can cause this difference? And why is just 35GB data causing DFS to hit its limits?
>
>
>
>
> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
>> Hello,
>>
>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We are using default HDFS block size.
>>
>> We have noticed that disks of slaves are almost full. From name node’s status page (namenode:50070), we could see that disks of live nodes are 90% full and DFS Used% in cluster summary page is ~1TB.
>>
>> However hadoop dfs -dus / shows that file system size is merely 38GB. 38GB number looks to be correct because we keep only few Hive tables and hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think that there is no internal fragmentation because the files in our Hive tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop fsck / -files -blocks
>>
>> Status: HEALTHY
>> Total size: 38086441332 B
>> Total dirs: 232
>> Total files: 802
>> Total blocks (validated): 796 (avg. block size 47847288 B)
>> Minimally replicated blocks: 796 (100.0 %)
>> Over-replicated blocks: 0 (0.0 %)
>> Under-replicated blocks: 6 (0.75376886 %)
>> Mis-replicated blocks: 0 (0.0 %)
>> Default replication factor: 2
>> Average block replication: 3.0439699
>> Corrupt blocks: 0
>> Missing replicas: 6 (0.24762692 %)
>> Number of data-nodes: 9
>> Number of racks: 1
>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>
>>
>> My question is that why disks of slaves are getting full even though there are only few files in DFS?
>>
>>
>
>
>
> --
> --Regards
> Sandeep Nemuri
Re: HDFS file system size issue
Posted by Sandeep Nemuri <nh...@gmail.com>.
Please check your logs directory usage.
On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak
<bi...@inmobi.com>wrote:
> Whats the replication factor you have? I believe it should be 3. hadoop
> dus shows that disk usage without replication. While name node ui page
> gives with replication.
>
> 38gb * 3 =114gb ~ 1TB
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
>
>> Hi Biswajeet,
>>
>> Non-dfs usage is ~100GB over the cluster. But still the number are
>> nowhere near 1TB.
>>
>> Basically I wanted to point out discrepancy in name node status page and hadoop
>> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one
>> reports it to be 35GB. What are the factors that can cause this difference?
>> And why is just 35GB data causing DFS to hit its limits?
>>
>>
>>
>>
>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
>> wrote:
>>
>> Hi Saumitra,
>>
>> Could you please check the non-dfs usage. They also contribute to filling
>> up the disk space.
>>
>>
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>>
>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>>
>>> Hello,
>>>
>>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1.
>>> We are using default HDFS block size.
>>>
>>> We have noticed that disks of slaves are almost full. From name node’s
>>> status page (namenode:50070), we could see that disks of live nodes are 90%
>>> full and DFS Used% in cluster summary page is ~1TB.
>>>
>>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>>> 38GB number looks to be correct because we keep only few Hive tables and
>>> hadoop’s /tmp (distributed cache and job outputs) in HDFS. All other data
>>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>>> that there is no internal fragmentation because the files in our Hive
>>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>>> hadoop fsck / -files -blocks
>>>
>>> Status: HEALTHY
>>> Total size: 38086441332 B
>>> Total dirs: 232
>>> Total files: 802
>>> Total blocks (validated): 796 (avg. block size 47847288 B)
>>> Minimally replicated blocks: 796 (100.0 %)
>>> Over-replicated blocks: 0 (0.0 %)
>>> Under-replicated blocks: 6 (0.75376886 %)
>>> Mis-replicated blocks: 0 (0.0 %)
>>> Default replication factor: 2
>>> Average block replication: 3.0439699
>>> Corrupt blocks: 0
>>> Missing replicas: 6 (0.24762692 %)
>>> Number of data-nodes: 9
>>> Number of racks: 1
>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>
>>>
>>> My question is that why disks of slaves are getting full even though
>>> there are only few files in DFS?
>>>
>>
>>
--
--Regards
Sandeep Nemuri
Re: HDFS file system size issue
Posted by Biswajit Nayak <bi...@inmobi.com>.
What's the replication factor you have? I believe it should be 3. hadoop
dfs -dus shows disk usage without replication, while the name node UI page
shows it with replication.
38GB * 3 = 114GB ~ 1TB
~Biswa
-----oThe important thing is not to stop questioning o-----
On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <sa...@gmail.com>wrote:
> Hi Biswajeet,
>
> Non-dfs usage is ~100GB over the cluster. But still the number are nowhere
> near 1TB.
>
> Basically I wanted to point out discrepancy in name node status page and hadoop
> dfs -dus. In my case, earlier one reports DFS usage as 1TB and later one
> reports it to be 35GB. What are the factors that can cause this difference?
> And why is just 35GB data causing DFS to hit its limits?
>
>
>
>
> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com>
> wrote:
>
> Hi Saumitra,
>
> Could you please check the non-dfs usage. They also contribute to filling
> up the disk space.
>
>
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
>
>> Hello,
>>
>> We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We
>> are using default HDFS block size.
>>
>> We have noticed that disks of slaves are almost full. From name node's
>> status page (namenode:50070), we could see that disks of live nodes are 90%
>> full and DFS Used% in cluster summary page is ~1TB.
>>
>> However hadoop dfs -dus / shows that file system size is merely 38GB.
>> 38GB number looks to be correct because we keep only few Hive tables and
>> hadoop's /tmp (distributed cache and job outputs) in HDFS. All other data
>> is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
>> that there is no internal fragmentation because the files in our Hive
>> tables are well-chopped in ~50MB chunks. Here are last few lines of
>> hadoop fsck / -files -blocks
>>
>> Status: HEALTHY
>> Total size: 38086441332 B
>> Total dirs: 232
>> Total files: 802
>> Total blocks (validated): 796 (avg. block size 47847288 B)
>> Minimally replicated blocks: 796 (100.0 %)
>> Over-replicated blocks: 0 (0.0 %)
>> Under-replicated blocks: 6 (0.75376886 %)
>> Mis-replicated blocks: 0 (0.0 %)
>> Default replication factor: 2
>> Average block replication: 3.0439699
>> Corrupt blocks: 0
>> Missing replicas: 6 (0.24762692 %)
>> Number of data-nodes: 9
>> Number of racks: 1
>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>
>>
>> My question is that why disks of slaves are getting full even though
>> there are only few files in DFS?
>>
>
>
Re: HDFS file system size issue
Posted by Saumitra <sa...@gmail.com>.
Hi Biswajit,
Non-DFS usage is ~100GB across the cluster, but that still leaves the numbers nowhere near 1TB.
Basically I wanted to point out the discrepancy between the name node status page and hadoop dfs -dus: the former reports DFS usage as ~1TB, while the latter reports about 35GB. What factors can cause this difference? And why is just 35GB of data causing DFS to hit its limits?
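[Editor's note: the raw disk footprint implied by replication can be estimated from the fsck figures quoted further down in this thread; a rough sketch:]

```python
# Rough estimate of raw DFS disk usage: logical size times the average
# block replication, both taken from the fsck output in this thread.
logical_bytes = 38086441332      # "Total size"
avg_replication = 3.0439699      # "Average block replication"

raw_gb = logical_bytes * avg_replication / 1024**3
print(round(raw_gb))  # ~108
```

Even at ~3x effective replication this comes to roughly 108GB of raw disk, which supports the point that DFS block data alone cannot account for ~1TB of usage.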
On 14-Apr-2014, at 8:31 am, Biswajit Nayak <bi...@inmobi.com> wrote:
> Hi Saumitra,
>
> Could you please check the non-DFS usage? It also contributes to filling up the disk space.
>
>
>
> ~Biswa
> -----oThe important thing is not to stop questioning o-----
>
>
> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com> wrote:
> Hello,
>
> We are running HDFS on a 9-node Hadoop cluster; the Hadoop version is 1.2.1. We are using the default HDFS block size.
>
> We have noticed that the disks of the slaves are almost full. From the name node’s status page (namenode:50070), we can see that the disks of live nodes are 90% full and DFS Used in the cluster summary page is ~1TB.
>
> However, hadoop dfs -dus / shows that the file system size is merely 38GB. The 38GB number looks correct because we keep only a few Hive tables and Hadoop’s /tmp (distributed cache and job outputs) in HDFS; all other data is cleaned up. I cross-checked this with hadoop dfs -ls. I also think there is no internal fragmentation, because the files in our Hive tables are well-chopped into ~50MB chunks. Here are the last few lines of hadoop fsck / -files -blocks:
>
> Status: HEALTHY
> Total size: 38086441332 B
> Total dirs: 232
> Total files: 802
> Total blocks (validated): 796 (avg. block size 47847288 B)
> Minimally replicated blocks: 796 (100.0 %)
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 6 (0.75376886 %)
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 2
> Average block replication: 3.0439699
> Corrupt blocks: 0
> Missing replicas: 6 (0.24762692 %)
> Number of data-nodes: 9
> Number of racks: 1
> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>
>
> My question is: why are the disks of the slaves getting full even though there are only a few files in DFS?
>
>
> _____________________________________________________________
> The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
Re: HDFS file system size issue
Posted by Biswajit Nayak <bi...@inmobi.com>.
Hi Saumitra,
Could you please check the non-DFS usage? It also contributes to filling
up the disk space.
~Biswa
-----oThe important thing is not to stop questioning o-----
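[Editor's note: for readers wondering what "Non DFS Used" means here, it is essentially configured capacity minus DFS used minus remaining space, as reported per node by hadoop dfsadmin -report. A sketch with hypothetical figures (only the 450GB-per-slave partition size comes from this thread):]

```python
# How "Non DFS Used" is derived. All figures here are hypothetical
# placeholders except the 450GB-per-slave partition mentioned in this
# thread; real values come from `hadoop dfsadmin -report`.
capacity_gb = 450 * 9        # 9 slaves, 450GB hadoop partition each
dfs_used_gb = 1024           # ~1TB, as the name node page reported
remaining_gb = 2926          # hypothetical free space across the cluster

non_dfs_used_gb = capacity_gb - dfs_used_gb - remaining_gb
print(non_dfs_used_gb)  # 100, i.e. ~100GB of non-DFS usage
```

Anything on the same partitions that is not HDFS block data (logs, MapReduce spill files, other local files) lands in this bucket.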
On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <sa...@gmail.com>wrote:
> Hello,
>
> We are running HDFS on a 9-node Hadoop cluster; the Hadoop version is 1.2.1.
> We are using the default HDFS block size.
>
> We have noticed that the disks of the slaves are almost full. From the name
> node's status page (namenode:50070), we can see that the disks of live nodes
> are 90% full and DFS Used in the cluster summary page is ~1TB.
>
> However, hadoop dfs -dus / shows that the file system size is merely 38GB.
> The 38GB number looks correct because we keep only a few Hive tables and
> hadoop's /tmp (distributed cache and job outputs) in HDFS; all other data
> is cleaned up. I cross-checked this with hadoop dfs -ls. I also think
> that there is no internal fragmentation, because the files in our Hive
> tables are well-chopped into ~50MB chunks. Here are the last few lines of
> hadoop fsck / -files -blocks:
>
> Status: HEALTHY
> Total size: 38086441332 B
> Total dirs: 232
> Total files: 802
> Total blocks (validated): 796 (avg. block size 47847288 B)
> Minimally replicated blocks: 796 (100.0 %)
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 6 (0.75376886 %)
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 2
> Average block replication: 3.0439699
> Corrupt blocks: 0
> Missing replicas: 6 (0.24762692 %)
> Number of data-nodes: 9
> Number of racks: 1
> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>
>
> My question is: why are the disks of the slaves getting full even though
> there are only a few files in DFS?
>