Posted to common-user@hadoop.apache.org by Martin Traverso <mt...@gmail.com> on 2008/02/15 22:05:40 UTC
dfsadmin reporting wrong disk usage numbers
Hi,
Are there any known issues on how dfsadmin reports disk usage? I'm getting
some weird values:
Name: 10.15.104.46:50010
State : In Service
Total raw bytes: 1433244008448 (1.3 TB)
Remaining raw bytes: 383128089432 (356.82 GB)
Used raw bytes: 1042296986024 (970.71 GB)
% used: 72.72%
However, usage on that box is:
size  used  avail  capacity  Mounted on
650G  240G  409G   37%       /local/data/hadoop/d0
685G  243G  443G   36%       /local/data/hadoop/d1
d0 and d1 are mounted on two separate drives. The used raw bytes count is
off by 2x.
Thanks,
Martin
Re: dfsadmin reporting wrong disk usage numbers
Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
Yes, please file a bug.
There are file systems with different block sizes out there, on Linux or Solaris.
Thanks,
--Konstantin
Martin Traverso wrote:
> I think I found the issue. The class org.apache.hadoop.fs.DU assumes
> 1024-byte blocks when reporting usage information:
>
> this.used = Long.parseLong(tokens[0])*1024;
>
> This works fine on Linux, but on Solaris and Mac OS the reported number of
> blocks is based on 512-byte blocks.
>
> The solution is simple: DU should use "du -sk" instead of "du -s".
>
> Should I file a bug for this?
>
> Martin
>
Re: dfsadmin reporting wrong disk usage numbers
Posted by Martin Traverso <mt...@gmail.com>.
I think I found the issue. The class org.apache.hadoop.fs.DU assumes
1024-byte blocks when reporting usage information:
this.used = Long.parseLong(tokens[0])*1024;
This works fine on Linux, but on Solaris and Mac OS the reported number of
blocks is based on 512-byte blocks.
The solution is simple: DU should use "du -sk" instead of "du -s".
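To illustrate the fix, here is a minimal sketch of the parsing step (a hypothetical simplification, not the actual org.apache.hadoop.fs.DU source): with "du -sk" the first token is always kilobytes on Linux, Solaris, and Mac OS, so the fixed 1024 multiplier becomes portable.

```java
// Hypothetical simplification of the du-output parsing in
// org.apache.hadoop.fs.DU; names are illustrative.
public class DuParser {
    // Parse one line of "du -sk" output, e.g. "1042280  /local/data/hadoop/d0".
    // With -k, du reports kilobytes on all of these platforms, so
    // multiplying by 1024 yields correct bytes everywhere. With plain
    // "du -s", Solaris and Mac OS report 512-byte blocks and the same
    // multiplier doubles the real usage.
    public static long usedBytes(String duOutput) {
        String[] tokens = duOutput.trim().split("\\s+");
        return Long.parseLong(tokens[0]) * 1024L;
    }

    public static void main(String[] args) {
        System.out.println(usedBytes("1042280     /local/data/hadoop/d0"));
    }
}
```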
Should I file a bug for this?
Martin
Re: dfsadmin reporting wrong disk usage numbers
Posted by Martin Traverso <mt...@gmail.com>.
>
> What are the data directories
> specified in your configuration? Have you specified two data directories
> per
> volume?
>
No, just one directory per volume. This is the value of dfs.data.dir in my
hadoop-site.xml:
<property>
  <name>dfs.data.dir</name>
  <value>/local/data/hadoop/d0/dfs/data,/local/data/hadoop/d1/dfs/data</value>
</property>
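As a quick sanity check (a hypothetical helper, not datanode code): dfs.data.dir is a comma-separated list, so with the value above the datanode sees exactly one directory per volume and should "du" each volume once, which rules out double-counting from the configuration.

```java
// Hypothetical check that a dfs.data.dir value names one directory per
// volume; the split-on-comma convention matches how the property is written.
public class DataDirs {
    public static String[] parse(String dfsDataDir) {
        return dfsDataDir.split(",");
    }

    public static void main(String[] args) {
        String value =
            "/local/data/hadoop/d0/dfs/data,/local/data/hadoop/d1/dfs/data";
        for (String dir : parse(value)) {
            System.out.println(dir.trim());
        }
    }
}
```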
Martin
>
> Hairong
>
> On 2/15/08 1:05 PM, "Martin Traverso" <mt...@gmail.com> wrote:
>
> > Hi,
> >
> > Are there any known issues on how dfsadmin reports disk usage? I'm
> getting
> > some weird values:
> >
> > Name: 10.15.104.46:50010
> > State : In Service
> > Total raw bytes: 1433244008448 (1.3 TB)
> > Remaining raw bytes: 383128089432 (356.82 GB)
> > Used raw bytes: 1042296986024 (970.71 GB)
> > % used: 72.72%
> >
> >
> > However, usage on that box is:
> >
> > size  used  avail  capacity  Mounted on
> > 650G  240G  409G   37%       /local/data/hadoop/d0
> > 685G  243G  443G   36%       /local/data/hadoop/d1
> >
> > d0 and d1 are mounted on two separate drives. The used raw bytes count
> is
> > off by 2x.
> >
> > Thanks,
> > Martin
>
>
Re: dfsadmin reporting wrong disk usage numbers
Posted by Hairong Kuang <ha...@yahoo-inc.com>.
The datanode runs "du" on its data directories hourly. In between two "du"s, the used
space is updated incrementally as blocks are added or deleted. What are the data directories
specified in your configuration? Have you specified two data directories per
volume?
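The refresh pattern described above can be sketched as follows (an illustrative model of the mechanism, not the actual datanode code): a cached "du" result is adjusted incrementally on block add/delete and replaced wholesale when the refresh interval elapses.

```java
// Illustrative sketch of hourly-"du"-plus-incremental-updates; all names
// are hypothetical. Times are passed in explicitly to keep it testable.
public class CachedUsage {
    private final long refreshIntervalMs;
    private long usedBytes;     // last full "du" result plus increments
    private long lastRefreshMs;

    public CachedUsage(long initialUsedBytes, long nowMs, long refreshIntervalMs) {
        this.usedBytes = initialUsedBytes;
        this.lastRefreshMs = nowMs;
        this.refreshIntervalMs = refreshIntervalMs;
    }

    // A block was added (positive delta) or deleted (negative delta)
    // between refreshes: adjust the running estimate.
    public synchronized void adjust(long deltaBytes) {
        usedBytes += deltaBytes;
    }

    // Once the interval has elapsed, replace the estimate with a fresh
    // "du" measurement; otherwise keep returning the running estimate.
    public synchronized long getUsed(long nowMs, long freshDuBytes) {
        if (nowMs - lastRefreshMs >= refreshIntervalMs) {
            usedBytes = freshDuBytes;
            lastRefreshMs = nowMs;
        }
        return usedBytes;
    }
}
```

Note that if the underlying "du" itself misreports (as with the 512-byte-block issue), every hourly refresh re-introduces the error, which matches the persistent 2x discrepancy seen here.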
Hairong
On 2/15/08 1:05 PM, "Martin Traverso" <mt...@gmail.com> wrote:
> Hi,
>
> Are there any known issues on how dfsadmin reports disk usage? I'm getting
> some weird values:
>
> Name: 10.15.104.46:50010
> State : In Service
> Total raw bytes: 1433244008448 (1.3 TB)
> Remaining raw bytes: 383128089432 (356.82 GB)
> Used raw bytes: 1042296986024 (970.71 GB)
> % used: 72.72%
>
>
> However, usage on that box is:
>
> size  used  avail  capacity  Mounted on
> 650G  240G  409G   37%       /local/data/hadoop/d0
> 685G  243G  443G   36%       /local/data/hadoop/d1
>
> d0 and d1 are mounted on two separate drives. The used raw bytes count is
> off by 2x.
>
> Thanks,
> Martin