Posted to hdfs-user@hadoop.apache.org by Tang <sh...@gmail.com> on 2014/09/25 03:51:50 UTC

HDFS du and df don't match

hi,

see the du and df command output below.

[hadoop@master-26161  hadoop]$ /opt/hadoop-2.4.1/bin/hdfs dfs -du  -h /
14/09/25 01:35:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
91.9 G   /hbase
492.7 M  /var
390.9 M  /tmp
4.3 M    /user

[hadoop@master-26161 hadoop]$ /opt/hadoop-2.4.1/bin/hdfs dfs -df -h
14/09/25 01:41:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Filesystem                 Size     Used  Available  Use%
hdfs://master-26151:9000  1.3 T  642.8 G    711.0 G   47%

You can see that "du" says there are 93 G of data under "/", but "df" says we used 642.8 G. We set the replication factor to 3,
so 93 * 3 = 279 G would make sense.
So my question: what is using the rest of the disk space, and how can we clean it up?

Thanks.
Tang
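The replication arithmetic in the question can be sketched as a few lines of Python (a minimal illustration only; the sizes are copied from the du/df output above and converted to GB, and the 279 G figure in the post comes from rounding 93 G up):

```python
# Logical sizes reported by "hdfs dfs -du -h /", converted to GB.
logical_gb = 91.9 + 492.7 / 1024 + 390.9 / 1024 + 4.3 / 1024  # /hbase, /var, /tmp, /user
replication = 3  # the cluster's replication factor, per the post

# "du" reports logical (pre-replication) size; "df" reports raw DFS usage,
# so the two should differ by roughly the replication factor.
expected_used_gb = logical_gb * replication
reported_used_gb = 642.8  # "Used" column of "hdfs dfs -df -h"

unexplained_gb = reported_used_gb - expected_used_gb
print(f"expected ~{expected_used_gb:.0f} GB, reported {reported_used_gb} GB, "
      f"gap ~{unexplained_gb:.0f} GB")
```

Running this shows an expected usage of roughly 278 GB against the reported 642.8 GB, i.e. a gap of about 364 GB that replication alone does not explain.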

Re: HDFS du and df don't match

Posted by vivek <vi...@gmail.com>.
Hi,

du == *Disk Usage <http://en.wikipedia.org/wiki/Du_%28Unix%29>*. It walks
the directory tree and sums the sizes of all files in it. It may not report
exact information because of unreadable files, hard links in the tree, etc.
It shows information only about the specific directory requested. Think,
*"How much disk space is being used by these files?"*

df == *Disk Free <http://en.wikipedia.org/wiki/Df_%28Unix%29>*. It reads
the used-block counts directly from filesystem metadata. Because of this it
returns much faster than du, but it can only show information about the
entire disk/partition. Think, *"How much free disk space do I have?"*



You can go through this link:
http://linuxshellaccount.blogspot.in/2008/12/why-du-and-df-display-different-values.html

On Wed, Sep 24, 2014 at 9:51 PM, Tang <sh...@gmail.com> wrote:

> hi,
>
> see the du and df command output below.
>
> [hadoop@master-26161  hadoop]$ /opt/hadoop-2.4.1/bin/hdfs dfs -du  -h /
> 14/09/25 01:35:03 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 91.9 G   /hbase
> 492.7 M  /var
> 390.9 M  /tmp
> 4.3 M    /user
>
> [hadoop@master-26161 hadoop]$ /opt/hadoop-2.4.1/bin/hdfs dfs -df -h
> 14/09/25 01:41:58 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Filesystem                 Size     Used  Available  Use%
> hdfs://master-26151:9000  1.3 T  642.8 G    711.0 G   47%
>
> You can see that "du" says there are 93 G of data under "/", but "df" says
> we used 642.8 G. We set the replication factor to 3, so 93 * 3 = 279 G
> would make sense.
> So my question: what is using the rest of the disk space, and how can we
> clean it up?
>
> Thanks.
> Tang
>
>


-- 

Thanks and Regards,

VIVEK KOUL
