Posted to user@hbase.apache.org by Ted Yu <yu...@gmail.com> on 2016/03/31 22:13:29 UTC

difference between dus and df output

Have you seen this thread ?

http://search-hadoop.com/m/uOzYtatlmAcqgzM

On Thu, Mar 31, 2016 at 11:58 AM, Ted Tuttle <te...@mentacapital.com> wrote:

> Hello-
>
> We are running a v0.94.9 cluster.
>
> I am seeing that 'fs -dus' reports 24TB used and 'fs -df' reports 74TB
> used.
>
> Does anyone know why these do not reconcile? Our replication factor is 2
> so that is not a likely explanation.
>
> Shown below are results from my cluster (doctored to TB for ease of
> reading):
>
> bash-4.1$ hadoop fs -dus /hbase
> hdfs://host/hbase      24.5TB
>
> bash-4.1$ hadoop fs -df /hbase
> Filesystem    Size       Used      Avail     Use%
> /hbase        103.8TB    74.2TB    24.3TB    71%
>
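
The figures quoted above can be checked with a little arithmetic. This is a minimal sketch, assuming (the thread does not state this) that 'fs -dus' reports logical file bytes before replication and 'fs -df' reports raw bytes used across all datanodes. The function and variable names are illustrative only; the numbers are the ones posted.

```python
# Figures copied from the posted output (already rounded to TB by the poster).
DUS_TB = 24.5       # 'hadoop fs -dus /hbase'
DF_USED_TB = 74.2   # 'Used' column of 'hadoop fs -df /hbase'
REPLICATION = 2     # replication factor stated in the message


def implied_raw_usage(logical_tb, replication):
    """Raw disk usage expected if every logical byte is stored `replication` times.

    Assumption: -dus counts each file once, regardless of replica count.
    """
    return logical_tb * replication


expected_raw_tb = implied_raw_usage(DUS_TB, REPLICATION)
unexplained_tb = DF_USED_TB - expected_raw_tb
effective_ratio = DF_USED_TB / DUS_TB

print(f"expected raw usage at replication {REPLICATION}: {expected_raw_tb:.1f} TB")
print(f"df usage not accounted for by /hbase: {unexplained_tb:.1f} TB")
print(f"raw-to-logical ratio actually observed: {effective_ratio:.2f}")
```

Under that assumption, replication 2 accounts for 49.0 TB of the 74.2 TB that df reports, leaving about 25.2 TB unexplained, and the observed raw-to-logical ratio is about 3.03. This is only arithmetic on the posted numbers, not a diagnosis of where the space went.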

RE: difference between dus and df output

Posted by Ted Tuttle <te...@mentacapital.com>.
We have daily major compactions configured.  

Looking at a sample region server, we see several major compactions in the last day due to activity.

No snapshots were returned by list_snapshots in the hbase shell:

hbase(main):001:0> list_snapshots
SNAPSHOT                                                TABLE + CREATION TIME
0 row(s) in 5.5120 seconds

Re: difference between dus and df output

Posted by Ted Yu <yu...@gmail.com>.
Have you performed major compaction lately ?

Are there non-expired hbase snapshots ?

Cheers

RE: difference between dus and df output

Posted by Ted Tuttle <te...@mentacapital.com>.
This is very interesting, Ted. Thank you.

We are only running HBase on hdfs.

Does this mostly empty block appending behavior make sense for HBase-only usage?

If this is, in fact, unused storage, how do we get it back?

Currently df shows 75% filled while du shows 25%.  The former is prompting us to consider more hardware.  If we are in fact at 25%, we don't need to.
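
For capacity planning, the two percentages are easier to compare against the same denominator. The sketch below parses the df line as it was posted earlier in the thread and expresses each figure as a share of total capacity. It assumes the stated replication factor of 2 and, again as an assumption not confirmed in the thread, that the dus figure is a pre-replication size. The helper names are hypothetical.

```python
# The df data line and the dus figure exactly as posted in the thread.
DF_LINE = "/hbase          103.8TB 74.2TB 24.3TB  71%"
DUS_TB = 24.5
REPLICATION = 2  # as stated by the poster


def parse_tb(field):
    """Convert a field such as '103.8TB' to a float number of terabytes."""
    if not field.endswith("TB"):
        raise ValueError(f"expected a value ending in 'TB', got {field!r}")
    return float(field[:-2])


def percent_of_capacity(used_tb, capacity_tb):
    return 100.0 * used_tb / capacity_tb


# Fields: filesystem, size, used, available, use%.
_, size_field, used_field, _, _ = DF_LINE.split()
capacity_tb = parse_tb(size_field)
df_used_tb = parse_tb(used_field)

df_pct = percent_of_capacity(df_used_tb, capacity_tb)
du_logical_pct = percent_of_capacity(DUS_TB, capacity_tb)
du_raw_pct = percent_of_capacity(DUS_TB * REPLICATION, capacity_tb)

print(f"df used:                 {df_pct:.1f}% of capacity")
print(f"dus, logical bytes:      {du_logical_pct:.1f}% of capacity")
print(f"dus x replication ({REPLICATION}):   {du_raw_pct:.1f}% of capacity")
```

On the posted numbers this gives roughly 71.5% for df, 23.6% for the logical dus figure, and 47.2% once dus is scaled by the replication factor. If the assumption holds, the third figure is the one to set against df, since raw capacity is consumed by every replica.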
