Posted to user@hbase.apache.org by Otis Gospodnetić <ot...@gmail.com> on 2015/10/24 04:21:45 UTC

Disk usage drops after RegionServer restart? (0.98)

Hello,

Is/was there a known issue with HBase 0.98 "holding onto" files?

We noticed the used disk space metric going up, up and up and we could not
stop it with major compaction.
But we noticed that if we restart a RegionServer 2 things happen:
1) its disk usage immediately drops a lot
2) the disk usage of other RegionServers drops some as well

Have a look at this chart:
  https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq

At 1:54 we restarted the first RS (blue line)
At 2:03 we restarted the second RS (dark green line)

Is/was this a known HBase 0.98 issue?

Thanks,
Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
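
One way to sanity-check the pattern described above is to compare what the filesystem reports as used (df) against what the visible files add up to (du). On a mount dedicated to the data directory, a large gap suggests space pinned by deleted-but-still-open files. A minimal sketch (applying it to a path like /mnt is an assumption, not something from the thread):

```python
import os
import shutil

def df_du_gap(path):
    """Bytes the filesystem reports as used minus the bytes visible as
    files under `path` (a rough df-vs-du comparison). On a mount
    dedicated to `path`, a large positive gap hints at space held by
    deleted-but-open files."""
    used = shutil.disk_usage(path).used
    visible = 0
    for dirpath, _, filenames in os.walk(path):
        for name in filenames:
            try:
                visible += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file vanished or became unreadable mid-walk
    return used - visible
```

For example, `df_du_gap('/mnt')` run on one of the affected hosts before and after a RegionServer restart would show whether the reclaimed space was invisible to du all along.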

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Vladimir Rodionov <vl...@gmail.com>.
I think this is because some files with open handles get deleted. The space
can be reclaimed only when the process exits. This is a known "feature" of Linux.

-Vlad
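
Vladimir's point — that an unlinked file's disk blocks are not freed while a process still holds it open — can be demonstrated in a few lines (a minimal sketch of standard POSIX behavior, nothing HBase-specific):

```python
import os
import tempfile

# Create a file, keep it open, then unlink it.
f = tempfile.NamedTemporaryFile(delete=False)
f.write(b"x" * 4096)
f.flush()
path = f.name

os.unlink(path)                  # "rm" the file while the handle is open
assert not os.path.exists(path)  # the name is gone from the directory...

f.seek(0)
data = f.read()                  # ...but the data (and its disk blocks)
assert data == b"x" * 4096       # survive as long as a handle is open

f.close()  # only now can the filesystem reclaim the space
```

This is why restarting the process (here, the RegionServer or DataNode) makes the used-space metric drop: closing the last handle finally lets the kernel free the blocks.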

On Mon, Nov 30, 2015 at 10:56 PM, Stack <st...@duboce.net> wrote:

> Thanks for writing back Otis. What was your CP doing?
> St.Ack
>

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Stack <st...@duboce.net>.
Thanks for writing back Otis. What was your CP doing?
St.Ack

On Sat, Nov 28, 2015 at 7:08 PM, Otis Gospodnetić <
otis.gospodnetic@gmail.com> wrote:

> Hi,
>
> In our case it turned out to be co-processors.  More specifically, thanks
> to Logsene <http://sematext.com/logsene> we saw that one of our
> co-processors logged some exceptions on start.  Once we fixed those errors
> we stopped having issues with growing disk usage.  Sorry I don't have more
> details, but maybe this helps somebody.
>
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Thu, Oct 29, 2015 at 1:52 PM, Stack <st...@duboce.net> wrote:
>
> > Are you printing out file size? (I don't see the -s arg on lsof.)
> > St.Ack
> >
> > On Fri, Oct 23, 2015 at 8:08 PM, Otis Gospodnetić <
> > otis.gospodnetic@gmail.com> wrote:
> >
> > > Hi Ted,
> > >
> > > 0.98.6-cdh5.3.0
> > >
> > > I did actually try to use lsof, but I didn't see anything unusual
> there.
> > > Is there something specific I should look for?  Things owned by hbase
> > user
> > > or hdfs or yarn?  Hm, here, I don't really see anything interesting
> > >
> > > $ sudo lsof | grep '/mnt' <== this is where all data lives and where
> disk
> > > usage drops after RS restart
> > >
> > > java       2654      hdfs    1w      REG             202,16     89487
> > > 44042562
> > >
> > >
> >
> /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.out
> > > java       2654      hdfs    2w      REG             202,16     89487
> > > 44042562
> > >
> > >
> >
> /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.out
> > > java       2654      hdfs  286w      REG             202,16 108938205
> > > 44044137
> > >
> > >
> >
> /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.log
> > > java       2654      hdfs  289w      REG             202,16         0
> > > 44040203 /mnt/hadoop-hdfs/log/SecurityAuth-hdfs.audit
> > > java       2654      hdfs  314w      REG             202,16    261462
> > > 44040213
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/dncp_block_verification.log.curr
> > > java       2654      hdfs  316r      REG             202,16 134217728
> > > 44045060
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir58/blk_1078606358
> > > java       2654      hdfs  318r      REG             202,16 134217728
> > > 44057015
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir224/blk_1078648930
> > > java       2654      hdfs  319uW     REG             202,16        36
> > > 44042741 /mnt/hadoop-hdfs/data/in_use.lock
> > > java       2654      hdfs  321r      REG             202,16   1048583
> > > 44042793
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir7/blk_1078658889_4918820.meta
> > > java       2654      hdfs  330u      REG             202,16    352563
> > > 44048279
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675432_4935363.meta
> > > java       2654      hdfs  333r      REG             202,16 134217728
> > > 44055769
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir9/blk_1078659381
> > > java       2654      hdfs  335u      REG             202,16  45127168
> > > 44048273
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675432
> > > java       2654      hdfs  340r      REG             202,16 134217728
> > > 44042791
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir7/blk_1078658889
> > > java       2654      hdfs  343r      REG             202,16  13882119
> > > 44048053
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675385
> > > java       2654      hdfs  345u      REG             202,16    485059
> > > 44048209
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675399_4935330.meta
> > > java       2654      hdfs  346r      REG             202,16 134217728
> > > 44053723
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir4/blk_1078658098
> > > java       2654      hdfs  347u      REG             202,16    371455
> > > 44047931
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675364_4935295.meta
> > > java       2654      hdfs  348u      REG             202,16  47545282
> > > 44047927
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675364
> > > java       2654      hdfs  354u      REG             202,16  20386405
> > > 44047875
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir8/blk_1078659266
> > > java       2654      hdfs  355r      REG             202,16 134217728
> > > 44042762
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir243/blk_1078653797
> > > java       2654      hdfs  357r      REG             202,16 134217728
> > > 44042535
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir66/blk_1078674123
> > > java       2654      hdfs  359u      REG             202,16      1839
> > > 44045445
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078674506_4934437.meta
> > > java       2654      hdfs  360u      REG             202,16    234130
> > > 44045440
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078674506
> > > java       2654      hdfs  363r      REG             202,16  20629437
> > > 44046774
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir17/blk_1078661533
> > > java       2654      hdfs  369r      REG             202,16  18304945
> > > 44047599
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675270
> > > java       2654      hdfs  370r      REG             202,16  62086413
> > > 44048199
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675399
> > > java       2654      hdfs  379r      REG             202,16 134217728
> > > 44050035
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir3/blk_1078657983
> > > java       2654      hdfs  390u      REG             202,16  20857780
> > > 44050270
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir8/blk_1078659267
> > > java       2654      hdfs  408r      REG             202,16 115453375
> > > 44042299
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir66/blk_1078674120
> > > java       2654      hdfs  415r      REG             202,16  20253192
> > > 44053520
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir60/blk_1078672624
> > > java       2654      hdfs  423r      REG             202,16  18382878
> > > 44047547
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675257
> > > java       2654      hdfs  424r      REG             202,16  19555559
> > > 44040692
> > >
> > >
> >
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir65/blk_1078673801
> > > bash      15005  ec2-user  cwd       DIR             202,16      4096
> > >    2 /mnt
> > > sudo      16055      root  cwd       DIR             202,16      4096
> > >    2 /mnt
> > > grep      16056  ec2-user  cwd       DIR             202,16      4096
> > >    2 /mnt
> > > sed       16057  ec2-user  cwd       DIR             202,16      4096
> > >    2 /mnt
> > > lsof      16058      root  cwd       DIR             202,16      4096
> > >    2 /mnt
> > > lsof      16059      root  cwd       DIR             202,16      4096
> > >    2 /mnt
> > > bash      18748     hbase    1w      REG             202,16     12843
> > >  4980744
> > >
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > > bash      18748     hbase    2w      REG             202,16     12843
> > >  4980744
> > >
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > > java      18761     hbase    1w      REG             202,16     12843
> > >  4980744
> > >
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > > java      18761     hbase    2w      REG             202,16     12843
> > >  4980744
> > >
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > > java      18761     hbase  338w      REG             202,16 117537786
> > >  4980753
> > >
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.log
> > > java      18761     hbase  339w      REG             202,16         0
> > >  4980741 /mnt/hbase/log/SecurityAuth.audit
> > > java      29057      yarn    1w      REG             202,16    130105
> > > 51380228
> > >
> > >
> >
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
> > > java      29057      yarn    2w      REG             202,16    130105
> > > 51380228
> > >
> > >
> >
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
> > > java      29057      yarn  286w      REG             202,16 103611255
> > > 51380852
> > >
> > >
> >
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.log
> > >
> > > I don't see anything big there...
> > >
> > > Thanks,
> > > Otis
> > > --
> > > Monitoring - Log Management - Alerting - Anomaly Detection
> > > Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> > >
> > >
> > > On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu <yu...@gmail.com> wrote:
> > >
> > > > Which specific release of 0.98 are you using ?
> > > >
> > > > Have you used lsof to see which files were being held onto ?
> > > >
> > > > Thanks
>
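
A note on the lsof exchange above: one way to look specifically for the unlinked-but-still-open files Vladimir describes is `lsof +L1` (files with link count below 1). The same information can be pulled straight out of /proc. A rough Linux-only sketch, not a production tool:

```python
import os

def space_held_by_deleted_files():
    """Bytes held by open-but-unlinked files, keyed by PID.

    Scans /proc directly (Linux-only); roughly what `lsof +L1` reports.
    Reading another process's fd table needs permission, so run as root
    to see daemons such as the DataNode or RegionServer.
    """
    held = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = f"/proc/{pid}/fd"
        try:
            for fd in os.listdir(fd_dir):
                target = f"{fd_dir}/{fd}"
                if os.readlink(target).endswith(" (deleted)"):
                    held[pid] = held.get(pid, 0) + os.stat(target).st_size
        except OSError:
            continue  # process exited mid-scan, or fds we cannot inspect
    return held
```

Sorting the result by size points at the process holding the space — restarting that process is what released the disk space in the chart at the top of the thread.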

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Otis Gospodnetić <ot...@gmail.com>.
Hi,

In our case it turned out to be co-processors.  More specifically, thanks
to Logsene <http://sematext.com/logsene> we saw that one of our
co-processors logged some exceptions on start.  Once we fixed those errors
we stopped having issues with growing disk usage.  Sorry I don't have more
details, but maybe this helps somebody.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


> >    2 /mnt
> > lsof      16058      root  cwd       DIR             202,16      4096
> >    2 /mnt
> > lsof      16059      root  cwd       DIR             202,16      4096
> >    2 /mnt
> > bash      18748     hbase    1w      REG             202,16     12843
> >  4980744
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > bash      18748     hbase    2w      REG             202,16     12843
> >  4980744
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > java      18761     hbase    1w      REG             202,16     12843
> >  4980744
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > java      18761     hbase    2w      REG             202,16     12843
> >  4980744
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > java      18761     hbase  338w      REG             202,16 117537786
> >  4980753
> >
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.log
> > java      18761     hbase  339w      REG             202,16         0
> >  4980741 /mnt/hbase/log/SecurityAuth.audit
> > java      29057      yarn    1w      REG             202,16    130105
> > 51380228
> >
> >
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
> > java      29057      yarn    2w      REG             202,16    130105
> > 51380228
> >
> >
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
> > java      29057      yarn  286w      REG             202,16 103611255
> > 51380852
> >
> >
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.log
> >
> > I don't see anything big there...
> >
> > Thanks,
> > Otis
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> > On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> > > Which specific release of 0.98 are you using ?
> > >
> > > Have you used lsof to see which files were being held onto ?
> > >
> > > Thanks
> > >
> > > On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
> > > otis.gospodnetic@gmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > Is/was there a known issue with HBase 0.98 "holding onto" files?
> > > >
> > > > We noticed the used disk space metric going up, up and up and we
> could
> > > not
> > > > stop it with major compaction.
> > > > But we noticed that if we restart a RegionServer 2 things happen:
> > > > 1) its disk usage immediately drops a lot
> > > > 2) the disk usage of other RegionServers drops some as well
> > > >
> > > > Have a look at this chart:
> > > >   https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
> > > >
> > > > At 1:54 we restarted the first RS (blue line)
> > > > At 2:03 we restarted the second RS (dark green line)
> > > >
> > > > Is/was this a known HBase 0.98 issue?
> > > >
> > > > Thanks,
> > > > Otis
> > > > --
> > > > Monitoring - Log Management - Alerting - Anomaly Detection
> > > > Solr & Elasticsearch Consulting Support Training -
> > http://sematext.com/
> > > >
> > >
> >
>

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Stack <st...@duboce.net>.
Are you printing out file size? (I don't see the -s arg on lsof.)
St.Ack
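Relatedly, a quick check for the unlinked-but-still-open files this thread suspects (a sketch only, assuming a Linux host with lsof installed; the +L1 flag selects files whose link count is below 1, i.e. deleted names still held open by some process):

```shell
# List deleted-but-still-open files: these count toward df but are
# invisible to du. With +L an NLINK column is added after SIZE/OFF,
# so SIZE/OFF is still field 7.
lsof +L1 2>/dev/null

# Sum the bytes still pinned by such files (0 if lsof is missing
# or nothing matches; awk's s+0 coerces an empty sum to 0).
if command -v lsof >/dev/null 2>&1; then
  pinned=$(lsof +L1 2>/dev/null | awk 'NR>1 {s += $7} END {print s+0}')
else
  pinned=0
fi
echo "bytes pinned by deleted-but-open files: $pinned"
```

As for -s: on Linux, lsof already shows sizes for regular files in the SIZE/OFF column by default, so +L1 is usually the more telling flag for this particular symptom.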

On Fri, Oct 23, 2015 at 8:08 PM, Otis Gospodnetić <
otis.gospodnetic@gmail.com> wrote:

> Hi Ted,
>
> 0.98.6-cdh5.3.0
>
> I did actually try to use lsof, but I didn't see anything unusual there.
> Is there something specific I should look for?  Things owned by hbase user
> or hdfs or yarn?  Hm, here, I don't really see anything interesting
>
> $ sudo lsof| grep '/mnt' <== this is where all data lives and where disk
> usage drops after RS restart
>
> java       2654      hdfs    1w      REG             202,16     89487
> 44042562
>
> /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.out
> java       2654      hdfs    2w      REG             202,16     89487
> 44042562
>
> /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.out
> java       2654      hdfs  286w      REG             202,16 108938205
> 44044137
>
> /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.log
> java       2654      hdfs  289w      REG             202,16         0
> 44040203 /mnt/hadoop-hdfs/log/SecurityAuth-hdfs.audit
> java       2654      hdfs  314w      REG             202,16    261462
> 44040213
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/dncp_block_verification.log.curr
> java       2654      hdfs  316r      REG             202,16 134217728
> 44045060
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir58/blk_1078606358
> java       2654      hdfs  318r      REG             202,16 134217728
> 44057015
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir224/blk_1078648930
> java       2654      hdfs  319uW     REG             202,16        36
> 44042741 /mnt/hadoop-hdfs/data/in_use.lock
> java       2654      hdfs  321r      REG             202,16   1048583
> 44042793
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir7/blk_1078658889_4918820.meta
> java       2654      hdfs  330u      REG             202,16    352563
> 44048279
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675432_4935363.meta
> java       2654      hdfs  333r      REG             202,16 134217728
> 44055769
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir9/blk_1078659381
> java       2654      hdfs  335u      REG             202,16  45127168
> 44048273
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675432
> java       2654      hdfs  340r      REG             202,16 134217728
> 44042791
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir7/blk_1078658889
> java       2654      hdfs  343r      REG             202,16  13882119
> 44048053
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675385
> java       2654      hdfs  345u      REG             202,16    485059
> 44048209
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675399_4935330.meta
> java       2654      hdfs  346r      REG             202,16 134217728
> 44053723
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir4/blk_1078658098
> java       2654      hdfs  347u      REG             202,16    371455
> 44047931
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675364_4935295.meta
> java       2654      hdfs  348u      REG             202,16  47545282
> 44047927
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675364
> java       2654      hdfs  354u      REG             202,16  20386405
> 44047875
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir8/blk_1078659266
> java       2654      hdfs  355r      REG             202,16 134217728
> 44042762
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir243/blk_1078653797
> java       2654      hdfs  357r      REG             202,16 134217728
> 44042535
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir66/blk_1078674123
> java       2654      hdfs  359u      REG             202,16      1839
> 44045445
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078674506_4934437.meta
> java       2654      hdfs  360u      REG             202,16    234130
> 44045440
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078674506
> java       2654      hdfs  363r      REG             202,16  20629437
> 44046774
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir17/blk_1078661533
> java       2654      hdfs  369r      REG             202,16  18304945
> 44047599
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675270
> java       2654      hdfs  370r      REG             202,16  62086413
> 44048199
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675399
> java       2654      hdfs  379r      REG             202,16 134217728
> 44050035
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir3/blk_1078657983
> java       2654      hdfs  390u      REG             202,16  20857780
> 44050270
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir8/blk_1078659267
> java       2654      hdfs  408r      REG             202,16 115453375
> 44042299
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir66/blk_1078674120
> java       2654      hdfs  415r      REG             202,16  20253192
> 44053520
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir60/blk_1078672624
> java       2654      hdfs  423r      REG             202,16  18382878
> 44047547
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675257
> java       2654      hdfs  424r      REG             202,16  19555559
> 44040692
>
> /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir65/blk_1078673801
> bash      15005  ec2-user  cwd       DIR             202,16      4096
>    2 /mnt
> sudo      16055      root  cwd       DIR             202,16      4096
>    2 /mnt
> grep      16056  ec2-user  cwd       DIR             202,16      4096
>    2 /mnt
> sed       16057  ec2-user  cwd       DIR             202,16      4096
>    2 /mnt
> lsof      16058      root  cwd       DIR             202,16      4096
>    2 /mnt
> lsof      16059      root  cwd       DIR             202,16      4096
>    2 /mnt
> bash      18748     hbase    1w      REG             202,16     12843
>  4980744
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> bash      18748     hbase    2w      REG             202,16     12843
>  4980744
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> java      18761     hbase    1w      REG             202,16     12843
>  4980744
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> java      18761     hbase    2w      REG             202,16     12843
>  4980744
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> java      18761     hbase  338w      REG             202,16 117537786
>  4980753
> /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.log
> java      18761     hbase  339w      REG             202,16         0
>  4980741 /mnt/hbase/log/SecurityAuth.audit
> java      29057      yarn    1w      REG             202,16    130105
> 51380228
>
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
> java      29057      yarn    2w      REG             202,16    130105
> 51380228
>
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
> java      29057      yarn  286w      REG             202,16 103611255
> 51380852
>
> /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.log
>
> I don't see anything big there...
>
> Thanks,
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > Which specific release of 0.98 are you using ?
> >
> > Have you used lsof to see which files were being held onto ?
> >
> > Thanks
> >
> > On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
> > otis.gospodnetic@gmail.com> wrote:
> >
> > > Hello,
> > >
> > > Is/was there a known issue with HBase 0.98 "holding onto" files?
> > >
> > > We noticed the used disk space metric going up, up and up and we could
> > not
> > > stop it with major compaction.
> > > But we noticed that if we restart a RegionServer 2 things happen:
> > > 1) its disk usage immediately drops a lot
> > > 2) the disk usage of other RegionServers drops some as well
> > >
> > > Have a look at this chart:
> > >   https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
> > >
> > > At 1:54 we restarted the first RS (blue line)
> > > At 2:03 we restarted the second RS (dark green line)
> > >
> > > Is/was this a known HBase 0.98 issue?
> > >
> > > Thanks,
> > > Otis
> > > --
> > > Monitoring - Log Management - Alerting - Anomaly Detection
> > > Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> > >
> >
>

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Loïc Chanel <lo...@telecomnancy.net>.
Here is the problem: the config is exactly the same for HBase and for anything
Hadoop- or system-related across the cluster.
There is no apparent difference.

Loïc

Loïc CHANEL
System & virtualization engineer
TO - XaaS Ind - Worldline (Villeurbanne, France)

2015-10-29 9:48 GMT+01:00 Ted Yu <yu...@gmail.com>:

> Interesting
>
> By same config I guess you mean same hbase config.
> Can you find out what was different between the two clusters ?
>
> Thanks
>
> > On Oct 29, 2015, at 1:26 AM, Loïc Chanel <lo...@telecomnancy.net>
> wrote:
> >
> > I can see that too on one of our clusters, and the thing which is really
> > weird is that another one of ours has the exact same configuration (as it
> > is the pre-production cluster) and we don't see the problem there.
> > I also did a lot of googling, but as we couldn't find a solution we
> simply
> > made a cron to restart periodically the RegionServers (to avoid a full on
> > Hadoop data partitions).
> >
> > Regards,
> >
> >
> > Loïc
> >
> > Loïc CHANEL
> > System & virtualization engineer
> > TO - XaaS Ind - Worldline (Villeurbanne, France)
> >
> > 2015-10-28 23:20 GMT+01:00 Yahoo <mi...@gmail.com>:
> >
> >> I see exactly the same thing on one of our clusters, also running HBase
> >> 0.98 (not sure of the rest of the version number since I'm not in the
> >> office right now). The non-hdfs disk space slowly fills up and I failed
> to
> >> locate the actual files using 'du'. I did a lot of googling but couldn't
> >> find any other mentions of the problem at the time.
> >>
> >> Mike.
> >>
> >>> On 24/10/2015 04:08, Otis Gospodnetić wrote:
> >>>
> >>> Hi Ted,
> >>>
> >>> 0.98.6-cdh5.3.0
> >>>
> >>> I did actually try to use lsof, but I didn't see anything unusual
> there.
> >>> Is there something specific I should look for?  Things owned by hbase
> user
> >>> or hdfs or yarn?  Hm, here, I don't really see anything interesting
> >> <snip>
> >>
> >>
> >>> Thanks,
> >>> Otis
> >>> --
> >>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>>
> >>>
> >>> On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu <yu...@gmail.com> wrote:
> >>>
> >>> Which specific release of 0.98 are you using ?
> >>>>
> >>>> Have you used lsof to see which files were being held onto ?
> >>>>
> >>>> Thanks
> >>>>
> >>>> On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
> >>>> otis.gospodnetic@gmail.com> wrote:
> >>>>
> >>>> Hello,
> >>>>>
> >>>>> Is/was there a known issue with HBase 0.98 "holding onto" files?
> >>>>>
> >>>>> We noticed the used disk space metric going up, up and up and we
> could
> >>>> not
> >>>>
> >>>>> stop it with major compaction.
> >>>>> But we noticed that if we restart a RegionServer 2 things happen:
> >>>>> 1) its disk usage immediately drops a lot
> >>>>> 2) the disk usage of other RegionServers drops some as well
> >>>>>
> >>>>> Have a look at this chart:
> >>>>>   https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
> >>>>>
> >>>>> At 1:54 we restarted the first RS (blue line)
> >>>>> At 2:03 we restarted the second RS (dark green line)
> >>>>>
> >>>>> Is/was this a known HBase 0.98 issue?
> >>>>>
> >>>>> Thanks,
> >>>>> Otis
> >>>>> --
> >>>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>>> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>
>

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Ted Yu <yu...@gmail.com>.
Interesting.

By same config I guess you mean same hbase config. 
Can you find out what was different between the two clusters ?

Thanks

> On Oct 29, 2015, at 1:26 AM, Loïc Chanel <lo...@telecomnancy.net> wrote:
> 
> I can see that too on one of our clusters, and the thing which is really
> weird is that another one of ours has the exact same configuration (as it
> is the pre-production cluster) and we don't see the problem there.
> I also did a lot of googling, but as we couldn't find a solution we simply
> made a cron to restart periodically the RegionServers (to avoid a full on
> Hadoop data partitions).
> 
> Regards,
> 
> 
> Loïc
> 
> Loïc CHANEL
> System & virtualization engineer
> TO - XaaS Ind - Worldline (Villeurbanne, France)
> 
> 2015-10-28 23:20 GMT+01:00 Yahoo <mi...@gmail.com>:
> 
>> I see exactly the same thing on one of our clusters, also running HBase
>> 0.98 (not sure of the rest of the version number since I'm not in the
>> office right now). The non-hdfs disk space slowly fills up and I failed to
>> locate the actual files using 'du'. I did a lot of googling but couldn't
>> find any other mentions of the problem at the time.
>> 
>> Mike.
>> 
>>> On 24/10/2015 04:08, Otis Gospodnetić wrote:
>>> 
>>> Hi Ted,
>>> 
>>> 0.98.6-cdh5.3.0
>>> 
>>> I did actually try to use lsof, but I didn't see anything unusual there.
>>> Is there something specific I should look for?  Things owned by hbase user
>>> or hdfs or yarn?  Hm, here, I don't really see anything interesting
>> <snip>
>> 
>> 
>>> Thanks,
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>> 
>>> 
>>> On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu <yu...@gmail.com> wrote:
>>> 
>>> Which specific release of 0.98 are you using ?
>>>> 
>>>> Have you used lsof to see which files were being held onto ?
>>>> 
>>>> Thanks
>>>> 
>>>> On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
>>>> otis.gospodnetic@gmail.com> wrote:
>>>> 
>>>> Hello,
>>>>> 
>>>>> Is/was there a known issue with HBase 0.98 "holding onto" files?
>>>>> 
>>>>> We noticed the used disk space metric going up, up and up and we could
>>>> not
>>>> 
>>>>> stop it with major compaction.
>>>>> But we noticed that if we restart a RegionServer 2 things happen:
>>>>> 1) its disk usage immediately drops a lot
>>>>> 2) the disk usage of other RegionServers drops some as well
>>>>> 
>>>>> Have a look at this chart:
>>>>>   https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
>>>>> 
>>>>> At 1:54 we restarted the first RS (blue line)
>>>>> At 2:03 we restarted the second RS (dark green line)
>>>>> 
>>>>> Is/was this a known HBase 0.98 issue?
>>>>> 
>>>>> Thanks,
>>>>> Otis
>>>>> --
>>>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>> 

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Loïc Chanel <lo...@telecomnancy.net>.
I can see that too on one of our clusters, and the thing which is really
weird is that another one of ours has the exact same configuration (as it
is the pre-production cluster) and we don't see the problem there.
I also did a lot of googling, but as we couldn't find a solution we simply
set up a cron job to periodically restart the RegionServers (to avoid filling
up the Hadoop data partitions).
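For reference, that workaround could look something like the following crontab fragment (a sketch only: the install path, schedule, and the use of HBase's graceful_stop.sh helper are assumptions about the setup, not details from this thread):

```shell
# Hypothetical /etc/cron.d/hbase-rs-restart entry: gracefully restart
# the local RegionServer on the 1st of each month at 03:00, moving
# regions off and back with HBase's graceful_stop.sh script.
# m  h  dom mon dow  user   command
0    3  1   *   *    hbase  /usr/lib/hbase/bin/graceful_stop.sh --restart --reload `hostname`
```

Staggering the dom/hour fields per host avoids restarting several RegionServers at once.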

Regards,


Loïc

Loïc CHANEL
System & virtualization engineer
TO - XaaS Ind - Worldline (Villeurbanne, France)

2015-10-28 23:20 GMT+01:00 Yahoo <mi...@gmail.com>:

> I see exactly the same thing on one of our clusters, also running HBase
> 0.98 (not sure of the rest of the version number since I'm not in the
> office right now). The non-hdfs disk space slowly fills up and I failed to
> locate the actual files using 'du'. I did a lot of googling but couldn't
> find any other mentions of the problem at the time.
>
> Mike.
>
> On 24/10/2015 04:08, Otis Gospodnetić wrote:
>
>> Hi Ted,
>>
>> 0.98.6-cdh5.3.0
>>
>> I did actually try to use lsof, but I didn't see anything unusual there.
>> Is there something specific I should look for?  Things owned by hbase user
>> or hdfs or yarn?  Hm, here, I don't really see anything interesting
>>
> <snip>
>
>
>> Thanks,
>> Otis
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>> On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu <yu...@gmail.com> wrote:
>>
>> Which specific release of 0.98 are you using ?
>>>
>>> Have you used lsof to see which files were being held onto ?
>>>
>>> Thanks
>>>
>>> On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
>>> otis.gospodnetic@gmail.com> wrote:
>>>
>>> Hello,
>>>>
>>>> Is/was there a known issue with HBase 0.98 "holding onto" files?
>>>>
>>>> We noticed the used disk space metric going up, up and up and we could
>>>>
>>> not
>>>
>>>> stop it with major compaction.
>>>> But we noticed that if we restart a RegionServer 2 things happen:
>>>> 1) its disk usage immediately drops a lot
>>>> 2) the disk usage of other RegionServers drops some as well
>>>>
>>>> Have a look at this chart:
>>>>    https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
>>>>
>>>> At 1:54 we restarted the first RS (blue line)
>>>> At 2:03 we restarted the second RS (dark green line)
>>>>
>>>> Is/was this a known HBase 0.98 issue?
>>>>
>>>> Thanks,
>>>> Otis
>>>> --
>>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>>
>>>>
>

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Yahoo <mi...@gmail.com>.
I see exactly the same thing on one of our clusters, also running HBase 
0.98 (not sure of the rest of the version number since I'm not in the 
office right now). The non-hdfs disk space slowly fills up and I failed 
to locate the actual files using 'du'. I did a lot of googling but 
couldn't find any other mentions of the problem at the time.

Mike.
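The du/df mismatch Mike describes is what you'd expect if a process holds descriptors on files whose names have been unlinked: du walks directory entries, but the blocks stay allocated until the last descriptor closes. A minimal demonstration of the effect (Linux only, since it reads back through /proc):

```shell
tmp=$(mktemp)                       # create a file...
dd if=/dev/zero of="$tmp" bs=1024 count=64 2>/dev/null   # ...64 KiB big
exec 3<"$tmp"                       # hold it open, as a long-lived JVM would
rm -f "$tmp"                        # unlink the name: du no longer sees it
held=$(wc -c < /proc/$$/fd/3)       # but the bytes are still allocated
echo "still held after rm: $held bytes"
exec 3<&-                           # only closing the fd frees the space
```

Restarting the RegionServer closes every such descriptor at once, which would match the immediate drop Otis charted.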

On 24/10/2015 04:08, Otis Gospodnetić wrote:
> Hi Ted,
>
> 0.98.6-cdh5.3.0
>
> I did actually try to use lsof, but I didn't see anything unusual there.
> Is there something specific I should look for?  Things owned by hbase user
> or hdfs or yarn?  Hm, here, I don't really see anything interesting
<snip>
>
> Thanks,
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu <yu...@gmail.com> wrote:
>
>> Which specific release of 0.98 are you using ?
>>
>> Have you used lsof to see which files were being held onto ?
>>
>> Thanks
>>
>> On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
>> otis.gospodnetic@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> Is/was there a known issue with HBase 0.98 "holding onto" files?
>>>
>>> We noticed the used disk space metric going up, up and up and we could
>> not
>>> stop it with major compaction.
>>> But we noticed that if we restart a RegionServer 2 things happen:
>>> 1) its disk usage immediately drops a lot
>>> 2) the disk usage of other RegionServers drops some as well
>>>
>>> Have a look at this chart:
>>>    https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
>>>
>>> At 1:54 we restarted the first RS (blue line)
>>> At 2:03 we restarted the second RS (dark green line)
>>>
>>> Is/was this a known HBase 0.98 issue?
>>>
>>> Thanks,
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>


Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Otis Gospodnetić <ot...@gmail.com>.
Hi Ted,

0.98.6-cdh5.3.0

I did actually try to use lsof, but I didn't see anything unusual there.
Is there something specific I should look for?  Things owned by hbase user
or hdfs or yarn?  Hm, here, I don't really see anything interesting

$ sudo lsof | grep '/mnt' <== this is where all data lives and where disk
usage drops after RS restart

java       2654      hdfs    1w      REG             202,16     89487
44042562
/mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.out
java       2654      hdfs    2w      REG             202,16     89487
44042562
/mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.out
java       2654      hdfs  286w      REG             202,16 108938205
44044137
/mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.log
java       2654      hdfs  289w      REG             202,16         0
44040203 /mnt/hadoop-hdfs/log/SecurityAuth-hdfs.audit
java       2654      hdfs  314w      REG             202,16    261462
44040213
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/dncp_block_verification.log.curr
java       2654      hdfs  316r      REG             202,16 134217728
44045060
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir58/blk_1078606358
java       2654      hdfs  318r      REG             202,16 134217728
44057015
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir224/blk_1078648930
java       2654      hdfs  319uW     REG             202,16        36
44042741 /mnt/hadoop-hdfs/data/in_use.lock
java       2654      hdfs  321r      REG             202,16   1048583
44042793
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir7/blk_1078658889_4918820.meta
java       2654      hdfs  330u      REG             202,16    352563
44048279
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675432_4935363.meta
java       2654      hdfs  333r      REG             202,16 134217728
44055769
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir9/blk_1078659381
java       2654      hdfs  335u      REG             202,16  45127168
44048273
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675432
java       2654      hdfs  340r      REG             202,16 134217728
44042791
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir7/blk_1078658889
java       2654      hdfs  343r      REG             202,16  13882119
44048053
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675385
java       2654      hdfs  345u      REG             202,16    485059
44048209
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675399_4935330.meta
java       2654      hdfs  346r      REG             202,16 134217728
44053723
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir4/blk_1078658098
java       2654      hdfs  347u      REG             202,16    371455
44047931
/mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675364_4935295.meta
java       2654      hdfs  348u      REG             202,16  47545282  44047927 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675364
java       2654      hdfs  354u      REG             202,16  20386405  44047875 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir8/blk_1078659266
java       2654      hdfs  355r      REG             202,16 134217728  44042762 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir243/blk_1078653797
java       2654      hdfs  357r      REG             202,16 134217728  44042535 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir66/blk_1078674123
java       2654      hdfs  359u      REG             202,16      1839  44045445 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078674506_4934437.meta
java       2654      hdfs  360u      REG             202,16    234130  44045440 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078674506
java       2654      hdfs  363r      REG             202,16  20629437  44046774 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir17/blk_1078661533
java       2654      hdfs  369r      REG             202,16  18304945  44047599 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675270
java       2654      hdfs  370r      REG             202,16  62086413  44048199 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675399
java       2654      hdfs  379r      REG             202,16 134217728  44050035 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir3/blk_1078657983
java       2654      hdfs  390u      REG             202,16  20857780  44050270 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir8/blk_1078659267
java       2654      hdfs  408r      REG             202,16 115453375  44042299 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir66/blk_1078674120
java       2654      hdfs  415r      REG             202,16  20253192  44053520 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir60/blk_1078672624
java       2654      hdfs  423r      REG             202,16  18382878  44047547 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675257
java       2654      hdfs  424r      REG             202,16  19555559  44040692 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir65/blk_1078673801
bash      15005  ec2-user  cwd       DIR             202,16      4096         2 /mnt
sudo      16055      root  cwd       DIR             202,16      4096         2 /mnt
grep      16056  ec2-user  cwd       DIR             202,16      4096         2 /mnt
sed       16057  ec2-user  cwd       DIR             202,16      4096         2 /mnt
lsof      16058      root  cwd       DIR             202,16      4096         2 /mnt
lsof      16059      root  cwd       DIR             202,16      4096         2 /mnt
bash      18748     hbase    1w      REG             202,16     12843   4980744 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
bash      18748     hbase    2w      REG             202,16     12843   4980744 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
java      18761     hbase    1w      REG             202,16     12843   4980744 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
java      18761     hbase    2w      REG             202,16     12843   4980744 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
java      18761     hbase  338w      REG             202,16 117537786   4980753 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.log
java      18761     hbase  339w      REG             202,16         0   4980741 /mnt/hbase/log/SecurityAuth.audit
java      29057      yarn    1w      REG             202,16    130105  51380228 /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
java      29057      yarn    2w      REG             202,16    130105  51380228 /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
java      29057      yarn  286w      REG             202,16 103611255  51380852 /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.log

I don't see anything big there...
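(Side note for anyone reading along: a plain lsof listing like the above won't flag the interesting case directly. Files that have been unlinked but are still held open show up with a "(deleted)" suffix, and `lsof +L1` restricts the output to files whose link count is below one, i.e. exactly those deleted-but-open files. A minimal sketch of the underlying Linux behavior, assuming a POSIX shell; the echoed message is illustrative, not real lsof output:)

```shell
# Sketch: disk space held by a deleted file is only reclaimed once the
# last open file descriptor on it is closed (or the process exits).
tmp=$(mktemp)
exec 3<"$tmp"          # keep a file descriptor open on the file
rm "$tmp"              # unlink it; the inode (and its blocks) stay allocated
# At this point `lsof +L1` would list the file with a "(deleted)" marker.
ls "$tmp" 2>/dev/null || echo "path gone, inode still held open"
exec 3<&-              # closing the fd finally frees the space
```

This is the same mechanism Vladimir describes: if a RegionServer keeps handles on compacted-away HFiles (or the DataNode on their blocks), `df` keeps counting them until the process restarts.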

Thanks,
Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu <yu...@gmail.com> wrote:

> Which specific release of 0.98 are you using ?
>
> Have you used lsof to see which files were being held onto ?
>
> Thanks
>
> On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
> otis.gospodnetic@gmail.com> wrote:
>
> > Hello,
> >
> > Is/was there a known issue with HBase 0.98 "holding onto" files?
> >
> > We noticed the used disk space metric going up, up and up and we could
> not
> > stop it with major compaction.
> > But we noticed that if we restart a RegionServer 2 things happen:
> > 1) its disk usage immediately drops a lot
> > 2) the disk usage of other RegionServers drops some as well
> >
> > Have a look at this chart:
> >   https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
> >
> > At 1:54 we restarted the first RS (blue line)
> > At 2:03 we restarted the second RS (dark green line)
> >
> > Is/was this a known HBase 0.98 issue?
> >
> > Thanks,
> > Otis
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
>

Re: Disk usage drops after RegionServer restart? (0.98)

Posted by Ted Yu <yu...@gmail.com>.
Which specific release of 0.98 are you using ?

Have you used lsof to see which files were being held onto ?

Thanks

On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
otis.gospodnetic@gmail.com> wrote:

> Hello,
>
> Is/was there a known issue with HBase 0.98 "holding onto" files?
>
> We noticed the used disk space metric going up, up and up and we could not
> stop it with major compaction.
> But we noticed that if we restart a RegionServer 2 things happen:
> 1) its disk usage immediately drops a lot
> 2) the disk usage of other RegionServers drops some as well
>
> Have a look at this chart:
>   https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
>
> At 1:54 we restarted the first RS (blue line)
> At 2:03 we restarted the second RS (dark green line)
>
> Is/was this a known HBase 0.98 issue?
>
> Thanks,
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>