You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Alex Baranau <al...@gmail.com> on 2012/04/30 21:48:07 UTC

RegionServer silently stops (only "issue": CMS-concurrent-mark ~80sec)

Hello,

During recent weeks I constantly see some RSs *silently* dying on our HBase
cluster. By "silently" I mean that process stops, but no errors in logs [1].

The only thing I can relate to it is long CMS-concurrent-mark: almost 80
seconds. But this should not cause issues as it is not a "stop-the-world"
process.

Any advice?

HBase: hbase-0.90.4-cdh3u3
Hadoop: 0.20.2-cdh3u3

Thank you,
Alex Baranau

[1]

last lines from RS log (no errors before too, and nothing written in *.out
file):

2012-04-30 18:52:11,806 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for agg-sa-1.3,0011|
te|dtc|\x00\x00\x00\x00\x00\x00<\x1E\x002\x00\x00\x00\x015\x9C_n\x00\x00\x00\x00\x00\x00\x00\x00\x00,1334852280902.4285f9339b520ee617c087c0fd0dbf65.
because regionserver60020.cacheFlusher; priority=-1, compaction queue size=0
2012-04-30 18:54:58,779 DEBUG
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new
createWriter -- HADOOP-6840
2012-04-30 18:54:58,779 DEBUG
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter:
Path=hdfs://xxx.ec2.internal/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335812098651,
syncFs=true, hflush=false
2012-04-30 18:54:58,874 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Roll
/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335811856672,
entries=73789, filesize=63773934. New hlog
/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335812098651
2012-04-30 18:56:31,867 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up
with memory above low water.
2012-04-30 18:56:31,867 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region
agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
due to global heap pressure
2012-04-30 18:56:31,867 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Started memstore flush for agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.,
current region memstore size 138.1m
2012-04-30 18:56:31,867 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Finished snapshotting, commencing flushing stores
2012-04-30 18:56:56,303 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=322.84 MB,
free=476.34 MB, max=799.17 MB, blocks=5024, accesses=12189396, hits=127592,
hitRatio=1.04%%, cachingAccesses=132480, cachingHits=126949,
cachingHitsRatio=95.82%%, evictions=0, evicted=0, evictedPerRun=NaN
2012-04-30 18:56:59,026 INFO org.apache.hadoop.hbase.regionserver.Store:
Renaming flushed file at
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/.tmp/391890051647401997
to
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/1139737908876846168
2012-04-30 18:56:59,034 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/1139737908876846168,
entries=476418, sequenceid=880198761, memsize=138.1m, filesize=5.7m
2012-04-30 18:56:59,097 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Finished memstore flush of ~138.1m for region agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
in 27230ms, sequenceid=880198761, compaction requested=false
~

[2]

last lines from GC log:

2012-04-30T18:58:46.683+0000: 105717.791: [GC 105717.791: [ParNew:
35638K->1118K(38336K), 0.0548970 secs] 3145651K->3111412K(4091776K)
icms_dc=6 , 0.0550360 secs] [Times: user=0.08 sys=0.00, real=0.09 secs]
2012-04-30T18:58:46.961+0000: 105718.069: [GC 105718.069: [ParNew:
35230K->2224K(38336K), 0.0802440 secs] 3145524K->3112533K(4091776K)
icms_dc=6 , 0.0803810 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-mark:
8.770/80.230 secs] [Times: user=61.34 sys=5.69, real=80.23 secs]
2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-preclean-start]
2012-04-30T18:58:47.183+0000: 105718.291: [GC 105718.291: [ParNew:
36336K->3895K(38336K), 0.0272610 secs] 3146645K->3114281K(4091776K)
icms_dc=6 , 0.0274040 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
2012-04-30T18:58:47.452+0000: 105718.560: [GC 105718.560: [ParNew:
37990K->3082K(38336K), 0.0504660 secs] 3148376K->3114329K(4091776K)
icms_dc=6 , 0.0506020 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
2012-04-30T18:58:47.823+0000: 105718.930: [GC 105718.931: [ParNew:
37194K->3340K(38336K), 0.0419660 secs] 3148441K->3115727K(4091776K)
icms_dc=6 , 0.0421110 secs] [Times: user=0.05 sys=0.00, real=0.04 secs]
2012-04-30T18:58:48.099+0000: 105719.207: [GC 105719.207: [ParNew:
37351K->3249K(38336K), 0.0695950 secs] 3149738K->3115636K(4091776K)
icms_dc=6 , 0.0697320 secs] [Times: user=0.07 sys=0.00, real=0.07 secs]
2012-04-30T18:58:48.447+0000: 105719.555: [GC 105719.555: [ParNew:
37361K->2512K(38336K), 0.0510560 secs] 3149748K->3116166K(4091776K)
icms_dc=6 , 0.0512050 secs] [Times: user=0.05 sys=0.00, real=0.00 secs]
2012-04-30T18:58:48.794+0000: 105719.902: [GC 105719.902: [ParNew:
36624K->2616K(38336K), 0.1986770 secs] 3150278K->3116270K(4091776K)
icms_dc=6 , 0.1988160 secs] [Times: user=0.05 sys=0.00, real=0.20 secs]
2012-04-30T18:58:49.614+0000: 105720.722: [GC 105720.722: [ParNew:
36728K->3450K(38336K), 0.1099190 secs] 3150382K->3117103K(4091776K)
icms_dc=6 , 0.1100650 secs] [Times: user=0.05 sys=0.00, real=0.11 secs]
2012-04-30T18:58:50.631+0000: 105721.739: [GC 105721.739: [ParNew:
37546K->3176K(38336K), 0.2096280 secs] 3151199K->3117154K(4091776K)
icms_dc=6 , 0.2097770 secs] [Times: user=0.10 sys=0.00, real=0.21 secs]
2012-04-30T18:58:51.300+0000: 105722.408: [GC 105722.408: [ParNew:
37288K->3928K(38336K), 0.0873820 secs] 3151266K->3117907K(4091776K)
icms_dc=6 , 0.0875310 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
2012-04-30T18:58:52.109+0000: 105723.217: [GC 105723.217: [ParNew:
38040K->2095K(38336K), 0.0791560 secs] 3152019K->3116839K(4091776K)
icms_dc=6 , 0.0792970 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
2012-04-30T18:58:53.132+0000: 105724.240: [GC 105724.240: [ParNew:
36207K->2581K(38336K), 0.1651210 secs] 3150951K->3117700K(4091776K)
icms_dc=6 , 0.1652650 secs] [Times: user=0.06 sys=0.00, real=0.16 secs]
2012-04-30T18:58:55.402+0000: 105726.510: [GC 105726.510: [ParNew:
36693K->2297K(38336K), 0.1262480 secs] 3151812K->3117742K(4091776K)
icms_dc=6 , 0.1264420 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
2012-04-30T18:58:56.103+0000: 105727.211: [GC 105727.211: [ParNew:
36409K->4224K(38336K), 0.3658760 secs] 3151854K->3120205K(4091776K)
icms_dc=6 , 0.3660220 secs]

Re: RegionServer silently stops (only "issue": CMS-concurrent-mark ~80sec)

Posted by N Keywal <nk...@gmail.com>.
Hi Alex,

On the same idea, note that hbase is launched with
-XX:OnOutOfMemoryError="kill -9 %p".

N.

On Tue, May 1, 2012 at 10:41 AM, Igal Shilman <ig...@wix.com> wrote:

> Hi Alex, just to rule out, oom killer,
> Try this:
>
> http://stackoverflow.com/questions/624857/finding-which-process-was-killed-by-linux-oom-killer
>
>
> On Mon, Apr 30, 2012 at 10:48 PM, Alex Baranau <alex.baranov.v@gmail.com
> >wrote:
>
> > Hello,
> >
> > During recent weeks I constantly see some RSs *silently* dying on our
> HBase
> > cluster. By "silently" I mean that process stops, but no errors in logs
> > [1].
> >
> > The only thing I can relate to it is long CMS-concurrent-mark: almost 80
> > seconds. But this should not cause issues as it is not a "stop-the-world"
> > process.
> >
> > Any advice?
> >
> > HBase: hbase-0.90.4-cdh3u3
> > Hadoop: 0.20.2-cdh3u3
> >
> > Thank you,
> > Alex Baranau
> >
> > [1]
> >
> > last lines from RS log (no errors before too, and nothing written in
> *.out
> > file):
> >
> > 2012-04-30 18:52:11,806 DEBUG
> > org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
> > requested for agg-sa-1.3,0011|
> >
> >
> te|dtc|\x00\x00\x00\x00\x00\x00<\x1E\x002\x00\x00\x00\x015\x9C_n\x00\x00\x00\x00\x00\x00\x00\x00\x00,1334852280902.4285f9339b520ee617c087c0fd0dbf65.
> > because regionserver60020.cacheFlusher; priority=-1, compaction queue
> > size=0
> > 2012-04-30 18:54:58,779 DEBUG
> > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new
> > createWriter -- HADOOP-6840
> > 2012-04-30 18:54:58,779 DEBUG
> > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter:
> >
> >
> Path=hdfs://xxx.ec2.internal/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335812098651,
> > syncFs=true, hflush=false
> > 2012-04-30 18:54:58,874 INFO
> org.apache.hadoop.hbase.regionserver.wal.HLog:
> > Roll
> >
> >
> /hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335811856672,
> > entries=73789, filesize=63773934. New hlog
> >
> >
> /hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335812098651
> > 2012-04-30 18:56:31,867 INFO
> > org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke
> up
> > with memory above low water.
> > 2012-04-30 18:56:31,867 INFO
> > org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region
> > agg-sa-1.3,s_00I4|
> >
> >
> tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
> > due to global heap pressure
> > 2012-04-30 18:56:31,867 DEBUG
> org.apache.hadoop.hbase.regionserver.HRegion:
> > Started memstore flush for agg-sa-1.3,s_00I4|
> >
> >
> tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.,
> > current region memstore size 138.1m
> > 2012-04-30 18:56:31,867 DEBUG
> org.apache.hadoop.hbase.regionserver.HRegion:
> > Finished snapshotting, commencing flushing stores
> > 2012-04-30 18:56:56,303 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=322.84
> MB,
> > free=476.34 MB, max=799.17 MB, blocks=5024, accesses=12189396,
> hits=127592,
> > hitRatio=1.04%%, cachingAccesses=132480, cachingHits=126949,
> > cachingHitsRatio=95.82%%, evictions=0, evicted=0, evictedPerRun=NaN
> > 2012-04-30 18:56:59,026 INFO org.apache.hadoop.hbase.regionserver.Store:
> > Renaming flushed file at
> >
> >
> hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/.tmp/391890051647401997
> > to
> >
> >
> hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/1139737908876846168
> > 2012-04-30 18:56:59,034 INFO org.apache.hadoop.hbase.regionserver.Store:
> > Added
> >
> >
> hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/1139737908876846168,
> > entries=476418, sequenceid=880198761, memsize=138.1m, filesize=5.7m
> > 2012-04-30 18:56:59,097 INFO
> org.apache.hadoop.hbase.regionserver.HRegion:
> > Finished memstore flush of ~138.1m for region agg-sa-1.3,s_00I4|
> >
> >
> tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
> > in 27230ms, sequenceid=880198761, compaction requested=false
> > ~
> >
> > [2]
> >
> > last lines from GC log:
> >
> > 2012-04-30T18:58:46.683+0000: 105717.791: [GC 105717.791: [ParNew:
> > 35638K->1118K(38336K), 0.0548970 secs] 3145651K->3111412K(4091776K)
> > icms_dc=6 , 0.0550360 secs] [Times: user=0.08 sys=0.00, real=0.09 secs]
> > 2012-04-30T18:58:46.961+0000: 105718.069: [GC 105718.069: [ParNew:
> > 35230K->2224K(38336K), 0.0802440 secs] 3145524K->3112533K(4091776K)
> > icms_dc=6 , 0.0803810 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
> > 2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-mark:
> > 8.770/80.230 secs] [Times: user=61.34 sys=5.69, real=80.23 secs]
> > 2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-preclean-start]
> > 2012-04-30T18:58:47.183+0000: 105718.291: [GC 105718.291: [ParNew:
> > 36336K->3895K(38336K), 0.0272610 secs] 3146645K->3114281K(4091776K)
> > icms_dc=6 , 0.0274040 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
> > 2012-04-30T18:58:47.452+0000: 105718.560: [GC 105718.560: [ParNew:
> > 37990K->3082K(38336K), 0.0504660 secs] 3148376K->3114329K(4091776K)
> > icms_dc=6 , 0.0506020 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
> > 2012-04-30T18:58:47.823+0000: 105718.930: [GC 105718.931: [ParNew:
> > 37194K->3340K(38336K), 0.0419660 secs] 3148441K->3115727K(4091776K)
> > icms_dc=6 , 0.0421110 secs] [Times: user=0.05 sys=0.00, real=0.04 secs]
> > 2012-04-30T18:58:48.099+0000: 105719.207: [GC 105719.207: [ParNew:
> > 37351K->3249K(38336K), 0.0695950 secs] 3149738K->3115636K(4091776K)
> > icms_dc=6 , 0.0697320 secs] [Times: user=0.07 sys=0.00, real=0.07 secs]
> > 2012-04-30T18:58:48.447+0000: 105719.555: [GC 105719.555: [ParNew:
> > 37361K->2512K(38336K), 0.0510560 secs] 3149748K->3116166K(4091776K)
> > icms_dc=6 , 0.0512050 secs] [Times: user=0.05 sys=0.00, real=0.00 secs]
> > 2012-04-30T18:58:48.794+0000: 105719.902: [GC 105719.902: [ParNew:
> > 36624K->2616K(38336K), 0.1986770 secs] 3150278K->3116270K(4091776K)
> > icms_dc=6 , 0.1988160 secs] [Times: user=0.05 sys=0.00, real=0.20 secs]
> > 2012-04-30T18:58:49.614+0000: 105720.722: [GC 105720.722: [ParNew:
> > 36728K->3450K(38336K), 0.1099190 secs] 3150382K->3117103K(4091776K)
> > icms_dc=6 , 0.1100650 secs] [Times: user=0.05 sys=0.00, real=0.11 secs]
> > 2012-04-30T18:58:50.631+0000: 105721.739: [GC 105721.739: [ParNew:
> > 37546K->3176K(38336K), 0.2096280 secs] 3151199K->3117154K(4091776K)
> > icms_dc=6 , 0.2097770 secs] [Times: user=0.10 sys=0.00, real=0.21 secs]
> > 2012-04-30T18:58:51.300+0000: 105722.408: [GC 105722.408: [ParNew:
> > 37288K->3928K(38336K), 0.0873820 secs] 3151266K->3117907K(4091776K)
> > icms_dc=6 , 0.0875310 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
> > 2012-04-30T18:58:52.109+0000: 105723.217: [GC 105723.217: [ParNew:
> > 38040K->2095K(38336K), 0.0791560 secs] 3152019K->3116839K(4091776K)
> > icms_dc=6 , 0.0792970 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
> > 2012-04-30T18:58:53.132+0000: 105724.240: [GC 105724.240: [ParNew:
> > 36207K->2581K(38336K), 0.1651210 secs] 3150951K->3117700K(4091776K)
> > icms_dc=6 , 0.1652650 secs] [Times: user=0.06 sys=0.00, real=0.16 secs]
> > 2012-04-30T18:58:55.402+0000: 105726.510: [GC 105726.510: [ParNew:
> > 36693K->2297K(38336K), 0.1262480 secs] 3151812K->3117742K(4091776K)
> > icms_dc=6 , 0.1264420 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
> > 2012-04-30T18:58:56.103+0000: 105727.211: [GC 105727.211: [ParNew:
> > 36409K->4224K(38336K), 0.3658760 secs] 3151854K->3120205K(4091776K)
> > icms_dc=6 , 0.3660220 secs]
> >
>

Re: RegionServer silently stops (only "issue": CMS-concurrent-mark ~80sec)

Posted by Igal Shilman <ig...@wix.com>.
Hi Alex, just to rule out, oom killer,
Try this:
http://stackoverflow.com/questions/624857/finding-which-process-was-killed-by-linux-oom-killer


On Mon, Apr 30, 2012 at 10:48 PM, Alex Baranau <al...@gmail.com>wrote:

> Hello,
>
> During recent weeks I constantly see some RSs *silently* dying on our HBase
> cluster. By "silently" I mean that process stops, but no errors in logs
> [1].
>
> The only thing I can relate to it is long CMS-concurrent-mark: almost 80
> seconds. But this should not cause issues as it is not a "stop-the-world"
> process.
>
> Any advice?
>
> HBase: hbase-0.90.4-cdh3u3
> Hadoop: 0.20.2-cdh3u3
>
> Thank you,
> Alex Baranau
>
> [1]
>
> last lines from RS log (no errors before too, and nothing written in *.out
> file):
>
> 2012-04-30 18:52:11,806 DEBUG
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
> requested for agg-sa-1.3,0011|
>
> te|dtc|\x00\x00\x00\x00\x00\x00<\x1E\x002\x00\x00\x00\x015\x9C_n\x00\x00\x00\x00\x00\x00\x00\x00\x00,1334852280902.4285f9339b520ee617c087c0fd0dbf65.
> because regionserver60020.cacheFlusher; priority=-1, compaction queue
> size=0
> 2012-04-30 18:54:58,779 DEBUG
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new
> createWriter -- HADOOP-6840
> 2012-04-30 18:54:58,779 DEBUG
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter:
>
> Path=hdfs://xxx.ec2.internal/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335812098651,
> syncFs=true, hflush=false
> 2012-04-30 18:54:58,874 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
> Roll
>
> /hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335811856672,
> entries=73789, filesize=63773934. New hlog
>
> /hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1335812098651
> 2012-04-30 18:56:31,867 INFO
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up
> with memory above low water.
> 2012-04-30 18:56:31,867 INFO
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region
> agg-sa-1.3,s_00I4|
>
> tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
> due to global heap pressure
> 2012-04-30 18:56:31,867 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Started memstore flush for agg-sa-1.3,s_00I4|
>
> tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.,
> current region memstore size 138.1m
> 2012-04-30 18:56:31,867 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Finished snapshotting, commencing flushing stores
> 2012-04-30 18:56:56,303 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=322.84 MB,
> free=476.34 MB, max=799.17 MB, blocks=5024, accesses=12189396, hits=127592,
> hitRatio=1.04%%, cachingAccesses=132480, cachingHits=126949,
> cachingHitsRatio=95.82%%, evictions=0, evicted=0, evictedPerRun=NaN
> 2012-04-30 18:56:59,026 INFO org.apache.hadoop.hbase.regionserver.Store:
> Renaming flushed file at
>
> hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/.tmp/391890051647401997
> to
>
> hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/1139737908876846168
> 2012-04-30 18:56:59,034 INFO org.apache.hadoop.hbase.regionserver.Store:
> Added
>
> hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/1139737908876846168,
> entries=476418, sequenceid=880198761, memsize=138.1m, filesize=5.7m
> 2012-04-30 18:56:59,097 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Finished memstore flush of ~138.1m for region agg-sa-1.3,s_00I4|
>
> tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
> in 27230ms, sequenceid=880198761, compaction requested=false
> ~
>
> [2]
>
> last lines from GC log:
>
> 2012-04-30T18:58:46.683+0000: 105717.791: [GC 105717.791: [ParNew:
> 35638K->1118K(38336K), 0.0548970 secs] 3145651K->3111412K(4091776K)
> icms_dc=6 , 0.0550360 secs] [Times: user=0.08 sys=0.00, real=0.09 secs]
> 2012-04-30T18:58:46.961+0000: 105718.069: [GC 105718.069: [ParNew:
> 35230K->2224K(38336K), 0.0802440 secs] 3145524K->3112533K(4091776K)
> icms_dc=6 , 0.0803810 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
> 2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-mark:
> 8.770/80.230 secs] [Times: user=61.34 sys=5.69, real=80.23 secs]
> 2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-preclean-start]
> 2012-04-30T18:58:47.183+0000: 105718.291: [GC 105718.291: [ParNew:
> 36336K->3895K(38336K), 0.0272610 secs] 3146645K->3114281K(4091776K)
> icms_dc=6 , 0.0274040 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
> 2012-04-30T18:58:47.452+0000: 105718.560: [GC 105718.560: [ParNew:
> 37990K->3082K(38336K), 0.0504660 secs] 3148376K->3114329K(4091776K)
> icms_dc=6 , 0.0506020 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
> 2012-04-30T18:58:47.823+0000: 105718.930: [GC 105718.931: [ParNew:
> 37194K->3340K(38336K), 0.0419660 secs] 3148441K->3115727K(4091776K)
> icms_dc=6 , 0.0421110 secs] [Times: user=0.05 sys=0.00, real=0.04 secs]
> 2012-04-30T18:58:48.099+0000: 105719.207: [GC 105719.207: [ParNew:
> 37351K->3249K(38336K), 0.0695950 secs] 3149738K->3115636K(4091776K)
> icms_dc=6 , 0.0697320 secs] [Times: user=0.07 sys=0.00, real=0.07 secs]
> 2012-04-30T18:58:48.447+0000: 105719.555: [GC 105719.555: [ParNew:
> 37361K->2512K(38336K), 0.0510560 secs] 3149748K->3116166K(4091776K)
> icms_dc=6 , 0.0512050 secs] [Times: user=0.05 sys=0.00, real=0.00 secs]
> 2012-04-30T18:58:48.794+0000: 105719.902: [GC 105719.902: [ParNew:
> 36624K->2616K(38336K), 0.1986770 secs] 3150278K->3116270K(4091776K)
> icms_dc=6 , 0.1988160 secs] [Times: user=0.05 sys=0.00, real=0.20 secs]
> 2012-04-30T18:58:49.614+0000: 105720.722: [GC 105720.722: [ParNew:
> 36728K->3450K(38336K), 0.1099190 secs] 3150382K->3117103K(4091776K)
> icms_dc=6 , 0.1100650 secs] [Times: user=0.05 sys=0.00, real=0.11 secs]
> 2012-04-30T18:58:50.631+0000: 105721.739: [GC 105721.739: [ParNew:
> 37546K->3176K(38336K), 0.2096280 secs] 3151199K->3117154K(4091776K)
> icms_dc=6 , 0.2097770 secs] [Times: user=0.10 sys=0.00, real=0.21 secs]
> 2012-04-30T18:58:51.300+0000: 105722.408: [GC 105722.408: [ParNew:
> 37288K->3928K(38336K), 0.0873820 secs] 3151266K->3117907K(4091776K)
> icms_dc=6 , 0.0875310 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
> 2012-04-30T18:58:52.109+0000: 105723.217: [GC 105723.217: [ParNew:
> 38040K->2095K(38336K), 0.0791560 secs] 3152019K->3116839K(4091776K)
> icms_dc=6 , 0.0792970 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
> 2012-04-30T18:58:53.132+0000: 105724.240: [GC 105724.240: [ParNew:
> 36207K->2581K(38336K), 0.1651210 secs] 3150951K->3117700K(4091776K)
> icms_dc=6 , 0.1652650 secs] [Times: user=0.06 sys=0.00, real=0.16 secs]
> 2012-04-30T18:58:55.402+0000: 105726.510: [GC 105726.510: [ParNew:
> 36693K->2297K(38336K), 0.1262480 secs] 3151812K->3117742K(4091776K)
> icms_dc=6 , 0.1264420 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
> 2012-04-30T18:58:56.103+0000: 105727.211: [GC 105727.211: [ParNew:
> 36409K->4224K(38336K), 0.3658760 secs] 3151854K->3120205K(4091776K)
> icms_dc=6 , 0.3660220 secs]
>

Re: RegionServer silently stops (only "issue": CMS-concurrent-mark ~80sec)

Posted by Christian Schäfer <sy...@yahoo.de>.
An efficient way would be 

egrep -i 'killed process' /var/log/{log file}

Whereas you likely checked your logs.


regards
Christian


________________________________
 Von: Gopinathan A <go...@huawei.com>
An: alex.baranov.v@gmail.com 
CC: user@hbase.apache.org; hbase-user@hadoop.apache.org 
Gesendet: 7:15 Mittwoch, 2.Mai 2012
Betreff: RE: RegionServer silently stops (only "issue": CMS-concurrent-mark ~80sec)
 
Hi Alex,

You can check the system logs (/var/log/), whether system issued any kill
command for that process. This can happen if the resource utilization is
more from the RS (like memory usage) for very long time. 

Are you using GZIP compression? 

Thanks & Regards,
Gopinathan A

****************************************************************************
***********
This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient's) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!


-----Original Message-----
From: Alex Baranau [mailto:alex.baranov.v@gmail.com] 
Sent: Tuesday, May 01, 2012 1:18 AM
To: hbase-user@hadoop.apache.org
Subject: RegionServer silently stops (only "issue": CMS-concurrent-mark
~80sec)

Hello,

During recent weeks I constantly see some RSs *silently* dying on our HBase
cluster. By "silently" I mean that process stops, but no errors in logs [1].

The only thing I can relate to it is long CMS-concurrent-mark: almost 80
seconds. But this should not cause issues as it is not a "stop-the-world"
process.

Any advice?

HBase: hbase-0.90.4-cdh3u3
Hadoop: 0.20.2-cdh3u3

Thank you,
Alex Baranau

[1]

last lines from RS log (no errors before too, and nothing written in *.out
file):

2012-04-30 18:52:11,806 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for agg-sa-1.3,0011|
te|dtc|\x00\x00\x00\x00\x00\x00<\x1E\x002\x00\x00\x00\x015\x9C_n\x00\x00\x00
\x00\x00\x00\x00\x00\x00,1334852280902.4285f9339b520ee617c087c0fd0dbf65.
because regionserver60020.cacheFlusher; priority=-1, compaction queue size=0
2012-04-30 18:54:58,779 DEBUG
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new
createWriter -- HADOOP-6840
2012-04-30 18:54:58,779 DEBUG
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter:
Path=hdfs://xxx.ec2.internal/hbase/.logs/xxx.ec2.internal,60020,133570661339
7/xxx.ec2.internal%3A60020.1335812098651,
syncFs=true, hflush=false
2012-04-30 18:54:58,874 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Roll
/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1
335811856672,
entries=73789, filesize=63773934. New hlog
/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1
335812098651
2012-04-30 18:56:31,867 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up
with memory above low water.
2012-04-30 18:56:31,867 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region
agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562
\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00
\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
due to global heap pressure
2012-04-30 18:56:31,867 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Started memstore flush for agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562
\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00
\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
,
current region memstore size 138.1m
2012-04-30 18:56:31,867 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Finished snapshotting, commencing flushing stores
2012-04-30 18:56:56,303 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=322.84 MB,
free=476.34 MB, max=799.17 MB, blocks=5024, accesses=12189396, hits=127592,
hitRatio=1.04%%, cachingAccesses=132480, cachingHits=126949,
cachingHitsRatio=95.82%%, evictions=0, evicted=0, evictedPerRun=NaN
2012-04-30 18:56:59,026 INFO org.apache.hadoop.hbase.regionserver.Store:
Renaming flushed file at
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/.t
mp/391890051647401997
to
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/
1139737908876846168
2012-04-30 18:56:59,034 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/
1139737908876846168,
entries=476418, sequenceid=880198761, memsize=138.1m, filesize=5.7m
2012-04-30 18:56:59,097 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Finished memstore flush of ~138.1m for region agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562
\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00
\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
in 27230ms, sequenceid=880198761, compaction requested=false
~

[2]

last lines from GC log:

2012-04-30T18:58:46.683+0000: 105717.791: [GC 105717.791: [ParNew:
35638K->1118K(38336K), 0.0548970 secs] 3145651K->3111412K(4091776K)
icms_dc=6 , 0.0550360 secs] [Times: user=0.08 sys=0.00, real=0.09 secs]
2012-04-30T18:58:46.961+0000: 105718.069: [GC 105718.069: [ParNew:
35230K->2224K(38336K), 0.0802440 secs] 3145524K->3112533K(4091776K)
icms_dc=6 , 0.0803810 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-mark:
8.770/80.230 secs] [Times: user=61.34 sys=5.69, real=80.23 secs]
2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-preclean-start]
2012-04-30T18:58:47.183+0000: 105718.291: [GC 105718.291: [ParNew:
36336K->3895K(38336K), 0.0272610 secs] 3146645K->3114281K(4091776K)
icms_dc=6 , 0.0274040 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
2012-04-30T18:58:47.452+0000: 105718.560: [GC 105718.560: [ParNew:
37990K->3082K(38336K), 0.0504660 secs] 3148376K->3114329K(4091776K)
icms_dc=6 , 0.0506020 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
2012-04-30T18:58:47.823+0000: 105718.930: [GC 105718.931: [ParNew:
37194K->3340K(38336K), 0.0419660 secs] 3148441K->3115727K(4091776K)
icms_dc=6 , 0.0421110 secs] [Times: user=0.05 sys=0.00, real=0.04 secs]
2012-04-30T18:58:48.099+0000: 105719.207: [GC 105719.207: [ParNew:
37351K->3249K(38336K), 0.0695950 secs] 3149738K->3115636K(4091776K)
icms_dc=6 , 0.0697320 secs] [Times: user=0.07 sys=0.00, real=0.07 secs]
2012-04-30T18:58:48.447+0000: 105719.555: [GC 105719.555: [ParNew:
37361K->2512K(38336K), 0.0510560 secs] 3149748K->3116166K(4091776K)
icms_dc=6 , 0.0512050 secs] [Times: user=0.05 sys=0.00, real=0.00 secs]
2012-04-30T18:58:48.794+0000: 105719.902: [GC 105719.902: [ParNew:
36624K->2616K(38336K), 0.1986770 secs] 3150278K->3116270K(4091776K)
icms_dc=6 , 0.1988160 secs] [Times: user=0.05 sys=0.00, real=0.20 secs]
2012-04-30T18:58:49.614+0000: 105720.722: [GC 105720.722: [ParNew:
36728K->3450K(38336K), 0.1099190 secs] 3150382K->3117103K(4091776K)
icms_dc=6 , 0.1100650 secs] [Times: user=0.05 sys=0.00, real=0.11 secs]
2012-04-30T18:58:50.631+0000: 105721.739: [GC 105721.739: [ParNew:
37546K->3176K(38336K), 0.2096280 secs] 3151199K->3117154K(4091776K)
icms_dc=6 , 0.2097770 secs] [Times: user=0.10 sys=0.00, real=0.21 secs]
2012-04-30T18:58:51.300+0000: 105722.408: [GC 105722.408: [ParNew:
37288K->3928K(38336K), 0.0873820 secs] 3151266K->3117907K(4091776K)
icms_dc=6 , 0.0875310 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
2012-04-30T18:58:52.109+0000: 105723.217: [GC 105723.217: [ParNew:
38040K->2095K(38336K), 0.0791560 secs] 3152019K->3116839K(4091776K)
icms_dc=6 , 0.0792970 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
2012-04-30T18:58:53.132+0000: 105724.240: [GC 105724.240: [ParNew:
36207K->2581K(38336K), 0.1651210 secs] 3150951K->3117700K(4091776K)
icms_dc=6 , 0.1652650 secs] [Times: user=0.06 sys=0.00, real=0.16 secs]
2012-04-30T18:58:55.402+0000: 105726.510: [GC 105726.510: [ParNew:
36693K->2297K(38336K), 0.1262480 secs] 3151812K->3117742K(4091776K)
icms_dc=6 , 0.1264420 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
2012-04-30T18:58:56.103+0000: 105727.211: [GC 105727.211: [ParNew:
36409K->4224K(38336K), 0.3658760 secs] 3151854K->3120205K(4091776K)
icms_dc=6 , 0.3660220 secs]

RE: RegionServer silently stops (only "issue": CMS-concurrent-mark ~80sec)

Posted by Gopinathan A <go...@huawei.com>.
Hi Alex,

You can check the system logs (/var/log/), whether system issued any kill
command for that process. This can happen if the resource utilization is
more from the RS (like memory usage) for very long time. 

Are you using GZIP compression? 

Thanks & Regards,
Gopinathan A

****************************************************************************
***********
This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient's) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!


-----Original Message-----
From: Alex Baranau [mailto:alex.baranov.v@gmail.com] 
Sent: Tuesday, May 01, 2012 1:18 AM
To: hbase-user@hadoop.apache.org
Subject: RegionServer silently stops (only "issue": CMS-concurrent-mark
~80sec)

Hello,

During recent weeks I constantly see some RSs *silently* dying on our HBase
cluster. By "silently" I mean that process stops, but no errors in logs [1].

The only thing I can relate to it is long CMS-concurrent-mark: almost 80
seconds. But this should not cause issues as it is not a "stop-the-world"
process.

Any advice?

HBase: hbase-0.90.4-cdh3u3
Hadoop: 0.20.2-cdh3u3

Thank you,
Alex Baranau

[1]

last lines from RS log (no errors before too, and nothing written in *.out
file):

2012-04-30 18:52:11,806 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for agg-sa-1.3,0011|
te|dtc|\x00\x00\x00\x00\x00\x00<\x1E\x002\x00\x00\x00\x015\x9C_n\x00\x00\x00
\x00\x00\x00\x00\x00\x00,1334852280902.4285f9339b520ee617c087c0fd0dbf65.
because regionserver60020.cacheFlusher; priority=-1, compaction queue size=0
2012-04-30 18:54:58,779 DEBUG
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new
createWriter -- HADOOP-6840
2012-04-30 18:54:58,779 DEBUG
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter:
Path=hdfs://xxx.ec2.internal/hbase/.logs/xxx.ec2.internal,60020,133570661339
7/xxx.ec2.internal%3A60020.1335812098651,
syncFs=true, hflush=false
2012-04-30 18:54:58,874 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Roll
/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1
335811856672,
entries=73789, filesize=63773934. New hlog
/hbase/.logs/xxx.ec2.internal,60020,1335706613397/xxx.ec2.internal%3A60020.1
335812098651
2012-04-30 18:56:31,867 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up
with memory above low water.
2012-04-30 18:56:31,867 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region
agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562
\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00
\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
due to global heap pressure
2012-04-30 18:56:31,867 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Started memstore flush for agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562
\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00
\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
,
current region memstore size 138.1m
2012-04-30 18:56:31,867 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Finished snapshotting, commencing flushing stores
2012-04-30 18:56:56,303 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=322.84 MB,
free=476.34 MB, max=799.17 MB, blocks=5024, accesses=12189396, hits=127592,
hitRatio=1.04%%, cachingAccesses=132480, cachingHits=126949,
cachingHitsRatio=95.82%%, evictions=0, evicted=0, evictedPerRun=NaN
2012-04-30 18:56:59,026 INFO org.apache.hadoop.hbase.regionserver.Store:
Renaming flushed file at
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/.t
mp/391890051647401997
to
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/
1139737908876846168
2012-04-30 18:56:59,034 INFO org.apache.hadoop.hbase.regionserver.Store:
Added
hdfs://zzz.ec2.internal/hbase/agg-sa-1.3/30b127193485342359eadf1586819805/a/
1139737908876846168,
entries=476418, sequenceid=880198761, memsize=138.1m, filesize=5.7m
2012-04-30 18:56:59,097 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Finished memstore flush of ~138.1m for region agg-sa-1.3,s_00I4|
tdqc\x00docs|mrtdocs|\x00\x00\x00\x00\x00\x03\x11\xF4\x00none\x00|1334692562
\x00\x0D\xE0\xB6\xB3\xA7c\xFF\xBC|26837373\x00\x00\x00\x016\xC1\xE0D\xBE\x00
\x00\x00\x00\x00\x00\x00\x00,1335761291026.30b127193485342359eadf1586819805.
in 27230ms, sequenceid=880198761, compaction requested=false
~

[2]

last lines from GC log:

2012-04-30T18:58:46.683+0000: 105717.791: [GC 105717.791: [ParNew:
35638K->1118K(38336K), 0.0548970 secs] 3145651K->3111412K(4091776K)
icms_dc=6 , 0.0550360 secs] [Times: user=0.08 sys=0.00, real=0.09 secs]
2012-04-30T18:58:46.961+0000: 105718.069: [GC 105718.069: [ParNew:
35230K->2224K(38336K), 0.0802440 secs] 3145524K->3112533K(4091776K)
icms_dc=6 , 0.0803810 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-mark:
8.770/80.230 secs] [Times: user=61.34 sys=5.69, real=80.23 secs]
2012-04-30T18:58:47.114+0000: 105718.222: [CMS-concurrent-preclean-start]
2012-04-30T18:58:47.183+0000: 105718.291: [GC 105718.291: [ParNew:
36336K->3895K(38336K), 0.0272610 secs] 3146645K->3114281K(4091776K)
icms_dc=6 , 0.0274040 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
2012-04-30T18:58:47.452+0000: 105718.560: [GC 105718.560: [ParNew:
37990K->3082K(38336K), 0.0504660 secs] 3148376K->3114329K(4091776K)
icms_dc=6 , 0.0506020 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
2012-04-30T18:58:47.823+0000: 105718.930: [GC 105718.931: [ParNew:
37194K->3340K(38336K), 0.0419660 secs] 3148441K->3115727K(4091776K)
icms_dc=6 , 0.0421110 secs] [Times: user=0.05 sys=0.00, real=0.04 secs]
2012-04-30T18:58:48.099+0000: 105719.207: [GC 105719.207: [ParNew:
37351K->3249K(38336K), 0.0695950 secs] 3149738K->3115636K(4091776K)
icms_dc=6 , 0.0697320 secs] [Times: user=0.07 sys=0.00, real=0.07 secs]
2012-04-30T18:58:48.447+0000: 105719.555: [GC 105719.555: [ParNew:
37361K->2512K(38336K), 0.0510560 secs] 3149748K->3116166K(4091776K)
icms_dc=6 , 0.0512050 secs] [Times: user=0.05 sys=0.00, real=0.00 secs]
2012-04-30T18:58:48.794+0000: 105719.902: [GC 105719.902: [ParNew:
36624K->2616K(38336K), 0.1986770 secs] 3150278K->3116270K(4091776K)
icms_dc=6 , 0.1988160 secs] [Times: user=0.05 sys=0.00, real=0.20 secs]
2012-04-30T18:58:49.614+0000: 105720.722: [GC 105720.722: [ParNew:
36728K->3450K(38336K), 0.1099190 secs] 3150382K->3117103K(4091776K)
icms_dc=6 , 0.1100650 secs] [Times: user=0.05 sys=0.00, real=0.11 secs]
2012-04-30T18:58:50.631+0000: 105721.739: [GC 105721.739: [ParNew:
37546K->3176K(38336K), 0.2096280 secs] 3151199K->3117154K(4091776K)
icms_dc=6 , 0.2097770 secs] [Times: user=0.10 sys=0.00, real=0.21 secs]
2012-04-30T18:58:51.300+0000: 105722.408: [GC 105722.408: [ParNew:
37288K->3928K(38336K), 0.0873820 secs] 3151266K->3117907K(4091776K)
icms_dc=6 , 0.0875310 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
2012-04-30T18:58:52.109+0000: 105723.217: [GC 105723.217: [ParNew:
38040K->2095K(38336K), 0.0791560 secs] 3152019K->3116839K(4091776K)
icms_dc=6 , 0.0792970 secs] [Times: user=0.05 sys=0.00, real=0.08 secs]
2012-04-30T18:58:53.132+0000: 105724.240: [GC 105724.240: [ParNew:
36207K->2581K(38336K), 0.1651210 secs] 3150951K->3117700K(4091776K)
icms_dc=6 , 0.1652650 secs] [Times: user=0.06 sys=0.00, real=0.16 secs]
2012-04-30T18:58:55.402+0000: 105726.510: [GC 105726.510: [ParNew:
36693K->2297K(38336K), 0.1262480 secs] 3151812K->3117742K(4091776K)
icms_dc=6 , 0.1264420 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
2012-04-30T18:58:56.103+0000: 105727.211: [GC 105727.211: [ParNew:
36409K->4224K(38336K), 0.3658760 secs] 3151854K->3120205K(4091776K)
icms_dc=6 , 0.3660220 secs]