You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Andrew Purtell <ap...@apache.org> on 2009/10/29 15:33:51 UTC

possible memstore related OOME

FYI, a detailed report from a Trend RD team as email thread below.

Also, a weird NPE at HRegionServer.getClosestRowBefore(HRegionServer.java:1758) during compaction of META.

2009-10-28 09:59:10,653 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region .META.,,1
2009-10-28 09:59:10,687 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.lang.NullPointerException
2009-10-28 09:59:10,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC
Server handler 0 on 60020, call getClosestRowBefore([B@431d7fb1,
[B@45aa5fe3, [B@347d53d0) from 172.21.12.41:44655: error:
java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1758)
        at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
Caused by: java.lang.NullPointerException
2009-10-28 09:59:10,697 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region .META.,,1 in 0sec

---- Forwarded Message ----
    
From:       Andrew Purtell (RD-US)
  Sent:       Thu 10/29/2009 4:44 PM
Subject:       RE: jhat (was: RE: HBase 0.20.0 region server crash due to out of memory error)

Top item on the regionserver which went down due to OOME is byte[] -- class [B 2879025 3938815412 (3.6 GB) -- with top referencer org.apache.hadoop.hbase.KeyValue 1539539 (about 53% of all references). That is value data.

There are a lot of ConcurrentSkipListMap$Node objects also. The major user of this data type if you trace it through in jhat most likely will be KeyValueSkipListSet -> MemStore.

According to the metrics dumped at OOME time:

 > regions=342, stores=1506, storefiles=892, storefileIndexSize=196, memstoreSize=606, usedHeap=3983, maxHeap=3983,
 > blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0, blockCacheHitRatio=0

606 MB memstore plus 7 MB block cache plus 196 MB storeFileIndex != 3.6 GB of byte[]. Something does not add up.

(Do disable the block cache as I described in the other email. It is not helping at all with a hit ratio of 0%.)

The MemStoreScanner line reminds me of a recently filed issue:

  > class org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner 77723   2798028

I heard about this issue while traveling but forgot about it.  See https://issues.apache.org/jira/browse/HBASE-1927 . Fixed already on the 0.20 branch and scheduled for 0.20.2 release. Current branch head is available to test now if this has any impact on the OOME issue. There could be some impact because MemStoreScanners cache one row of values, which will hold references to ConcurrentSkipListMap$Node. This is not accounted for in the metrics.
 
That the problem manifests during compaction with associated messages such as:

  > 2009-10-28 09:54:18,365 WARN org.apache.hadoop.hbase.regionserver.HLog:
  > IPC Server handler 3 on 60020 took 4786ms appending an edit to hlog; editcount=6175

  > 2009-10-28 09:55:29,502 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 24610ms,
  > ten times longer than scheduled: 1000

indicating very slow HDFS (~5 seconds per write) and thread starvation also suggests MemStore flushing may have fallen behind. Combine long delayed flushing with KeyValue heap load on the read side (scanners), and this may explain all of the heap usage, simply insufficient RAM/heap to contain all of the workload. Scanner leak may be a contributing factor. Snapshots for flushing are not accounted for in the metrics either.

Let's try to sort out what may be memory leak related and what may be just overloading. Will you consider testing the latest 0.20 branch revision in HBase project SVN?

I will forward this to hbase-dev@ with your email addresses removed.

Best regards,

    - Andy


-----Original Message-----
Sent: Thu 10/29/2009 12:51 PM
To: Andrew Purtell (RD-US)
Subject: RE: jhat (was: RE: HBase 0.20.0 region server crash due to out of memory error)

Dear Andy,

Here is the top ten classes in heap histogram got from jhat. Please kindly tell me if you need more information.

Class   Instance Count  Total Size
class [B        2879025         3938815412
class java.util.concurrent.ConcurrentSkipListMap$Node   1472184         35332416
class org.apache.hadoop.hbase.KeyValue  1539539         24632624
class java.util.concurrent.ConcurrentSkipListMap$Index  734903  17637672
class [J        895     10660592
class [[B       892     10660120
class [Ljava.lang.Object;       82608   8947128
class [I        2992    5467740
class [C        37504   3125808
class org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner     77723   2798028

References by Type of class[B:
class [B [0x2aaba8220738]
Referrers by Type
Class   Count
org.apache.hadoop.hbase.KeyValue        1539539
[[B     1330731
org.apache.hadoop.hbase.io.ImmutableBytesWritable       22504
org.apache.hadoop.hdfs.DFSClient$BlockReader    2605
java.util.TreeMap$Entry         1938
java.io.DataInputStream         1790
org.apache.hadoop.hdfs.DFSClient$DFSDataInputStream     1784
java.util.concurrent.ConcurrentSkipListMap$Node         1559
org.apache.hadoop.hbase.HColumnDescriptor       1509
org.apache.hadoop.hbase.regionserver.Store      1506
org.apache.hadoop.hbase.HRegionInfo     1032

Best Regards,
Robin Tang

-----Original Message-----
From: Andrew Purtell (RD-US)
Sent: 2009?10?29? 11:56
Subject: jhat (was: RE: HBase 0.20.0 region server crash due to out of memory error)

Hi Robin,

Do you know how to use jhat to read a heap dump from a regionserver (written automatically when it OOMEs) and get a histogram of per object heap usage?

Best regards,

    - Andy

-----Original Message-----

Sent: 2009?10?28? 19:17
To: Andrew Purtell (RD-US)
Subject: HBase 0.20.0 region server crash due to out of memory error
Importance: High

Dear Andy,

In TSS testing, after processing about 1.5 million urls, two of four region servers crashed.


69.174.253.200:

2009-10-28 09:52:06,082 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region urls,002007360827102009-0000000000000000037-172.21.12.42-c74ec67ed7f2c42b948ae96a33fd59af,1256636471071 in 15sec
2009-10-28 09:52:06,082 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region result_hash,,1256524756227
2009-10-28 09:52:09,814 INFO org.apache.hadoop.hbase.regionserver.HLog: Roll /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256723488679, entries=27549, calcsize=63900120, filesize=60961579. New hlog /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256723529786
2009-10-28 09:52:09,814 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722212390 whose highest sequence/edit id is 322578992
2009-10-28 09:52:09,820 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722224590 whose highest sequence/edit id is 322702111
2009-10-28 09:52:09,826 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722235891 whose highest sequence/edit id is 322817821
2009-10-28 09:52:09,833 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722254773 whose highest sequence/edit id is 322908772
2009-10-28 09:52:09,842 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722320154 whose highest sequence/edit id is 323030355
2009-10-28 09:54:18,365 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 3 on 60020 took 4786ms appending an edit to hlog; editcount=6175
2009-10-28 09:55:29,502 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 24610ms, ten times longer than scheduled: 1000
2009-10-28 09:55:29,503 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=342, stores=1506, storefiles=892, storefileIndexSize=196, memstoreSize=606, usedHeap=3983, maxHeap=3983, blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0, blockCacheHitRatio=0
2009-10-28 09:55:29,503 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Set stop flag in regionserver/172.21.12.39:60020.compactor
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:185)
        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:875)
        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:738)
        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:779)
        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:732)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:104)
2009-10-28 09:55:33,171 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
2009-10-28 09:55:33,172 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020, call get([B@12249bf9, row=024003550728102009-00000000000000004
74-172.21.12.42-edf985c42366636e09f74f03d945dbbe, maxVersions=1, timeRange=[0,9223372036854775807), families=ALL) from 172.21.12.42:58762: error: java.io.IOException: Server not running, aborting
java.io.IOException: Server not running, aborting
        at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1764)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 09:55:33,171 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver/172.21.12.39:60020.cacheFlusher exiting
2009-10-28 09:55:33,171 INFO org.apache.hadoop.hbase.regionserver.LogFlusher: regionserver/172.21.12.39:60020.logFlusher exiting


69.174.253.201:

2009-10-28 09:59:10,653 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region .META.,,1
2009-10-28 09:59:10,687 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.lang.NullPointerException
2009-10-28 09:59:10,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, call getClosestRowBefore([B@431d7fb1, [B@45aa5fe3, [B@347d53d0) from 172.21.12.41:44655: error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1758)
        at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
Caused by: java.lang.NullPointerException
2009-10-28 09:59:10,697 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region .META.,,1 in 0sec
2009-10-28 10:04:54,207 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 1066294637410862793 lease expired
2009-10-28 10:04:54,209 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 6680747753024842647 lease expired
2009-10-28 10:04:54,209 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -6935590141578792585 lease expired
2009-10-28 10:05:52,261 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -547447698255259149 lease expired
2009-10-28 10:06:47,273 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 2776617092001158844 lease expired
2009-10-28 10:06:51,596 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:06:51,598 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, call next(-547447698255259149, 100) from 172.21.12.38:54959: error: org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:07:03,997 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: -6935590141578792585
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:07:03,998 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, call next(-6935590141578792585, 100) from 172.21.12.39:59669:error: org.apache.hadoop.hbase.UnknownScannerException: Name: -6935590141578792585
org.apache.hadoop.hbase.UnknownScannerException: Name: -6935590141578792585
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:08:31,829 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region result_url,016912590527102009-0000000000000000004-172.21.12.42c9828c1ba938237e64402dcb08e38ae3,1256637802087
2009-10-28 10:08:39,130 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region result_url,016912590527102009-0000000000000000004-172.21.12.42c9828c1ba938237e64402dcb08e38ae3,1256637802087 in 7sec
2009-10-28 10:08:39,130 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region urls,048403360827102009-0000000000000000034-172.21.12.42-24ca13c9888cfa83b50bdbccdbf41d20,1256637340764
2009-10-28 10:08:48,056 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -6964642614101173789 lease expired
2009-10-28 10:08:51,372 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region urls,048403360827102009-0000000000000000034-172.21.12.42-24ca13c9888cfa83b50bdbccdbf41d20,1256637340764 in 12sec
2009-10-28 10:08:51,373 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region result_hash,,1256524756227
2009-10-28 10:09:13,732 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:09:13,736 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, call next(6680747753024842647, 100) from 172.21.12.37:47309: error: org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:09:17,595 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:09:17,597 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, call next(1066294637410862793, 100) from 172.21.12.37:47311: error: org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:09:43,343 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: -6964642614101173789
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:09:43,344 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, call next(-6964642614101173789, 100) from 172.21.12.41:42371:error: org.apache.hadoop.hbase.UnknownScannerException: Name: -6964642614101173789
org.apache.hadoop.hbase.UnknownScannerException: Name: -6964642614101173789
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:09:43,448 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@50498912
2009-10-28 10:09:43,887 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:09:43,888 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, call next(2776617092001158844, 100) from 172.21.12.39:59763: error: org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:10:01,444 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on 60020 took 3706ms appending an edit to hlog; editcount=68777
2009-10-28 10:11:01,838 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 0 on 60020 took 4219ms appending an edit to hlog; editcount=75191
2009-10-28 10:11:34,202 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 9 on 60020 took 4287ms appending an edit to hlog; editcount=77350
2009-10-28 10:11:43,711 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 4 on 60020 took 2553ms appending an edit to hlog; editcount=77532
2009-10-28 10:11:59,666 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 5 on 60020 took 3756ms appending an edit to hlog; editcount=77740
2009-10-28 10:12:43,878 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 21384ms, ten times longer than scheduled: 1000
2009-10-28 10:12:43,896 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:12:43,912 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-10-28 10:12:49,264 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=454, stores=1841, storefiles=1075, storefileIndexSize=242, memstoreSize=497, usedHeap=3961, maxHeap=3983, blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0, blockCacheHitRatio=0
2009-10-28 10:12:49,267 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020, call next(1793460901757128664, 100) from 172.21.12.40:50800: error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
java.io.IOException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1940)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
        ... 5 more
2009-10-28 10:12:49,268 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, call put([B@3e2a9413, [Lorg.apache.hadoop.hbase.client.Put;@5d5f8b47) from 172.21.12.38:57580: error: java.io.IOException: Server not running, aborting
java.io.IOException: Server not running, aborting
        at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1806)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)


      

Re: possible memstore related OOME

Posted by Ryan Rawson <ry...@gmail.com>.
4GB ram on 342 regions and CMS isnt much ram.  CMS needs room to do
it's thing.  I'd be interested to see the GC logs right before the
OOME, probably lots of concurrent mode failures.

-ryan

On Thu, Oct 29, 2009 at 9:04 AM, Jonathan Gray <jl...@streamy.com> wrote:
> Could be possible that if the compactions are very slow running, and we're
> not counting snapshots as part of the heap usage, then we won't start
> forcing more compactions because of heap pressure (not that this would even
> help much if io is saturated).
>
> Throw some heavy concurrent reading in there while hammering IO and could
> definitely create an OOME situation.  Need to give ample heap.
>
> Do you have GC logs from when this happened?  Did the Full GC wall come up
> multiple times and it couldn't free up enough?
>
> JG
>
> Andrew Purtell wrote:
>>
>> FYI, a detailed report from a Trend RD team as email thread below.
>>
>> Also, a weird NPE at
>> HRegionServer.getClosestRowBefore(HRegionServer.java:1758) during compaction
>> of META.
>>
>> 2009-10-28 09:59:10,653 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> Starting compaction on region .META.,,1
>> 2009-10-28 09:59:10,687 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> java.lang.NullPointerException
>> 2009-10-28 09:59:10,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>> Server handler 0 on 60020, call getClosestRowBefore([B@431d7fb1,
>> [B@45aa5fe3, [B@347d53d0) from 172.21.12.41:44655: error:
>> java.io.IOException: java.lang.NullPointerException
>> java.io.IOException: java.lang.NullPointerException
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1758)
>>        at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> Caused by: java.lang.NullPointerException
>> 2009-10-28 09:59:10,697 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> compaction completed on region .META.,,1 in 0sec
>>
>> ---- Forwarded Message ----
>>    From:       Andrew Purtell (RD-US)
>>  Sent:       Thu 10/29/2009 4:44 PM
>> Subject:       RE: jhat (was: RE: HBase 0.20.0 region server crash due to
>> out of memory error)
>>
>> Top item on the regionserver which went down due to OOME is byte[] --
>> class [B 2879025 3938815412 (3.6 GB) -- with top referencer
>> org.apache.hadoop.hbase.KeyValue 1539539 (about 53% of all references). That
>> is value data.
>>
>> There are a lot of ConcurrentSkipListMap$Node objects also. The major user
>> of this data type if you trace it through in jhat most likely will be
>> KeyValueSkipListSet -> MemStore.
>>
>> According to the metrics dumped at OOME time:
>>
>>  > regions=342, stores=1506, storefiles=892, storefileIndexSize=196,
>> memstoreSize=606, usedHeap=3983, maxHeap=3983,
>>  > blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0,
>> blockCacheHitRatio=0
>>
>> 606 MB memstore plus 7 MB block cache plus 196 MB storeFileIndex != 3.6 GB
>> of byte[]. Something does not add up.
>>
>> (Do disable the block cache as I described in the other email. It is not
>> helping at all with a hit ratio of 0%.)
>>
>> The MemStoreScanner line reminds me of a recently filed issue:
>>
>>  > class org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner
>> 77723   2798028
>>
>> I heard about this issue while traveling but forgot about it.  See
>> https://issues.apache.org/jira/browse/HBASE-1927 . Fixed already on the 0.20
>> branch and scheduled for 0.20.2 release. Current branch head is available to
>> test now if this has any impact on the OOME issue. There could be some
>> impact because MemStoreScanners cache one row of values, which will hold
>> references to ConcurrentSkipListMap$Node. This is not accounted for in the
>> metrics.
>>  That the problem manifests during compaction with associated messages
>> such as:
>>
>>  > 2009-10-28 09:54:18,365 WARN org.apache.hadoop.hbase.regionserver.HLog:
>>  > IPC Server handler 3 on 60020 took 4786ms appending an edit to hlog;
>> editcount=6175
>>
>>  > 2009-10-28 09:55:29,502 WARN org.apache.hadoop.hbase.util.Sleeper: We
>> slept 24610ms,
>>  > ten times longer than scheduled: 1000
>>
>> indicating very slow HDFS (~5 seconds per write) and thread starvation
>> also suggests MemStore flushing may have fallen behind. Combine long delayed
>> flushing with KeyValue heap load on the read side (scanners), and this may
>> explain all of the heap usage, simply insufficient RAM/heap to contain all
>> of the workload. Scanner leak may be a contributing factor. Snapshots for
>> flushing are not accounted for in the metrics either.
>>
>> Let's try to sort out what may be memory leak related and what may be just
>> overloading. Will you consider testing the latest 0.20 branch revision in
>> HBase project SVN?
>>
>> I will forward this to hbase-dev@ with your email addresses removed.
>>
>> Best regards,
>>
>>    - Andy
>>
>>
>> -----Original Message-----
>> Sent: Thu 10/29/2009 12:51 PM
>> To: Andrew Purtell (RD-US)
>> Subject: RE: jhat (was: RE: HBase 0.20.0 region server crash due to out of
>> memory error)
>>
>> Dear Andy,
>>
>> Here is the top ten classes in heap histogram got from jhat. Please kindly
>> tell me if you need more information.
>>
>> Class   Instance Count  Total Size
>> class [B        2879025         3938815412
>> class java.util.concurrent.ConcurrentSkipListMap$Node   1472184
>> 35332416
>> class org.apache.hadoop.hbase.KeyValue  1539539         24632624
>> class java.util.concurrent.ConcurrentSkipListMap$Index  734903  17637672
>> class [J        895     10660592
>> class [[B       892     10660120
>> class [Ljava.lang.Object;       82608   8947128
>> class [I        2992    5467740
>> class [C        37504   3125808
>> class org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner
>> 77723   2798028
>>
>> References by Type of class[B:
>> class [B [0x2aaba8220738]
>> Referrers by Type
>> Class   Count
>> org.apache.hadoop.hbase.KeyValue        1539539
>> [[B     1330731
>> org.apache.hadoop.hbase.io.ImmutableBytesWritable       22504
>> org.apache.hadoop.hdfs.DFSClient$BlockReader    2605
>> java.util.TreeMap$Entry         1938
>> java.io.DataInputStream         1790
>> org.apache.hadoop.hdfs.DFSClient$DFSDataInputStream     1784
>> java.util.concurrent.ConcurrentSkipListMap$Node         1559
>> org.apache.hadoop.hbase.HColumnDescriptor       1509
>> org.apache.hadoop.hbase.regionserver.Store      1506
>> org.apache.hadoop.hbase.HRegionInfo     1032
>>
>> Best Regards,
>> Robin Tang
>>
>> -----Original Message-----
>> From: Andrew Purtell (RD-US)
>> Sent: 2009?10?29? 11:56
>> Subject: jhat (was: RE: HBase 0.20.0 region server crash due to out of
>> memory error)
>>
>> Hi Robin,
>>
>> Do you know how to use jhat to read a heap dump from a regionserver
>> (written automatically when it OOMEs) and get a histogram of per object heap
>> usage?
>>
>> Best regards,
>>
>>    - Andy
>>
>> -----Original Message-----
>>
>> Sent: 2009?10?28? 19:17
>> To: Andrew Purtell (RD-US)
>> Subject: HBase 0.20.0 region server crash due to out of memory error
>> Importance: High
>>
>> Dear Andy,
>>
>> In TSS testing, after processing about 1.5 million urls, two of four
>> region servers crashed.
>>
>>
>> 69.174.253.200:
>>
>> 2009-10-28 09:52:06,082 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> compaction completed on region
>> urls,002007360827102009-0000000000000000037-172.21.12.42-c74ec67ed7f2c42b948ae96a33fd59af,1256636471071
>> in 15sec
>> 2009-10-28 09:52:06,082 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> Starting compaction on region result_hash,,1256524756227
>> 2009-10-28 09:52:09,814 INFO org.apache.hadoop.hbase.regionserver.HLog:
>> Roll
>> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256723488679,
>> entries=27549, calcsize=63900120, filesize=60961579. New hlog
>> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256723529786
>> 2009-10-28 09:52:09,814 INFO org.apache.hadoop.hbase.regionserver.HLog:
>> removing old hlog file
>> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722212390
>> whose highest sequence/edit id is 322578992
>> 2009-10-28 09:52:09,820 INFO org.apache.hadoop.hbase.regionserver.HLog:
>> removing old hlog file
>> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722224590
>> whose highest sequence/edit id is 322702111
>> 2009-10-28 09:52:09,826 INFO org.apache.hadoop.hbase.regionserver.HLog:
>> removing old hlog file
>> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722235891
>> whose highest sequence/edit id is 322817821
>> 2009-10-28 09:52:09,833 INFO org.apache.hadoop.hbase.regionserver.HLog:
>> removing old hlog file
>> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722254773
>> whose highest sequence/edit id is 322908772
>> 2009-10-28 09:52:09,842 INFO org.apache.hadoop.hbase.regionserver.HLog:
>> removing old hlog file
>> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722320154
>> whose highest sequence/edit id is 323030355
>> 2009-10-28 09:54:18,365 WARN org.apache.hadoop.hbase.regionserver.HLog:
>> IPC Server handler 3 on 60020 took 4786ms appending an edit to hlog;
>> editcount=6175
>> 2009-10-28 09:55:29,502 WARN org.apache.hadoop.hbase.util.Sleeper: We
>> slept 24610ms, ten times longer than scheduled: 1000
>> 2009-10-28 09:55:29,503 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
>> request=0.0, regions=342, stores=1506, storefiles=892,
>> storefileIndexSize=196, memstoreSize=606, usedHeap=3983, maxHeap=3983,
>> blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0,
>> blockCacheHitRatio=0
>> 2009-10-28 09:55:29,503 FATAL
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Set stop flag in
>> regionserver/172.21.12.39:60020.compactor
>> java.lang.OutOfMemoryError: Java heap space
>>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>>        at
>> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>>        at
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>>        at
>> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:185)
>>        at
>> org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:875)
>>        at
>> org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:738)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:779)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:732)
>>        at
>> org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:104)
>> 2009-10-28 09:55:33,171 INFO
>> org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
>> 2009-10-28 09:55:33,172 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 1 on 60020, call get([B@12249bf9,
>> row=024003550728102009-00000000000000004
>> 74-172.21.12.42-edf985c42366636e09f74f03d945dbbe, maxVersions=1,
>> timeRange=[0,9223372036854775807), families=ALL) from 172.21.12.42:58762:
>> error: java.io.IOException: Server not running, aborting
>> java.io.IOException: Server not running, aborting
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1764)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 09:55:33,171 INFO
>> org.apache.hadoop.hbase.regionserver.MemStoreFlusher:
>> regionserver/172.21.12.39:60020.cacheFlusher exiting
>> 2009-10-28 09:55:33,171 INFO
>> org.apache.hadoop.hbase.regionserver.LogFlusher:
>> regionserver/172.21.12.39:60020.logFlusher exiting
>>
>>
>> 69.174.253.201:
>>
>> 2009-10-28 09:59:10,653 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> Starting compaction on region .META.,,1
>> 2009-10-28 09:59:10,687 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> java.lang.NullPointerException
>> 2009-10-28 09:59:10,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 0 on 60020, call getClosestRowBefore([B@431d7fb1, [B@45aa5fe3,
>> [B@347d53d0) from 172.21.12.41:44655: error: java.io.IOException:
>> java.lang.NullPointerException
>> java.io.IOException: java.lang.NullPointerException
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1758)
>>        at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> Caused by: java.lang.NullPointerException
>> 2009-10-28 09:59:10,697 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> compaction completed on region .META.,,1 in 0sec
>> 2009-10-28 10:04:54,207 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>> 1066294637410862793 lease expired
>> 2009-10-28 10:04:54,209 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>> 6680747753024842647 lease expired
>> 2009-10-28 10:04:54,209 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>> -6935590141578792585 lease expired
>> 2009-10-28 10:05:52,261 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>> -547447698255259149 lease expired
>> 2009-10-28 10:06:47,273 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>> 2776617092001158844 lease expired
>> 2009-10-28 10:06:51,596 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:06:51,598 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 0 on 60020, call next(-547447698255259149, 100) from
>> 172.21.12.38:54959: error: org.apache.hadoop.hbase.UnknownScannerException:
>> Name: -547447698255259149
>> org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:07:03,997 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> org.apache.hadoop.hbase.UnknownScannerException: Name:
>> -6935590141578792585
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:07:03,998 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 6 on 60020, call next(-6935590141578792585, 100) from
>> 172.21.12.39:59669:error: org.apache.hadoop.hbase.UnknownScannerException:
>> Name: -6935590141578792585
>> org.apache.hadoop.hbase.UnknownScannerException: Name:
>> -6935590141578792585
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:08:31,829 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> Starting compaction on region
>> result_url,016912590527102009-0000000000000000004-172.21.12.42c9828c1ba938237e64402dcb08e38ae3,1256637802087
>> 2009-10-28 10:08:39,130 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> compaction completed on region
>> result_url,016912590527102009-0000000000000000004-172.21.12.42c9828c1ba938237e64402dcb08e38ae3,1256637802087
>> in 7sec
>> 2009-10-28 10:08:39,130 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> Starting compaction on region
>> urls,048403360827102009-0000000000000000034-172.21.12.42-24ca13c9888cfa83b50bdbccdbf41d20,1256637340764
>> 2009-10-28 10:08:48,056 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>> -6964642614101173789 lease expired
>> 2009-10-28 10:08:51,372 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> compaction completed on region
>> urls,048403360827102009-0000000000000000034-172.21.12.42-24ca13c9888cfa83b50bdbccdbf41d20,1256637340764
>> in 12sec
>> 2009-10-28 10:08:51,373 INFO org.apache.hadoop.hbase.regionserver.HRegion:
>> Starting compaction on region result_hash,,1256524756227
>> 2009-10-28 10:09:13,732 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:09:13,736 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 5 on 60020, call next(6680747753024842647, 100) from
>> 172.21.12.37:47309: error: org.apache.hadoop.hbase.UnknownScannerException:
>> Name: 6680747753024842647
>> org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:09:17,595 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:09:17,597 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 7 on 60020, call next(1066294637410862793, 100) from
>> 172.21.12.37:47311: error: org.apache.hadoop.hbase.UnknownScannerException:
>> Name: 1066294637410862793
>> org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:09:43,343 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> org.apache.hadoop.hbase.UnknownScannerException: Name:
>> -6964642614101173789
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:09:43,344 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 6 on 60020, call next(-6964642614101173789, 100) from
>> 172.21.12.41:42371:error: org.apache.hadoop.hbase.UnknownScannerException:
>> Name: -6964642614101173789
>> org.apache.hadoop.hbase.UnknownScannerException: Name:
>> -6964642614101173789
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:09:43,448 WARN org.apache.hadoop.hbase.regionserver.Store:
>> Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@50498912
>> 2009-10-28 10:09:43,887 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:09:43,888 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 7 on 60020, call next(2776617092001158844, 100) from
>> 172.21.12.39:59763: error: org.apache.hadoop.hbase.UnknownScannerException:
>> Name: 2776617092001158844
>> org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:10:01,444 WARN org.apache.hadoop.hbase.regionserver.HLog:
>> IPC Server handler 1 on 60020 took 3706ms appending an edit to hlog;
>> editcount=68777
>> 2009-10-28 10:11:01,838 WARN org.apache.hadoop.hbase.regionserver.HLog:
>> IPC Server handler 0 on 60020 took 4219ms appending an edit to hlog;
>> editcount=75191
>> 2009-10-28 10:11:34,202 WARN org.apache.hadoop.hbase.regionserver.HLog:
>> IPC Server handler 9 on 60020 took 4287ms appending an edit to hlog;
>> editcount=77350
>> 2009-10-28 10:11:43,711 WARN org.apache.hadoop.hbase.regionserver.HLog:
>> IPC Server handler 4 on 60020 took 2553ms appending an edit to hlog;
>> editcount=77532
>> 2009-10-28 10:11:59,666 WARN org.apache.hadoop.hbase.regionserver.HLog:
>> IPC Server handler 5 on 60020 took 3756ms appending an edit to hlog;
>> editcount=77740
>> 2009-10-28 10:12:43,878 WARN org.apache.hadoop.hbase.util.Sleeper: We
>> slept 21384ms, ten times longer than scheduled: 1000
>> 2009-10-28 10:12:43,896 ERROR
>> org.apache.hadoop.hbase.regionserver.HRegionServer:
>> java.lang.OutOfMemoryError: Java heap space
>>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>>        at
>> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>>        at
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>>        at
>> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>>        at
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:12:43,912 FATAL
>> org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
>> aborting.
>> java.lang.OutOfMemoryError: Java heap space
>>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>>        at
>> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>>        at
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>>        at
>> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>>        at
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> 2009-10-28 10:12:49,264 INFO
>> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
>> request=0.0, regions=454, stores=1841, storefiles=1075,
>> storefileIndexSize=242, memstoreSize=497, usedHeap=3961, maxHeap=3983,
>> blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0,
>> blockCacheHitRatio=0
>> 2009-10-28 10:12:49,267 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 2 on 60020, call next(1793460901757128664, 100) from
>> 172.21.12.40:50800: error: java.io.IOException: java.lang.OutOfMemoryError:
>> Java heap space
>> java.io.IOException: java.lang.OutOfMemoryError: Java heap space
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1940)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>>        at
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>>        at
>> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>>        at
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>>        at
>> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>>        at
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>>        ... 5 more
>> 2009-10-28 10:12:49,268 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
>> handler 0 on 60020, call put([B@3e2a9413,
>> [Lorg.apache.hadoop.hbase.client.Put;@5d5f8b47) from 172.21.12.38:57580:
>> error: java.io.IOException: Server not running, aborting
>> java.io.IOException: Server not running, aborting
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>>        at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1806)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>>        at
>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>>
>>
>>
>

Re: possible memstore related OOME

Posted by Jonathan Gray <jl...@streamy.com>.
Could be possible that if the compactions are very slow running, and 
we're not counting snapshots as part of the heap usage, then we won't 
start forcing more compactions because of heap pressure (not that this 
would even help much if io is saturated).

Throw some heavy concurrent reading in there while hammering IO and 
could definitely create an OOME situation.  Need to give ample heap.

Do you have GC logs from when this happened?  Did the Full GC wall come 
up multiple times and it couldn't free up enough?

JG

Andrew Purtell wrote:
> FYI, a detailed report from a Trend RD team as email thread below.
> 
> Also, a weird NPE at HRegionServer.getClosestRowBefore(HRegionServer.java:1758) during compaction of META.
> 
> 2009-10-28 09:59:10,653 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region .META.,,1
> 2009-10-28 09:59:10,687 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.lang.NullPointerException
> 2009-10-28 09:59:10,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server handler 0 on 60020, call getClosestRowBefore([B@431d7fb1,
> [B@45aa5fe3, [B@347d53d0) from 172.21.12.41:44655: error:
> java.io.IOException: java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1758)
>         at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> Caused by: java.lang.NullPointerException
> 2009-10-28 09:59:10,697 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region .META.,,1 in 0sec
> 
> ---- Forwarded Message ----
>     
> From:       Andrew Purtell (RD-US)
>   Sent:       Thu 10/29/2009 4:44 PM
> Subject:       RE: jhat (was: RE: HBase 0.20.0 region server crash due to out of memory error)
> 
> Top item on the regionserver which went down due to OOME is byte[] -- class [B 2879025 3938815412 (3.6 GB) -- with top referencer org.apache.hadoop.hbase.KeyValue 1539539 (about 53% of all references). That is value data.
> 
> There are a lot of ConcurrentSkipListMap$Node objects also. The major user of this data type if you trace it through in jhat most likely will be KeyValueSkipListSet -> MemStore.
> 
> According to the metrics dumped at OOME time:
> 
>  > regions=342, stores=1506, storefiles=892, storefileIndexSize=196, memstoreSize=606, usedHeap=3983, maxHeap=3983,
>  > blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0, blockCacheHitRatio=0
> 
> 606 MB memstore plus 7 MB block cache plus 196 MB storeFileIndex != 3.6 GB of byte[]. Something does not add up.
> 
> (Do disable the block cache as I described in the other email. It is not helping at all with a hit ratio of 0%.)
> 
> The MemStoreScanner line reminds me of a recently filed issue:
> 
>   > class org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner 77723   2798028
> 
> I heard about this issue while traveling but forgot about it.  See https://issues.apache.org/jira/browse/HBASE-1927 . Fixed already on the 0.20 branch and scheduled for 0.20.2 release. Current branch head is available to test now if this has any impact on the OOME issue. There could be some impact because MemStoreScanners cache one row of values, which will hold references to ConcurrentSkipListMap$Node. This is not accounted for in the metrics.
>  
> That the problem manifests during compaction with associated messages such as:
> 
>   > 2009-10-28 09:54:18,365 WARN org.apache.hadoop.hbase.regionserver.HLog:
>   > IPC Server handler 3 on 60020 took 4786ms appending an edit to hlog; editcount=6175
> 
>   > 2009-10-28 09:55:29,502 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 24610ms,
>   > ten times longer than scheduled: 1000
> 
> indicating very slow HDFS (~5 seconds per write) and thread starvation also suggests MemStore flushing may have fallen behind. Combine long delayed flushing with KeyValue heap load on the read side (scanners), and this may explain all of the heap usage, simply insufficient RAM/heap to contain all of the workload. Scanner leak may be a contributing factor. Snapshots for flushing are not accounted for in the metrics either.
> 
> Let's try to sort out what may be memory leak related and what may be just overloading. Will you consider testing the latest 0.20 branch revision in HBase project SVN?
> 
> I will forward this to hbase-dev@ with your email addresses removed.
> 
> Best regards,
> 
>     - Andy
> 
> 
> -----Original Message-----
> Sent: Thu 10/29/2009 12:51 PM
> To: Andrew Purtell (RD-US)
> Subject: RE: jhat (was: RE: HBase 0.20.0 region server crash due to out of memory error)
> 
> Dear Andy,
> 
> Here is the top ten classes in heap histogram got from jhat. Please kindly tell me if you need more information.
> 
> Class   Instance Count  Total Size
> class [B        2879025         3938815412
> class java.util.concurrent.ConcurrentSkipListMap$Node   1472184         35332416
> class org.apache.hadoop.hbase.KeyValue  1539539         24632624
> class java.util.concurrent.ConcurrentSkipListMap$Index  734903  17637672
> class [J        895     10660592
> class [[B       892     10660120
> class [Ljava.lang.Object;       82608   8947128
> class [I        2992    5467740
> class [C        37504   3125808
> class org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner     77723   2798028
> 
> References by Type of class[B:
> class [B [0x2aaba8220738]
> Referrers by Type
> Class   Count
> org.apache.hadoop.hbase.KeyValue        1539539
> [[B     1330731
> org.apache.hadoop.hbase.io.ImmutableBytesWritable       22504
> org.apache.hadoop.hdfs.DFSClient$BlockReader    2605
> java.util.TreeMap$Entry         1938
> java.io.DataInputStream         1790
> org.apache.hadoop.hdfs.DFSClient$DFSDataInputStream     1784
> java.util.concurrent.ConcurrentSkipListMap$Node         1559
> org.apache.hadoop.hbase.HColumnDescriptor       1509
> org.apache.hadoop.hbase.regionserver.Store      1506
> org.apache.hadoop.hbase.HRegionInfo     1032
> 
> Best Regards,
> Robin Tang
> 
> -----Original Message-----
> From: Andrew Purtell (RD-US)
> Sent: 2009?10?29? 11:56
> Subject: jhat (was: RE: HBase 0.20.0 region server crash due to out of memory error)
> 
> Hi Robin,
> 
> Do you know how to use jhat to read a heap dump from a regionserver (written automatically when it OOMEs) and get a histogram of per object heap usage?
> 
> Best regards,
> 
>     - Andy
> 
> -----Original Message-----
> 
> Sent: 2009?10?28? 19:17
> To: Andrew Purtell (RD-US)
> Subject: HBase 0.20.0 region server crash due to out of memory error
> Importance: High
> 
> Dear Andy,
> 
> In TSS testing, after processing about 1.5 million urls, two of four region servers crashed.
> 
> 
> 69.174.253.200:
> 
> 2009-10-28 09:52:06,082 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region urls,002007360827102009-0000000000000000037-172.21.12.42-c74ec67ed7f2c42b948ae96a33fd59af,1256636471071 in 15sec
> 2009-10-28 09:52:06,082 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region result_hash,,1256524756227
> 2009-10-28 09:52:09,814 INFO org.apache.hadoop.hbase.regionserver.HLog: Roll /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256723488679, entries=27549, calcsize=63900120, filesize=60961579. New hlog /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256723529786
> 2009-10-28 09:52:09,814 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722212390 whose highest sequence/edit id is 322578992
> 2009-10-28 09:52:09,820 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722224590 whose highest sequence/edit id is 322702111
> 2009-10-28 09:52:09,826 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722235891 whose highest sequence/edit id is 322817821
> 2009-10-28 09:52:09,833 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722254773 whose highest sequence/edit id is 322908772
> 2009-10-28 09:52:09,842 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722320154 whose highest sequence/edit id is 323030355
> 2009-10-28 09:54:18,365 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 3 on 60020 took 4786ms appending an edit to hlog; editcount=6175
> 2009-10-28 09:55:29,502 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 24610ms, ten times longer than scheduled: 1000
> 2009-10-28 09:55:29,503 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=342, stores=1506, storefiles=892, storefileIndexSize=196, memstoreSize=606, usedHeap=3983, maxHeap=3983, blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0, blockCacheHitRatio=0
> 2009-10-28 09:55:29,503 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Set stop flag in regionserver/172.21.12.39:60020.compactor
> java.lang.OutOfMemoryError: Java heap space
>         at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>         at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:185)
>         at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:875)
>         at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:738)
>         at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:779)
>         at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:732)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:104)
> 2009-10-28 09:55:33,171 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
> 2009-10-28 09:55:33,172 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020, call get([B@12249bf9, row=024003550728102009-00000000000000004
> 74-172.21.12.42-edf985c42366636e09f74f03d945dbbe, maxVersions=1, timeRange=[0,9223372036854775807), families=ALL) from 172.21.12.42:58762: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1764)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 09:55:33,171 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver/172.21.12.39:60020.cacheFlusher exiting
> 2009-10-28 09:55:33,171 INFO org.apache.hadoop.hbase.regionserver.LogFlusher: regionserver/172.21.12.39:60020.logFlusher exiting
> 
> 
> 69.174.253.201:
> 
> 2009-10-28 09:59:10,653 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region .META.,,1
> 2009-10-28 09:59:10,687 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.lang.NullPointerException
> 2009-10-28 09:59:10,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, call getClosestRowBefore([B@431d7fb1, [B@45aa5fe3, [B@347d53d0) from 172.21.12.41:44655: error: java.io.IOException: java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1758)
>         at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> Caused by: java.lang.NullPointerException
> 2009-10-28 09:59:10,697 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region .META.,,1 in 0sec
> 2009-10-28 10:04:54,207 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 1066294637410862793 lease expired
> 2009-10-28 10:04:54,209 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 6680747753024842647 lease expired
> 2009-10-28 10:04:54,209 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -6935590141578792585 lease expired
> 2009-10-28 10:05:52,261 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -547447698255259149 lease expired
> 2009-10-28 10:06:47,273 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 2776617092001158844 lease expired
> 2009-10-28 10:06:51,596 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:06:51,598 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, call next(-547447698255259149, 100) from 172.21.12.38:54959: error: org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
> org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:07:03,997 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -6935590141578792585
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:07:03,998 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, call next(-6935590141578792585, 100) from 172.21.12.39:59669:error: org.apache.hadoop.hbase.UnknownScannerException: Name: -6935590141578792585
> org.apache.hadoop.hbase.UnknownScannerException: Name: -6935590141578792585
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:08:31,829 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region result_url,016912590527102009-0000000000000000004-172.21.12.42c9828c1ba938237e64402dcb08e38ae3,1256637802087
> 2009-10-28 10:08:39,130 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region result_url,016912590527102009-0000000000000000004-172.21.12.42c9828c1ba938237e64402dcb08e38ae3,1256637802087 in 7sec
> 2009-10-28 10:08:39,130 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region urls,048403360827102009-0000000000000000034-172.21.12.42-24ca13c9888cfa83b50bdbccdbf41d20,1256637340764
> 2009-10-28 10:08:48,056 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -6964642614101173789 lease expired
> 2009-10-28 10:08:51,372 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region urls,048403360827102009-0000000000000000034-172.21.12.42-24ca13c9888cfa83b50bdbccdbf41d20,1256637340764 in 12sec
> 2009-10-28 10:08:51,373 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region result_hash,,1256524756227
> 2009-10-28 10:09:13,732 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:13,736 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, call next(6680747753024842647, 100) from 172.21.12.37:47309: error: org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
> org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:17,595 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:17,597 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, call next(1066294637410862793, 100) from 172.21.12.37:47311: error: org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
> org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:43,343 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -6964642614101173789
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:43,344 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, call next(-6964642614101173789, 100) from 172.21.12.41:42371:error: org.apache.hadoop.hbase.UnknownScannerException: Name: -6964642614101173789
> org.apache.hadoop.hbase.UnknownScannerException: Name: -6964642614101173789
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:43,448 WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@50498912
> 2009-10-28 10:09:43,887 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:43,888 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, call next(2776617092001158844, 100) from 172.21.12.39:59763: error: org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
> org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:10:01,444 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on 60020 took 3706ms appending an edit to hlog; editcount=68777
> 2009-10-28 10:11:01,838 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 0 on 60020 took 4219ms appending an edit to hlog; editcount=75191
> 2009-10-28 10:11:34,202 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 9 on 60020 took 4287ms appending an edit to hlog; editcount=77350
> 2009-10-28 10:11:43,711 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 4 on 60020 took 2553ms appending an edit to hlog; editcount=77532
> 2009-10-28 10:11:59,666 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 5 on 60020 took 3756ms appending an edit to hlog; editcount=77740
> 2009-10-28 10:12:43,878 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 21384ms, ten times longer than scheduled: 1000
> 2009-10-28 10:12:43,896 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.lang.OutOfMemoryError: Java heap space
>         at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>         at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:12:43,912 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
> java.lang.OutOfMemoryError: Java heap space
>         at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>         at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:12:49,264 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=454, stores=1841, storefiles=1075, storefileIndexSize=242, memstoreSize=497, usedHeap=3961, maxHeap=3983, blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0, blockCacheHitRatio=0
> 2009-10-28 10:12:49,267 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020, call next(1793460901757128664, 100) from 172.21.12.40:50800: error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
> java.io.IOException: java.lang.OutOfMemoryError: Java heap space
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1940)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>         at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>         at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>         ... 5 more
> 2009-10-28 10:12:49,268 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, call put([B@3e2a9413, [Lorg.apache.hadoop.hbase.client.Put;@5d5f8b47) from 172.21.12.38:57580: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1806)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 
> 
>       

Re: possible memstore related OOME

Posted by stack <st...@duboce.net>.
On the NPE, complaint must be from here:

 862   /*
 863    * @param t
 864    * @param msg Message to put in new IOE if passed <code>t</code> is
not an IOE
 865    * @return Make <code>t</code> an IOE if it isn't already.
 866    */
 867   private IOException convertThrowableToIOE(final Throwable t,
 868       final String msg) {
 869     return (t instanceof IOException? (IOException)t:
 870       msg == null || msg.length() == 0?
 871         new IOException(t): new IOException(msg, t));
 872   }


(Line numbers are off)

Is t null?  Doesn't seem possible?

On OOME, i'm with you that it may be partly the leaking Scanners issue...
there are 77k of them outstanding (The reporter said it made him OOME).

St.Ack



On Thu, Oct 29, 2009 at 7:33 AM, Andrew Purtell <ap...@apache.org> wrote:

> FYI, a detailed report from a Trend RD team as email thread below.
>
> Also, a weird NPE at
> HRegionServer.getClosestRowBefore(HRegionServer.java:1758) during compaction
> of META.
>
> 2009-10-28 09:59:10,653 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Starting compaction on region .META.,,1
> 2009-10-28 09:59:10,687 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.lang.NullPointerException
> 2009-10-28 09:59:10,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server handler 0 on 60020, call getClosestRowBefore([B@431d7fb1,
> [B@45aa5fe3, [B@347d53d0) from 172.21.12.41:44655: error:
> java.io.IOException: java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1758)
>        at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> Caused by: java.lang.NullPointerException
> 2009-10-28 09:59:10,697 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> compaction completed on region .META.,,1 in 0sec
>
> ---- Forwarded Message ----
>
> From:       Andrew Purtell (RD-US)
>  Sent:       Thu 10/29/2009 4:44 PM
> Subject:       RE: jhat (was: RE: HBase 0.20.0 region server crash due to
> out of memory error)
>
> Top item on the regionserver which went down due to OOME is byte[] -- class
> [B 2879025 3938815412 (3.6 GB) -- with top referencer
> org.apache.hadoop.hbase.KeyValue 1539539 (about 53% of all references). That
> is value data.
>
> There are a lot of ConcurrentSkipListMap$Node objects also. The major user
> of this data type if you trace it through in jhat most likely will be
> KeyValueSkipListSet -> MemStore.
>
> According to the metrics dumped at OOME time:
>
>  > regions=342, stores=1506, storefiles=892, storefileIndexSize=196,
> memstoreSize=606, usedHeap=3983, maxHeap=3983,
>  > blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0,
> blockCacheHitRatio=0
>
> 606 MB memstore plus 7 MB block cache plus 196 MB storeFileIndex != 3.6 GB
> of byte[]. Something does not add up.
>
> (Do disable the block cache as I described in the other email. It is not
> helping at all with a hit ratio of 0%.)
>
> The MemStoreScanner line reminds me of a recently filed issue:
>
>  > class org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner
> 77723   2798028
>
> I heard about this issue while traveling but forgot about it.  See
> https://issues.apache.org/jira/browse/HBASE-1927 . Fixed already on the
> 0.20 branch and scheduled for 0.20.2 release. Current branch head is
> available to test now if this has any impact on the OOME issue. There could
> be some impact because MemStoreScanners cache one row of values, which will
> hold references to ConcurrentSkipListMap$Node. This is not accounted for in
> the metrics.
>
> That the problem manifests during compaction with associated messages such
> as:
>
>  > 2009-10-28 09:54:18,365 WARN org.apache.hadoop.hbase.regionserver.HLog:
>  > IPC Server handler 3 on 60020 took 4786ms appending an edit to hlog;
> editcount=6175
>
>  > 2009-10-28 09:55:29,502 WARN org.apache.hadoop.hbase.util.Sleeper: We
> slept 24610ms,
>  > ten times longer than scheduled: 1000
>
> indicating very slow HDFS (~5 seconds per write) and thread starvation also
> suggests MemStore flushing may have fallen behind. Combine long delayed
> flushing with KeyValue heap load on the read side (scanners), and this may
> explain all of the heap usage, simply insufficient RAM/heap to contain all
> of the workload. Scanner leak may be a contributing factor. Snapshots for
> flushing are not accounted for in the metrics either.
>
> Let's try to sort out what may be memory leak related and what may be just
> overloading. Will you consider testing the latest 0.20 branch revision in
> HBase project SVN?
>
> I will forward this to hbase-dev@ with your email addresses removed.
>
> Best regards,
>
>    - Andy
>
>
> -----Original Message-----
> Sent: Thu 10/29/2009 12:51 PM
> To: Andrew Purtell (RD-US)
> Subject: RE: jhat (was: RE: HBase 0.20.0 region server crash due to out of
> memory error)
>
> Dear Andy,
>
> Here is the top ten classes in heap histogram got from jhat. Please kindly
> tell me if you need more information.
>
> Class   Instance Count  Total Size
> class [B        2879025         3938815412
> class java.util.concurrent.ConcurrentSkipListMap$Node   1472184
> 35332416
> class org.apache.hadoop.hbase.KeyValue  1539539         24632624
> class java.util.concurrent.ConcurrentSkipListMap$Index  734903  17637672
> class [J        895     10660592
> class [[B       892     10660120
> class [Ljava.lang.Object;       82608   8947128
> class [I        2992    5467740
> class [C        37504   3125808
> class org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner
> 77723   2798028
>
> References by Type of class[B:
> class [B [0x2aaba8220738]
> Referrers by Type
> Class   Count
> org.apache.hadoop.hbase.KeyValue        1539539
> [[B     1330731
> org.apache.hadoop.hbase.io.ImmutableBytesWritable       22504
> org.apache.hadoop.hdfs.DFSClient$BlockReader    2605
> java.util.TreeMap$Entry         1938
> java.io.DataInputStream         1790
> org.apache.hadoop.hdfs.DFSClient$DFSDataInputStream     1784
> java.util.concurrent.ConcurrentSkipListMap$Node         1559
> org.apache.hadoop.hbase.HColumnDescriptor       1509
> org.apache.hadoop.hbase.regionserver.Store      1506
> org.apache.hadoop.hbase.HRegionInfo     1032
>
> Best Regards,
> Robin Tang
>
> -----Original Message-----
> From: Andrew Purtell (RD-US)
> Sent: 2009?10?29? 11:56
> Subject: jhat (was: RE: HBase 0.20.0 region server crash due to out of
> memory error)
>
> Hi Robin,
>
> Do you know how to use jhat to read a heap dump from a regionserver
> (written automatically when it OOMEs) and get a histogram of per object heap
> usage?
>
> Best regards,
>
>    - Andy
>
> -----Original Message-----
>
> Sent: 2009?10?28? 19:17
> To: Andrew Purtell (RD-US)
> Subject: HBase 0.20.0 region server crash due to out of memory error
> Importance: High
>
> Dear Andy,
>
> In TSS testing, after processing about 1.5 million urls, two of four region
> servers crashed.
>
>
> 69.174.253.200:
>
> 2009-10-28 09:52:06,082 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> compaction completed on region
> urls,002007360827102009-0000000000000000037-172.21.12.42-c74ec67ed7f2c42b948ae96a33fd59af,1256636471071
> in 15sec
> 2009-10-28 09:52:06,082 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Starting compaction on region result_hash,,1256524756227
> 2009-10-28 09:52:09,814 INFO org.apache.hadoop.hbase.regionserver.HLog:
> Roll
> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256723488679,
> entries=27549, calcsize=63900120, filesize=60961579. New hlog
> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256723529786
> 2009-10-28 09:52:09,814 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old hlog file
> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722212390
> whose highest sequence/edit id is 322578992
> 2009-10-28 09:52:09,820 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old hlog file
> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722224590
> whose highest sequence/edit id is 322702111
> 2009-10-28 09:52:09,826 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old hlog file
> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722235891
> whose highest sequence/edit id is 322817821
> 2009-10-28 09:52:09,833 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old hlog file
> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722254773
> whose highest sequence/edit id is 322908772
> 2009-10-28 09:52:09,842 INFO org.apache.hadoop.hbase.regionserver.HLog:
> removing old hlog file
> /data/hbase/.logs/tss-poc2-3,60020,1256627509381/hlog.dat.1256722320154
> whose highest sequence/edit id is 323030355
> 2009-10-28 09:54:18,365 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC
> Server handler 3 on 60020 took 4786ms appending an edit to hlog;
> editcount=6175
> 2009-10-28 09:55:29,502 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 24610ms, ten times longer than scheduled: 1000
> 2009-10-28 09:55:29,503 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0.0, regions=342, stores=1506, storefiles=892,
> storefileIndexSize=196, memstoreSize=606, usedHeap=3983, maxHeap=3983,
> blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0,
> blockCacheHitRatio=0
> 2009-10-28 09:55:29,503 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: Set stop flag in
> regionserver/172.21.12.39:60020.compactor
> java.lang.OutOfMemoryError: Java heap space
>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>        at
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>        at
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>        at
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:185)
>        at
> org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:875)
>        at
> org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:738)
>        at
> org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:779)
>        at
> org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:732)
>        at
> org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:104)
> 2009-10-28 09:55:33,171 INFO
> org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
> 2009-10-28 09:55:33,172 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 60020, call get([B@12249bf9,
> row=024003550728102009-00000000000000004
> 74-172.21.12.42-edf985c42366636e09f74f03d945dbbe, maxVersions=1,
> timeRange=[0,9223372036854775807), families=ALL) from 172.21.12.42:58762:
> error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1764)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 09:55:33,171 INFO
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher:
> regionserver/172.21.12.39:60020.cacheFlusher exiting
> 2009-10-28 09:55:33,171 INFO
> org.apache.hadoop.hbase.regionserver.LogFlusher:
> regionserver/172.21.12.39:60020.logFlusher exiting
>
>
> 69.174.253.201:
>
> 2009-10-28 09:59:10,653 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Starting compaction on region .META.,,1
> 2009-10-28 09:59:10,687 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.lang.NullPointerException
> 2009-10-28 09:59:10,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60020, call getClosestRowBefore([B@431d7fb1, [B@45aa5fe3,
> [B@347d53d0) from 172.21.12.41:44655: error: java.io.IOException:
> java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1758)
>        at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> Caused by: java.lang.NullPointerException
> 2009-10-28 09:59:10,697 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> compaction completed on region .META.,,1 in 0sec
> 2009-10-28 10:04:54,207 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> 1066294637410862793 lease expired
> 2009-10-28 10:04:54,209 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> 6680747753024842647 lease expired
> 2009-10-28 10:04:54,209 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> -6935590141578792585 lease expired
> 2009-10-28 10:05:52,261 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> -547447698255259149 lease expired
> 2009-10-28 10:06:47,273 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> 2776617092001158844 lease expired
> 2009-10-28 10:06:51,596 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:06:51,598 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60020, call next(-547447698255259149, 100) from
> 172.21.12.38:54959: error:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
> org.apache.hadoop.hbase.UnknownScannerException: Name: -547447698255259149
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:07:03,997 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -6935590141578792585
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:07:03,998 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60020, call next(-6935590141578792585, 100) from
> 172.21.12.39:59669:error: org.apache.hadoop.hbase.UnknownScannerException:
> Name: -6935590141578792585
> org.apache.hadoop.hbase.UnknownScannerException: Name: -6935590141578792585
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:08:31,829 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Starting compaction on region
> result_url,016912590527102009-0000000000000000004-172.21.12.42c9828c1ba938237e64402dcb08e38ae3,1256637802087
> 2009-10-28 10:08:39,130 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> compaction completed on region
> result_url,016912590527102009-0000000000000000004-172.21.12.42c9828c1ba938237e64402dcb08e38ae3,1256637802087
> in 7sec
> 2009-10-28 10:08:39,130 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Starting compaction on region
> urls,048403360827102009-0000000000000000034-172.21.12.42-24ca13c9888cfa83b50bdbccdbf41d20,1256637340764
> 2009-10-28 10:08:48,056 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> -6964642614101173789 lease expired
> 2009-10-28 10:08:51,372 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> compaction completed on region
> urls,048403360827102009-0000000000000000034-172.21.12.42-24ca13c9888cfa83b50bdbccdbf41d20,1256637340764
> in 12sec
> 2009-10-28 10:08:51,373 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Starting compaction on region result_hash,,1256524756227
> 2009-10-28 10:09:13,732 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:13,736 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 60020, call next(6680747753024842647, 100) from
> 172.21.12.37:47309: error:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
> org.apache.hadoop.hbase.UnknownScannerException: Name: 6680747753024842647
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:17,595 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:17,597 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 60020, call next(1066294637410862793, 100) from
> 172.21.12.37:47311: error:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
> org.apache.hadoop.hbase.UnknownScannerException: Name: 1066294637410862793
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:43,343 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -6964642614101173789
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:43,344 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60020, call next(-6964642614101173789, 100) from
> 172.21.12.41:42371:error: org.apache.hadoop.hbase.UnknownScannerException:
> Name: -6964642614101173789
> org.apache.hadoop.hbase.UnknownScannerException: Name: -6964642614101173789
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:43,448 WARN org.apache.hadoop.hbase.regionserver.Store:
> Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner@50498912
> 2009-10-28 10:09:43,887 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:09:43,888 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 60020, call next(2776617092001158844, 100) from
> 172.21.12.39:59763: error:
> org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
> org.apache.hadoop.hbase.UnknownScannerException: Name: 2776617092001158844
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1911)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:10:01,444 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC
> Server handler 1 on 60020 took 3706ms appending an edit to hlog;
> editcount=68777
> 2009-10-28 10:11:01,838 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC
> Server handler 0 on 60020 took 4219ms appending an edit to hlog;
> editcount=75191
> 2009-10-28 10:11:34,202 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC
> Server handler 9 on 60020 took 4287ms appending an edit to hlog;
> editcount=77350
> 2009-10-28 10:11:43,711 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC
> Server handler 4 on 60020 took 2553ms appending an edit to hlog;
> editcount=77532
> 2009-10-28 10:11:59,666 WARN org.apache.hadoop.hbase.regionserver.HLog: IPC
> Server handler 5 on 60020 took 3756ms appending an edit to hlog;
> editcount=77740
> 2009-10-28 10:12:43,878 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 21384ms, ten times longer than scheduled: 1000
> 2009-10-28 10:12:43,896 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.lang.OutOfMemoryError: Java heap space
>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>        at
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>        at
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>        at
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>        at
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>        at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794)
>        at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:12:43,912 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
> aborting.
> java.lang.OutOfMemoryError: Java heap space
>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>        at
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>        at
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>        at
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>        at
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>        at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794)
>        at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-10-28 10:12:49,264 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0.0, regions=454, stores=1841, storefiles=1075,
> storefileIndexSize=242, memstoreSize=497, usedHeap=3961, maxHeap=3983,
> blockCacheSize=6853968, blockCacheFree=828520304, blockCacheCount=0,
> blockCacheHitRatio=0
> 2009-10-28 10:12:49,267 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 60020, call next(1793460901757128664, 100) from
> 172.21.12.40:50800: error: java.io.IOException:
> java.lang.OutOfMemoryError: Java heap space
> java.io.IOException: java.lang.OutOfMemoryError: Java heap space
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:847)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:837)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1940)
>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:981)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:936)
>        at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1125)
>        at
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>        at
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:76)
>        at
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>        at
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:103)
>        at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1794
>        at
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1733)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1927)
>        ... 5 more
> 2009-10-28 10:12:49,268 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60020, call put([B@3e2a9413,
> [Lorg.apache.hadoop.hbase.client.Put;@5d5f8b47) from 172.21.12.38:57580:
> error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1806)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
>        at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
>
>