Posted to dev@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2009/08/21 18:02:14 UTC

[jira] Commented: (HBASE-1782) Self-inflicted RS temporary deadlocks

    [ https://issues.apache.org/jira/browse/HBASE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746034#action_12746034 ] 

Jean-Daniel Cryans commented on HBASE-1782:
-------------------------------------------

First the flusher waits:

{code}
"RegionServer:0.cacheFlusher" daemon prio=10 tid=0x6e4f2400 nid=0x26ec waiting on condition [0x6d6e1000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.checkStoreFileCount(MemStoreFlusher.java:295)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:225)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:149)
{code}

Then the compactor tries to record a split in the region historian, but its put never returns:

{code}
"RegionServer:0.compactor" daemon prio=10 tid=0x6dda1800 nid=0x26ed in Object.wait() [0x6d690000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x71c26f80> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:717)
	- locked <0x71c26f80> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:328)
	at $Proxy1.put(Unknown Source)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$2.call(HConnectionManager.java:1033)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$2.call(HConnectionManager.java:1031)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:923)
	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1030)
	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:582)
	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:448)
	- locked <0x758461f0> (a org.apache.hadoop.hbase.client.HTable)
	at org.apache.hadoop.hbase.RegionHistorian.add(RegionHistorian.java:239)
	at org.apache.hadoop.hbase.RegionHistorian.add(RegionHistorian.java:217)
	at org.apache.hadoop.hbase.RegionHistorian.addRegionSplit(RegionHistorian.java:176)
	at org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:672)
{code}

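because the IPC handlers are stuck on the MemStoreFlusher lock, which handler 1 holds while it sleeps in checkStoreFileCount: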

{code}
"IPC Server handler 5 on 34979" daemon prio=10 tid=0x08595c00 nid=0x26fc waiting for monitor entry [0x6d0c7000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:320)
	- waiting to lock <0x74a99ff0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1809)
	at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

"IPC Server handler 3 on 34979" daemon prio=10 tid=0x08311c00 nid=0x26fa waiting for monitor entry [0x6d169000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:320)
	- waiting to lock <0x74a99ff0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.delete(HRegionServer.java:1998)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

"IPC Server handler 1 on 34979" daemon prio=10 tid=0x08471800 nid=0x26f8 waiting on condition [0x6d20b000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.checkStoreFileCount(MemStoreFlusher.java:295)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:225)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushSomeRegions(MemStoreFlusher.java:352)
	- locked <0x74a99ff0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:321)
	- locked <0x74a99ff0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1809)
	at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

{code}
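
To make the pattern in these traces easier to see, here is a minimal sketch, not HBase code. It assumes the stall comes down to one thread sleeping while it owns a monitor that another thread needs in order to make the progress the sleeper is waiting for. Every name in it (FlusherStallSketch, flusherLock, waitHoldingLock, compactorPut, runScenario) is hypothetical and only mirrors the roles seen above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative only: none of these names exist in HBase.
class FlusherStallSketch {
    // Stands in for the MemStoreFlusher monitor seen in the traces.
    private final Object flusherLock = new Object();
    // Set once the "compactor" manages to get its put through.
    private volatile boolean compactionDone = false;

    // Handler path: takes the lock, then sleeps in a loop until the
    // compaction finishes or maxWaitMs elapses. Thread.sleep() does not
    // release the monitor, so nobody else can enter while we wait.
    long waitHoldingLock(long maxWaitMs, CountDownLatch lockHeld) throws InterruptedException {
        long start = System.nanoTime();
        long deadline = start + TimeUnit.MILLISECONDS.toNanos(maxWaitMs);
        synchronized (flusherLock) {
            lockHeld.countDown();
            while (!compactionDone && deadline - System.nanoTime() > 0) {
                Thread.sleep(10);
            }
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Compactor path: its put needs the same lock before it can finish.
    void compactorPut() {
        synchronized (flusherLock) {
            compactionDone = true;
        }
    }

    // Runs both threads and returns how long the handler waited, in ms.
    static long runScenario(long maxWaitMs) {
        FlusherStallSketch s = new FlusherStallSketch();
        CountDownLatch lockHeld = new CountDownLatch(1);
        long[] waitedMs = new long[1];
        Thread handler = new Thread(() -> {
            try {
                waitedMs[0] = s.waitHoldingLock(maxWaitMs, lockHeld);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread compactor = new Thread(s::compactorPut);
        try {
            handler.start();
            lockHeld.await();   // make sure the handler owns the lock first
            compactor.start();  // blocks on flusherLock until the handler gives up
            handler.join();
            compactor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return waitedMs[0];
    }

    public static void main(String[] args) {
        System.out.println("handler waited ~" + runScenario(200) + " ms");
    }
}
```

In this sketch the handler always waits out its full timeout, because the only thread that could end the wait early is blocked on the lock the handler holds. That would be consistent with the stall being temporary rather than permanent: it clears once the sleeper gives up and releases the monitor.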

> Self-inflicted RS temporary deadlocks
> -------------------------------------
>
>                 Key: HBASE-1782
>                 URL: https://issues.apache.org/jira/browse/HBASE-1782
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.20.1, 0.21.0
>
>
> When a region has too many store files and its region server also holds .META., it is easy to get a 90s deadlock. I will paste the stack traces in a moment.
