Posted to dev@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2009/08/21 18:02:14 UTC
[jira] Commented: (HBASE-1782) Self-inflicted RS temporary deadlocks
[ https://issues.apache.org/jira/browse/HBASE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746034#action_12746034 ]
Jean-Daniel Cryans commented on HBASE-1782:
-------------------------------------------
First the flusher waits:
{code}
"RegionServer:0.cacheFlusher" daemon prio=10 tid=0x6e4f2400 nid=0x26ec waiting on condition [0x6d6e1000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.checkStoreFileCount(MemStoreFlusher.java:295)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:225)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:149)
{code}
Then the compactor tries to record a split in the region historian but cannot complete the put:
{code}
"RegionServer:0.compactor" daemon prio=10 tid=0x6dda1800 nid=0x26ed in Object.wait() [0x6d690000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x71c26f80> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
at java.lang.Object.wait(Object.java:485)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:717)
- locked <0x71c26f80> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:328)
at $Proxy1.put(Unknown Source)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$2.call(HConnectionManager.java:1033)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$2.call(HConnectionManager.java:1031)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:923)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1030)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:582)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:448)
- locked <0x758461f0> (a org.apache.hadoop.hbase.client.HTable)
at org.apache.hadoop.hbase.RegionHistorian.add(RegionHistorian.java:239)
at org.apache.hadoop.hbase.RegionHistorian.add(RegionHistorian.java:217)
at org.apache.hadoop.hbase.RegionHistorian.addRegionSplit(RegionHistorian.java:176)
at org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:672)
{code}
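because the IPC handlers are stuck: handler 1 sleeps in checkStoreFileCount while holding the MemStoreFlusher lock, and handlers 5 and 3 block waiting for that same lock: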
{code}
"IPC Server handler 5 on 34979" daemon prio=10 tid=0x08595c00 nid=0x26fc waiting for monitor entry [0x6d0c7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:320)
- waiting to lock <0x74a99ff0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1809)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
"IPC Server handler 3 on 34979" daemon prio=10 tid=0x08311c00 nid=0x26fa waiting for monitor entry [0x6d169000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:320)
- waiting to lock <0x74a99ff0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
at org.apache.hadoop.hbase.regionserver.HRegionServer.delete(HRegionServer.java:1998)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
"IPC Server handler 1 on 34979" daemon prio=10 tid=0x08471800 nid=0x26f8 waiting on condition [0x6d20b000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.checkStoreFileCount(MemStoreFlusher.java:295)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:225)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushSomeRegions(MemStoreFlusher.java:352)
- locked <0x74a99ff0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:321)
- locked <0x74a99ff0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1809)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
{code}
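The pattern in the traces above (one handler sleeping while it holds the MemStoreFlusher monitor, other handlers blocked on that monitor) can be reproduced in isolation. What follows is only a minimal sketch, not HBase code: the class FlusherLockSketch, its fields, and its method names are all hypothetical stand-ins, and a latch replaces the timed sleep in checkStoreFileCount so the example finishes quickly.

```java
import java.util.concurrent.CountDownLatch;

public class FlusherLockSketch {
    // Hypothetical stand-in for the MemStoreFlusher monitor seen in the traces.
    private final Object flusherLock = new Object();
    // Signals that the first handler is inside the synchronized block.
    private final CountDownLatch holderInside = new CountDownLatch(1);
    // Stand-in for "compaction finished"; the real code sleeps on a timer instead.
    private final CountDownLatch compactionDone = new CountDownLatch(1);

    // Like "IPC Server handler 1": waits for compaction while holding the lock.
    void holdLockWhileWaiting() {
        synchronized (flusherLock) {
            holderInside.countDown();
            try {
                compactionDone.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Like "IPC Server handler 5": only needs the lock briefly, but cannot get it.
    void reclaim() {
        synchronized (flusherLock) {
            // nothing to do in this sketch
        }
    }

    // Returns the last thread state observed for the second handler
    // while the first handler still holds the lock.
    static Thread.State observeSecondHandler() {
        FlusherLockSketch sketch = new FlusherLockSketch();
        Thread holder = new Thread(sketch::holdLockWhileWaiting, "handler-1");
        Thread waiter = new Thread(sketch::reclaim, "handler-5");
        try {
            holder.start();
            sketch.holderInside.await();      // lock is now held
            waiter.start();
            Thread.State seen = waiter.getState();
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (seen != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(1);
                seen = waiter.getState();
            }
            sketch.compactionDone.countDown(); // let the holder release the lock
            holder.join();
            waiter.join();
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second handler state: " + observeSecondHandler());
    }
}
```

In the real region server the wait cannot end early: the compactor that would reduce the store file count is itself stuck on a put to the same server, so the handlers stay blocked until the wait times out. The sketch leaves that cycle out and only shows why holding the monitor across a wait stalls every other handler that needs it.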
> Self-inflicted RS temporary deadlocks
> -------------------------------------
>
> Key: HBASE-1782
> URL: https://issues.apache.org/jira/browse/HBASE-1782
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.20.0
> Reporter: Jean-Daniel Cryans
> Fix For: 0.20.1, 0.21.0
>
>
> When a region has too many store files and the region server hosting it also holds .META., it is easy to get a 90s deadlock. I will paste the stack traces in a moment.