Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2009/01/30 02:52:59 UTC

[jira] Created: (HBASE-1163) Master root scanner hung, clients blocked indefinitely waiting for getStartKeys()

Master root scanner hung, clients blocked indefinitely waiting for getStartKeys()
---------------------------------------------------------------------------------

                 Key: HBASE-1163
                 URL: https://issues.apache.org/jira/browse/HBASE-1163
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.19.0
            Reporter: Andrew Purtell
            Priority: Critical


MapReduce tasks based on TableInputFormat (TIF) won't start. Clients trying to find regions by start key block indefinitely (the Heritrix hbase writer eventually times out the archiver).

The master seems hung in the root scan. I've dumped thread stacks 10 times in 10 minutes and the same HBaseClient$Call object appears in the trace each time. See below:

Thread 21 (RegionManager.rootScanner):
  State: WAITING
  Blocked count: 500
  Waited count: 621
  Waiting on org.apache.hadoop.hbase.ipc.HBaseClient$Call@55a2896d
  Stack:
    java.lang.Object.wait(Native Method)
    java.lang.Object.wait(Object.java:485)
    org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:695)
    org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:321)
    $Proxy2.next(Unknown Source)
    org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:161)
    org.apache.hadoop.hbase.master.RootScanner.scanRoot(RootScanner.java:55)
    org.apache.hadoop.hbase.master.RootScanner.maintenanceScan(RootScanner.java:80)
    org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:137)
    org.apache.hadoop.hbase.Chore.run(Chore.java:65)

I only see messages from MetaScanner in the master log, nothing from RootScanner.
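
The repeated-dump check described above can be automated. The following is a minimal sketch, not part of HBase: it uses the standard java.lang.management API to sample one thread by name and report whether it stays in the same state on the same lock object across several samples. The class name StuckThreadSampler and its methods are hypothetical, and the code has to run inside the JVM being inspected (or be adapted to a remote JMX connection).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: samples one thread's state and the object it waits on. */
final class StuckThreadSampler {

    /** Returns "STATE on lockName" for the named thread, or null if no live thread has that name. */
    static String sample(String threadName) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadName().equals(threadName)) {
                // getLockName() has the form ClassName@identityHashHex, the same
                // shape as the "Waiting on ...HBaseClient$Call@55a2896d" line.
                return info.getThreadState() + " on " + info.getLockName();
            }
        }
        return null;
    }

    /** True if all samples, taken intervalMs apart, are identical and non-null. */
    static boolean looksStuck(String threadName, int samples, long intervalMs) {
        String first = sample(threadName);
        if (first == null) {
            return false;
        }
        for (int i = 1; i < samples; i++) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            if (!first.equals(sample(threadName))) {
                return false;
            }
        }
        return true;
    }
}
```

Calling looksStuck("RegionManager.rootScanner", 10, 60000) would mirror the ten dumps over ten minutes. An idle thread parked on a long-lived queue would also match, so the result only means something when the lock is a per-request object such as a call.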

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-1163) Master root scanner hung, clients blocked indefinitely waiting for getStartKeys()

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1163:
----------------------------------

    Attachment: stacks-1163.1.zip

Attached full stack dumps from master and all HRS. 



[jira] Commented: (HBASE-1163) Master root scanner hung, clients blocked indefinitely waiting for getStartKeys()

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668717#action_12668717 ] 

Andrew Purtell commented on HBASE-1163:
---------------------------------------

A cluster in this state cannot be shut down cleanly. The master splits all HRS logs and stays up. The HRS processes also stay up and won't go down short of kill -9.
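
One general way a Java process can end up ignoring a shutdown request is a thread parked in an untimed wait that only checks its stop flag after waking. Whether that is what happens here is not established by the dumps. The sketch below uses a hypothetical BlockedWorker class (not HBase code) to show the pattern: setting the flag alone does not end the thread, while an interrupt does.

```java
/** Hypothetical worker: a stop flag alone cannot end a thread parked in wait(). */
final class BlockedWorker implements Runnable {
    private final Object response = new Object();
    private volatile boolean stopRequested;

    /** Sets the flag but wakes nobody. */
    void requestStop() {
        stopRequested = true;
    }

    @Override
    public void run() {
        synchronized (response) {
            while (!stopRequested) {
                try {
                    // Untimed wait: only notify/notifyAll or an interrupt wakes it,
                    // so the flag is not re-checked while the thread is parked.
                    response.wait();
                } catch (InterruptedException e) {
                    return; // an interrupt is the only way out in this sketch
                }
            }
        }
    }
}
```

A non-daemon thread in this state keeps the JVM alive after the main shutdown path has finished, which is consistent with needing kill -9.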



[jira] Commented: (HBASE-1163) Master root scanner hung, clients blocked indefinitely waiting for getStartKeys()

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668709#action_12668709 ] 

Andrew Purtell commented on HBASE-1163:
---------------------------------------

Nothing amiss on the HRS hosting ROOT as far as I can see. 

Thread 311 (IPC Client (47) connection to sjdc-atr-dc-2.atr.trendmicro.com/10.30.94.31:60000 from an unknown user):
  State: TIMED_WAITING
  Blocked count: 1595
  Waited count: 1595
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.waitForWork(HBaseClient.java:400)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:442)




[jira] Issue Comment Edited: (HBASE-1163) Master root scanner hung, clients blocked indefinitely waiting for getStartKeys()

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668709#action_12668709 ] 

apurtell edited comment on HBASE-1163 at 1/29/09 6:08 PM:
----------------------------------------------------------------

Nothing amiss on the HRS hosting ROOT as far as I can see. 

Thread 14 (IPC Server listener on 60020):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 1
  Stack:
    sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
    org.apache.hadoop.hbase.ipc.HBaseServer$Listener.run(HBaseServer.java:299)
Thread 16 (IPC Server Responder):
  State: RUNNABLE
  Blocked count: 222
  Waited count: 181
  Stack:
    sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    org.apache.hadoop.hbase.ipc.HBaseServer$Responder.run(HBaseServer.java:458)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.waitForWork(HBaseClient.java:400)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:442)

Here is the corresponding IPC thread on the master:

Thread 309 (IPC Client (47) connection to /10.30.94.32:60020 from an unknown user):
  State: RUNNABLE
  Blocked count: 1
  Waited count: 1
  Stack:
    sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
    org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
    org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    java.io.FilterInputStream.read(FilterInputStream.java:116)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:276)
    java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    java.io.DataInputStream.readInt(DataInputStream.java:370)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:498)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:443)
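
The trace above shows the master's IPC reader inside PingInputStream.read under receiveResponse. Assuming HBaseClient's PingInputStream behaves like the one in the Hadoop IPC client (an assumption, not verified against the 0.19.0 source), a socket read timeout triggers a ping and a retry instead of an error, so the reader keeps waiting for as long as the connection stays open. A minimal sketch of that pattern, with a hypothetical class name and a hypothetical sendPing hook:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;

/** Hypothetical wrapper: on a read timeout, send a keep-alive and retry instead of failing. */
final class RetryOnTimeoutInputStream extends FilterInputStream {
    private final Runnable sendPing; // assumed hook that writes a ping to the peer
    private int timeouts;            // number of timeouts swallowed so far

    RetryOnTimeoutInputStream(InputStream in, Runnable sendPing) {
        super(in);
        this.sendPing = sendPing;
    }

    @Override
    public int read() throws IOException {
        while (true) {
            try {
                return super.read();
            } catch (SocketTimeoutException e) {
                // The peer is silent but the connection is still open: ping
                // and wait again. If the peer never answers the outstanding
                // request, this loop never exits.
                timeouts++;
                sendPing.run();
            }
        }
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        while (true) {
            try {
                return super.read(buf, off, len);
            } catch (SocketTimeoutException e) {
                timeouts++;
                sendPing.run();
            }
        }
    }

    int swallowedTimeouts() {
        return timeouts;
    }
}
```

With this design a live but unresponsive server looks the same to the caller as a slow one: the read neither fails nor returns, which matches a RUNNABLE reader thread sitting in epollWait across many dumps.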





[jira] Issue Comment Edited: (HBASE-1163) Master root scanner hung, clients blocked indefinitely waiting for getStartKeys()

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668709#action_12668709 ] 

apurtell edited comment on HBASE-1163 at 1/29/09 6:02 PM:
----------------------------------------------------------------

Nothing amiss on the HRS hosting ROOT as far as I can see. 

Thread 311 (IPC Client (47) connection to sjdc-atr-dc-2.atr.trendmicro.com/10.30.94.31:60000 from an unknown user):
  State: TIMED_WAITING
  Blocked count: 1595
  Waited count: 1595
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.waitForWork(HBaseClient.java:400)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:442)

Here is the corresponding IPC thread on the master:

Thread 309 (IPC Client (47) connection to /10.30.94.32:60020 from an unknown user):
  State: RUNNABLE
  Blocked count: 1
  Waited count: 1
  Stack:
    sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
    org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
    org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    java.io.FilterInputStream.read(FilterInputStream.java:116)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:276)
    java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    java.io.DataInputStream.readInt(DataInputStream.java:370)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:498)
    org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:443)





[jira] Commented: (HBASE-1163) Master root scanner hung, clients blocked indefinitely waiting for getStartKeys()

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668713#action_12668713 ] 

Andrew Purtell commented on HBASE-1163:
---------------------------------------

Every node in the cluster has an IPC connection open to /10.30.94.32:60020 (the HRS hosting ROOT) and all have the exact same stack trace as thread #309 in the comment above. 

In the other HRS stack traces I see that many are now hung on an RPC to the master out of CompactSplitThread: RegionHistorian.add -> HTable.commit -> HTable.flushCommits -> [...] TableServers.locateRegionInMeta -> $Proxy.getClosestRowBefore. The master is not returning from that RPC to getClosestRowBefore.
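
The callers pile up because each one waits for its response with no upper bound. As an illustration only (PendingCall is a hypothetical stand-in, not HBase's HBaseClient$Call), the sketch below contrasts an untimed wait, which never returns if the response is lost, with a deadline-based wait that surfaces the problem as a TimeoutException.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for an in-flight RPC. */
final class PendingCall {
    private Object value;
    private boolean done;

    /** Called by the receiver thread when the response arrives. */
    synchronized void complete(Object v) {
        value = v;
        done = true;
        notifyAll();
    }

    /** Blocks until complete() is called; never returns if the response is lost. */
    synchronized Object awaitForever() throws InterruptedException {
        while (!done) {
            wait();
        }
        return value;
    }

    /** Waits at most timeoutMs, then throws TimeoutException if no response arrived. */
    synchronized Object await(long timeoutMs) throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!done) {
            long remainingNs = deadline - System.nanoTime();
            if (remainingNs <= 0) {
                throw new TimeoutException("no response within " + timeoutMs + " ms");
            }
            // wait(0) means "wait forever", so round up to at least 1 ms.
            wait(Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNs)));
        }
        return value;
    }
}
```

A bounded wait does not fix whatever stops the server from answering, but it lets a caller such as a compaction thread fail, log, and retry instead of holding its resources indefinitely.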
