Posted to dev@hbase.apache.org by "Justin Lynn (JIRA)" <ji...@apache.org> on 2009/09/04 00:26:57 UTC

[jira] Created: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver
-----------------------------------------------------------------------------------------------

                 Key: HBASE-1815
                 URL: https://issues.apache.org/jira/browse/HBASE-1815
             Project: Hadoop HBase
          Issue Type: Bug
          Components: client
    Affects Versions: 0.20.0
         Environment: Ubuntu Linux (Linux <elided> 2.6.24-23-generic #1 SMP Wed Apr 1 21:43:24 UTC 2009 x86_64 GNU/Linux), java version "1.6.0_06", Java(TM) SE Runtime Environment (build 1.6.0_06-b02), Java HotSpot(TM) 64-Bit Server VM (build 10.0-b22, mixed mode)
            Reporter: Justin Lynn
             Fix For: 0.20.1


While using the HBase Thrift server, if a regionserver goes down due to shutdown or failure, clients will time out because the Thrift server cannot contact the dead regionserver.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "Justin Lynn (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Lynn updated HBASE-1815:
-------------------------------

    Attachment: thrift_server_log_excerpt
                thrift_server_threaddump

These are the Thrift server thread dump and log excerpt from the time the failure was noticed.


[jira] Commented: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751208#action_12751208 ] 

stack commented on HBASE-1815:
------------------------------

Working with JSharp and looking at the thread dumps, it looks like each thread has to do ten retries, sleeping a second between each retry.  With many threads, we get a lot of messages in the log about the failure to connect.  We need to recognize a dead remote side and handle it promptly.


[jira] Commented: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758264#action_12758264 ] 

Jean-Daniel Cryans commented on HBASE-1815:
-------------------------------------------

+1 patch seems good. Apart from unit tests and loading, did you try killing some region servers?


[jira] Commented: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757671#action_12757671 ] 

stack commented on HBASE-1815:
------------------------------

I had this patch installed in my overnight loading.  The upload worked about the same as usual, so this patch doesn't seem to change the basic workings.


[jira] Commented: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755760#action_12755760 ] 

stack commented on HBASE-1815:
------------------------------

HBaseClient also has this issue.  From the mailing list:

Yeah, this is down in the guts of the hadoop rpc we use.  Around connection setup it has its own configuration, which is not well aligned with ours (ours being the retries and pause settings).

The max retries setting down in ipc is

this.maxRetries = conf.getInt("ipc.client.connect.max.retries", 10);

That's for an IOException other than a timeout.  For a timeout, it does this:

          } catch (SocketTimeoutException toe) {
            /* The max number of retries is 45,
             * which amounts to 20s*45 = 15 minutes retries.
             */
            handleConnectionFailure(timeoutFailures++, 45, toe);

Let me file an issue to address the above.  The retries should be our retries, and in here it has a hard-coded 1000ms that should instead be our pause.  Not hard to fix.
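
To illustrate the misalignment described above, here is a minimal sketch that uses java.util.Properties as a stand-in for the Hadoop Configuration object.  The key ipc.client.connect.max.retries and its default of 10 come from the snippet above.  The key hbase.client.retries.number, its default, and the class and method names are assumptions made for illustration, not taken from this issue.

```java
import java.util.Properties;

public class RetryConfigSketch {
    // Mimics a getInt(key, default) lookup; Properties stands in for Hadoop's Configuration.
    public static int getInt(Properties conf, String key, int defaultValue) {
        String raw = conf.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // An operator lowers the client-level retries (key name is an assumption).
        conf.setProperty("hbase.client.retries.number", "2");

        int clientRetries = getInt(conf, "hbase.client.retries.number", 10);
        // The ipc layer reads its own key, so it still falls back to its default of 10.
        int ipcRetries = getInt(conf, "ipc.client.connect.max.retries", 10);

        System.out.println("client retries = " + clientRetries);
        System.out.println("ipc connect retries = " + ipcRetries);
    }
}
```

Because the two layers read different keys, tuning one leaves the other untouched, which is the mismatch the comment above points at.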


[jira] Commented: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757673#action_12757673 ] 

stack commented on HBASE-1815:
------------------------------

All unit tests pass.


[jira] Updated: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1815:
-------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed branch and trunk.


[jira] Updated: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1815:
-------------------------

    Attachment: ipctimeout.patch

This patch might be a bit radical, but here it goes.

The high-level motivation is to undo the retrying and sleeping down in ipc and let retrying be done at a higher level, up in the hbase client.

In ipc, socket setup had a timeout of 20 seconds.  Ipc then retries the socket setup ten times with a 1-second sleep in between.  That's 210 seconds or so (ten attempts of 20 seconds plus a 1-second sleep each) before we time out down in the guts of RPC.  We then go up to the retry logic in hbase (usually, not always), and then do ten retries with a 2-second pause in between (if setting up the connection threw a SocketTimeoutException, we'd retry a hard-coded 45 times, i.e. 15 minutes).

In Justin's case, I don't think we were hitting SocketTimeoutException, going by the stack trace.  It was more the 210 seconds per thread, but my guess is that his thrift client had probably timed out already.

This patch turns off retry down in the ipc client (let the upper layers do the retrying), changes hard-coded sleep times to be the hbase.client.pause time (2 seconds), and removes the hard-coded 45.  It also adds an hbase prefix to the ipc configuration parameters in case we want different values from hadoop.

Let me try out this patch.  My guess is that there are places in hbase where we don't retry because we were dependent on ipc doing retry for us.  Let me find those.
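
The arithmetic behind the figures above can be checked with a small sketch.  The class and method names are hypothetical.  The inputs (ten attempts, a 20-second socket timeout, a 1-second sleep, and 45 attempts on the timeout path) are the ones quoted in this message, and the timeout path ignores sleeps so that it matches the 15-minute figure.

```java
public class RetryBudgetSketch {
    // Worst-case seconds spent when every attempt times out and is followed by a sleep.
    public static long worstCaseSeconds(int attempts, long timeoutSeconds, long sleepSeconds) {
        return attempts * (timeoutSeconds + sleepSeconds);
    }

    public static void main(String[] args) {
        // ipc connection setup: ten attempts, 20 s socket timeout, 1 s sleep.
        long ipc = worstCaseSeconds(10, 20, 1);
        // SocketTimeoutException path: 45 attempts at 20 s each, sleeps ignored.
        long timeoutPath = worstCaseSeconds(45, 20, 0);

        System.out.println("ipc worst case: " + ipc + " s");            // 210 s
        System.out.println("timeout path: " + timeoutPath + " s ("
                + timeoutPath / 60 + " min)");                          // 900 s (15 min)
    }
}
```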


[jira] Updated: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "Justin Lynn (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Lynn updated HBASE-1815:
-------------------------------

    Attachment: thrift_server_threaddump_1

Another thread dump.


[jira] Updated: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1815:
-------------------------

    Assignee: stack
      Status: Patch Available  (was: Open)


[jira] Commented: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758280#action_12758280 ] 

stack commented on HBASE-1815:
------------------------------

Yes.  I should have said so.  I killed the master and watched what the regionservers did.  I also killed the cluster and watched the client.  It all seems to run more regularly, with less weird retrying.

Thanks for the review.


[jira] Updated: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1815:
-------------------------

    Attachment: hbaseclient-v3.patch

This version adds cleanup.

In the HRegionServer main run loop, wait before retrying rather than just running all retries without a pause.

Changed the HBaseRPC RetriesExhaustedException so it's about the failure to get a proxy instead of a wonky message about an unknown row.

Moved the get of a regionserver connection into the try/catch so that if it fails, it's retried.

This patch changes how our retrying from the client and from the servers works.  I tested it on a cluster and it seems more regular and 'live' now than before, but I may have missed cases where we used to rely on the rpc retry.  I'm not sure how to find those other than to commit and wait till someone complains.

Review appreciated.
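
The retry shape described above (pause between attempts, connection setup inside the try block, a clear exception once retries run out) might look roughly like the following sketch.  It is not the code from hbaseclient-v3.patch: the class, the method withRetries, and the RetriesExhausted exception are hypothetical stand-ins, and the failing operation in main is simulated.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class ClientRetrySketch {
    /** Thrown when every attempt has failed; a stand-in, not HBase's RetriesExhaustedException. */
    public static class RetriesExhausted extends IOException {
        public RetriesExhausted(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs the operation up to maxAttempts times, sleeping pauseMillis after each failure
     * except the last. Connection setup belongs inside the operation so that a failed
     * connect is retried like any other failure.
     */
    public static <T> T withRetries(Callable<T> operation, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (InterruptedException e) {
                throw e; // do not swallow interrupts as ordinary failures
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // wait before retrying
                }
            }
        }
        throw new RetriesExhausted("Failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        String result = withRetries(() -> {
            // Simulate a connection that fails twice before succeeding.
            if (++calls[0] < 3) {
                throw new IOException("connection refused");
            }
            return "connected";
        }, 10, 10);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

Keeping the pause and the attempt count as parameters means a single layer owns the retry policy, which is the direction this patch takes.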


[jira] Commented: (HBASE-1815) HBaseClient can get stuck in an infinite loop while attempting to contact a failed regionserver

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751212#action_12751212 ] 

Andrew Purtell commented on HBASE-1815:
---------------------------------------

Clients can monitor RS liveness via ZK and respond quickly via watches? 
