Posted to user@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2010/06/18 21:25:13 UTC

Re: How to recover from an attempt to connect to an unavailable region?

> While I'm trying to figure out what is causing the region to be non-responsive, what's the best way to recover?

I'd recover from that the same way I'd recover from any unavailable
storage component, since that can happen with any DB/SAN/etc right?
The link between your client and HBase could be cut, or a split took
way too much time, or a more serious issue like your whole cluster
went down... your choices are either to fail the user's request (if
it's user-facing), or buffer the edits somewhere until the problem is
resolved, or any other nifty trick you can think of.
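
To make the "fail or buffer" choice concrete, here is a minimal sketch
in plain Java. EditSink and BufferingWriter are hypothetical names made
up for illustration, not part of the HBase client API; the sink stands
in for whatever actually performs the write. It assumes an unavailable
store surfaces as an IOException and that a bounded in-memory buffer is
acceptable, which means buffered edits are lost if the process dies.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for whatever actually writes to the store. */
interface EditSink<E> {
    void write(E edit) throws IOException;
}

/** Retries a write a few times, then buffers the edit or fails the request. */
class BufferingWriter<E> {
    private final EditSink<E> sink;
    private final int maxAttempts;   // attempts per edit before giving up
    private final int maxBuffered;   // cap on edits held in memory
    private final Deque<E> pending = new ArrayDeque<>();

    BufferingWriter(EditSink<E> sink, int maxAttempts, int maxBuffered) {
        this.sink = sink;
        this.maxAttempts = maxAttempts;
        this.maxBuffered = maxBuffered;
    }

    /**
     * Returns true if the edit was written, false if it was buffered.
     * Throws IllegalStateException when the buffer is full, i.e. the
     * caller should fail the user's request.
     */
    boolean submit(E edit) {
        // Only write directly when nothing is queued, to preserve ordering.
        if (pending.isEmpty() && tryWrite(edit)) {
            return true;
        }
        if (pending.size() >= maxBuffered) {
            throw new IllegalStateException("store unavailable and buffer full");
        }
        pending.addLast(edit);
        return false;
    }

    /** Replays buffered edits in order; stops at the first one that still fails. */
    int flush() {
        int written = 0;
        while (!pending.isEmpty() && tryWrite(pending.peekFirst())) {
            pending.removeFirst();
            written++;
        }
        return written;
    }

    int pendingCount() {
        return pending.size();
    }

    private boolean tryWrite(E edit) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sink.write(edit);
                return true;
            } catch (IOException e) {
                // Real code would log and sleep with backoff between attempts.
            }
        }
        return false;
    }
}
```

A caller would invoke flush() periodically (or before each submit) once
the region is reachable again, and treat the IllegalStateException as
the signal to return an error to the user.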

BTW we are planning to clean up the client retry policy, see
https://issues.apache.org/jira/browse/HBASE-2445

J-D

On Fri, Jun 18, 2010 at 6:13 AM, Michael Segel
<mi...@hotmail.com> wrote:
>
> Hi,
>
> Here's the situation ...
>
> Something is futzing up our HBase tables and when processes try to run a query, they end up getting an error of trying to connect to a region which isn't responding. After 10 tries it fails.
>
> While I'm trying to figure out what is causing the region to be non-responsive, what's the best way to recover?
>
> It looks like ZK and the Region Server are out of sync. But that's a guess.
> I had this problem late last night and I was using one of the tools in HBase Shell that seemed to correct it and then I could truncate the table. (Not sure which one did it.)
>
> Thx
>
> -Mike