Posted to user@hbase.apache.org by Shushant Arora <sh...@gmail.com> on 2016/05/15 02:43:49 UTC

hbase zookeeper lag

Hi

HBase uses ZooKeeper for various purposes, e.g. for region splits.

The regionserver creates a znode in ZooKeeper in the splitting state, and the
master is notified about this znode. Since ZooKeeper is not fully
consistent, there may be a lag between the actual znode creation and the
notification, and in the meantime the regionserver will have started splitting.
1. Will this lag create an issue? The region has already been split in two, but
the master does not even know about it until the ZooKeeper lag has cleared.


Also, when a regionserver goes down the master is notified, but there can be
lag there too. A node in ZooKeeper could be lagging far behind, say
~2 minutes, so the master would be notified only after 2 minutes.
2. Won't this lag create an issue? The client will find the region unreachable
and will retry with backoff, but the actual recovery from the regionserver
failure will start only after 2 minutes?

Thanks!

Re: hbase zookeeper lag

Posted by Stack <st...@duboce.net>.
On Sat, May 14, 2016 at 7:43 PM, Shushant Arora <sh...@gmail.com>
wrote:

> Hi
>
> HBase uses ZooKeeper for various purposes, e.g. for region splits.
>
> The regionserver creates a znode in ZooKeeper in the splitting state, and the
> master is notified about this znode. Since ZooKeeper is not fully
> consistent, there may be a lag between the actual znode creation and the
> notification, and in the meantime the regionserver will have started splitting.
> 1. Will this lag create an issue? The region has already been split in two, but
> the master does not even know about it until the ZooKeeper lag has cleared.
>
>
Have you seen an issue? First the regionserver 'asks' the master if it is
ok to split (before splitting). If the master says it is ok by changing the
znode state, then the regionserver proceeds, notifying the master via a state
change in zk. There could be some lag here of course given there is an RPC
-- and in hbase 2.0, the intent is to undo our going via zk -- but we've
not had this identified as a problem. Have you seen it as so?
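
To make the ordering concrete, here is a rough sketch of that handshake in
Python. It is only a toy model: the SplitZnode class and the state names are
made up for illustration and are not HBase's actual classes or znode states.
The point it shows is that the split cannot start until the master has moved
the znode on from the "requested" state.

```python
from enum import Enum


class SplitState(Enum):
    # Illustrative state names, not HBase's actual znode states.
    REQUESTED = "requested"   # regionserver asks whether it may split
    SPLITTING = "splitting"   # master has approved; split may proceed
    SPLIT = "split"           # regionserver reports the split is done


class SplitZnode:
    """In-memory stand-in for the per-region split znode (hypothetical)."""

    def __init__(self):
        self.state = None

    def transition(self, expected, new):
        # Compare-and-set: refuse the update if the current state is not
        # the one the caller expects.
        if self.state is not expected:
            raise RuntimeError(f"expected {expected}, found {self.state}")
        self.state = new


def run_split(znode):
    """Walk the znode through the three steps and return a log of them."""
    log = []
    # 1. The regionserver asks first; nothing has been split yet.
    znode.transition(None, SplitState.REQUESTED)
    log.append("rs: requested split")
    # 2. The master approves by changing the znode state.
    znode.transition(SplitState.REQUESTED, SplitState.SPLITTING)
    log.append("master: approved")
    # 3. Only now does the regionserver split, then report completion.
    znode.transition(SplitState.SPLITTING, SplitState.SPLIT)
    log.append("rs: split done")
    return log
```

In this model, trying to jump straight to SPLITTING on a fresh znode raises,
because the master has not yet seen (and approved) a request.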


>
> Also, when a regionserver goes down the master is notified, but there can be
> lag there too. A node in ZooKeeper could be lagging far behind, say
> ~2 minutes, so the master would be notified only after 2 minutes.
>


There is an RPC, so there may be a lag, yes.



> 2. Won't this lag create an issue? The client will find the region unreachable
> and will retry with backoff, but the actual recovery from the regionserver
> failure will start only after 2 minutes?
>
>
(Where'd you get the two minutes?)

Yes, lag could slow down recovery.
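
As a back-of-the-envelope sketch of what that means for a client, the Python
below counts how many retries would fire before the master even hears about
the failure. The pause and the backoff multipliers are made-up illustrative
values, not HBase defaults, and the lag is just an input, whatever its cause.

```python
def retry_times(pause_ms, multipliers):
    """Cumulative times (ms after the first failure) of each client retry."""
    elapsed, times = 0, []
    for m in multipliers:
        elapsed += pause_ms * m   # wait pause * multiplier before this retry
        times.append(elapsed)
    return times


def retries_before_detection(detect_lag_ms, pause_ms, multipliers):
    """Count retries that fire before the master is notified."""
    return sum(1 for t in retry_times(pause_ms, multipliers) if t < detect_lag_ms)


# Hypothetical client settings, for illustration only.
PAUSE_MS = 100
MULTIPLIERS = [1, 2, 3, 5, 10, 20, 40, 100]

wasted_short_lag = retries_before_detection(5_000, PAUSE_MS, MULTIPLIERS)
wasted_long_lag = retries_before_detection(120_000, PAUSE_MS, MULTIPLIERS)
```

With these example numbers the last retry fires about 18 seconds in, so with
a 2 minute lag every retry is spent before recovery has started, while with
a 5 second lag only the first few are.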

St.Ack



> Thanks!
>