Posted to issues@hbase.apache.org by "Sean Busbey (Jira)" <ji...@apache.org> on 2019/08/25 20:36:00 UTC

[jira] [Commented] (HBASE-22918) RegionServer violates failfast fault assumption

    [ https://issues.apache.org/jira/browse/HBASE-22918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915344#comment-16915344 ] 

Sean Busbey commented on HBASE-22918:
-------------------------------------

I suspect this might be better suited to the mailing list dev@hbase. It's not clear to me if you're claiming you've found incorrect behavior or if you're asking about expected behavior.

Here's my attempt at describing the scenario I think you're setting up.

We have a master process, a region server process, and a ZK ensemble that acts as a liveness check.

1) A client is talking to the RS process.
2) The RS process is properly writing to HDFS.
3) The RS process cannot talk to ZK during some window that starts after the client does and lasts long enough to hit the ZK session timeout (e.g. a partition injected with iptables, as sketched below).
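
A minimal sketch of injecting that kind of partition the way the reporter describes, assuming the default ZK client port 2181 and run on the region server host:

    # drop all outbound traffic from this RS host to the ZK quorum (default client port 2181)
    iptables -A OUTPUT -p tcp --dport 2181 -j DROP
    # remove the rule later to heal the partition
    iptables -D OUTPUT -p tcp --dport 2181 -j DROP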

Have I accurately described things?

In that scenario, what should happen is:

1) The ZK node for the RS will expire because the RS is not heartbeating.
2) The master will see that this has happened and will forcefully recover the HDFS lease on the WALs for that RS.
3) The master will then process recovery of those WALs and assign the regions elsewhere.
4) If the RS heartbeats to the master after #2, the master will send it a "you are dead" and the RS should abort.
5) If the RS attempts to write to the WAL after #2, the write will fail because the RS no longer holds the lease, and the RS should abort.
6) A client attempting to send writes to the RS after #2 will also fail, because the write to the WAL will fail.
7) Presuming retries are configured correctly (see the config sketch below), the client will eventually send writes to whichever region server the master picked in #3.
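
For reference on #7, a minimal sketch of the client retry settings involved, as they might appear in hbase-site.xml; the values are illustrative, not recommendations:

    <!-- illustrative values only; tune for your deployment -->
    <property>
      <name>hbase.client.retries.number</name>
      <value>15</value> <!-- how many times a client operation is retried -->
    </property>
    <property>
      <name>hbase.client.pause</name>
      <value>100</value> <!-- base pause in ms between retries, grown by backoff -->
    </property>
    <property>
      <name>hbase.client.operation.timeout</name>
      <value>1200000</value> <!-- overall cap in ms on a single client operation -->
    </property>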

Are you observing something other than the above?



> RegionServer violates failfast fault assumption
> -----------------------------------------------
>
>                 Key: HBASE-22918
>                 URL: https://issues.apache.org/jira/browse/HBASE-22918
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ranpanfeng
>            Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> HBase 2.1.5 is being tested and verified carefully before it is deployed in our production environment. We pay close attention to NP (network partition) faults, so NP fault injection tests are conducted in our test environment. Some findings are described below.
> I use YCSB to write data into table SYSTEM:test, which resides on regionserver0; during the write I use iptables to drop every packet from regionserver0 to the zookeeper quorum. After the default zookeeper.session.timeout (90s), regionserver0 throws YouAreDeadException once its retries to connect to zookeeper end in TimeoutException. regionserver0 then kills itself, but before regionserver0 invokes completeFile on the WAL, the active master has already considered regionserver0 dead prematurely and invokes recoverLease to forcibly close regionserver0's WAL.
> In a trusted IDC, distributed storage assumes that errors are always failstop/failfast faults and that there are no Byzantine failures. So in the above scenario, the active master should take over the WAL on regionserver0 only after regionserver0 has successfully killed itself. According to the lease protocol, the RS should kill itself within a lease period, the active master should take over the WAL only after a grace period has elapsed, and the invariant "lease period < grace period" should always hold. In hbase-site.xml only one config property, "zookeeper.session.timeout", is given; I think we should provide two properties:
>   1. regionserver.zookeeper.session.timeout
>   2. master.zookeeper.session.timeout
> An HBase admin can then tune regionserver.zookeeper.session.timeout to be smaller than master.zookeeper.session.timeout. In this way, the failstop assumption is guaranteed.
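
For concreteness, a rough sketch of how the reporter's proposed split could look in hbase-site.xml. Both property names are only the reporter's suggestion and do not exist in HBase today; the values are illustrative, chosen so the RS-side timeout is shorter than the master-side grace period:

    <!-- hypothetical properties from the reporter's proposal; not present in HBase -->
    <property>
      <name>regionserver.zookeeper.session.timeout</name>
      <value>60000</value> <!-- ms; the RS would give up and abort sooner -->
    </property>
    <property>
      <name>master.zookeeper.session.timeout</name>
      <value>90000</value> <!-- ms; the master would wait longer before recovering the WAL -->
    </property>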



--
This message was sent by Atlassian Jira
(v8.3.2#803003)