Posted to issues@hbase.apache.org by "Yunfan Zhong (JIRA)" <ji...@apache.org> on 2014/02/04 22:46:10 UTC

[jira] [Created] (HBASE-10464) Race condition during RS shutdown that could cause data loss

Yunfan Zhong created HBASE-10464:
------------------------------------

             Summary: Race condition during RS shutdown that could cause data loss
                 Key: HBASE-10464
                 URL: https://issues.apache.org/jira/browse/HBASE-10464
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.89-fb
            Reporter: Yunfan Zhong
            Priority: Critical
             Fix For: 0.89-fb


Bug scenario (T* are timestamps, with T1 < T2 < ... < Tn):
1. Master assigns a region to the RS at T1.
2. The RS works on opening the region from T1 to T3.
3. While the region is still being opened, the RS starts to shut down at T2; the DFS client is closed at T5.
4. As a step of RS shutdown, the regions owned by the RS are closed, except for the newly opened region. That region is online from T3 to T5 and holds mutations in memory that arrived after its last flush at T4, if a flush happened at all.
5. Because the master considers the RS shutdown clean, no log splitting takes place, and the HLog is moved to the old logs directory as usual.
6. Mutations made between T4 and T5 (or between T3 and T5 if there was no flush at T4) are never flushed. They exist only in the WAL, and only if the WAL is turned on.

The fix is to prevent a region open from succeeding once the RS has started shutting down.
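
A minimal sketch of that idea in Java, for illustration only. It is not the actual HBase patch: the class RegionOpenGuardSketch, its methods finishOpen and beginShutdown, and the single-lock design are all hypothetical. The point it shows is that the "is the RS stopping?" check and the "mark region online" step happen under the same lock that shutdown uses to take its list of regions to close, so a region can no longer come online after that list was taken.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from the HBase code base.
final class RegionOpenGuardSketch {
    private final Object lock = new Object();
    private boolean stopping = false;
    private final Set<String> onlineRegions = new HashSet<>();

    /**
     * Last step of opening a region. Returns true if the region was brought
     * online, false if the open was refused because shutdown already began.
     */
    boolean finishOpen(String regionName) {
        synchronized (lock) {
            if (stopping) {
                // Refuse the open: a caller would then report a failed open
                // so the region can be assigned elsewhere.
                return false;
            }
            onlineRegions.add(regionName);
            return true;
        }
    }

    /**
     * First step of shutdown. Marks the server as stopping and returns the
     * regions that must be flushed and closed. Because finishOpen takes the
     * same lock, no region can be added after this set is handed out.
     */
    Set<String> beginShutdown() {
        synchronized (lock) {
            stopping = true;
            Set<String> toClose = new HashSet<>(onlineRegions);
            onlineRegions.clear();
            return toClose;
        }
    }
}
```

In terms of the scenario above, a region whose open completes before shutdown begins is included in the set to close, while an open that completes after T2 is refused instead of leaving an online region that nobody flushes.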



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)