Posted to issues@hbase.apache.org by "Toshihiro Suzuki (JIRA)" <ji...@apache.org> on 2018/04/04 09:45:00 UTC
[jira] [Comment Edited] (HBASE-19893) restore_snapshot is broken in master branch when region splits

    [ https://issues.apache.org/jira/browse/HBASE-19893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425254#comment-16425254 ] 

Toshihiro Suzuki edited comment on HBASE-19893 at 4/4/18 9:44 AM:
------------------------------------------------------------------

Sorry for the late reply [~ram_krish],

{quote}
So this process of restore snapshot procs adds the in-memory info that the procedure has to the META. So when the table is enabled after restore snapshot, this META info is not taken as the source of truth, is it? Yeah, I think we may not know, after disabling and when we enable, whether the enable is from the snapshot or from somewhere else. In that sense this fix LGTM.
So the change in TestRestoreSnapshotFromClient, if run without the fix, would fail, and now it would pass, I believe.
{quote}
Yes, the META info should be the source of truth.
Currently, when restoring a snapshot, the restore snapshot procedure changes only the META info and doesn't change the in-memory state.
That's why this issue happens.
The fix in the patch adds logic to update the in-memory state as well.
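
To illustrate the mismatch, here is a toy sketch. It is not the code in the patch: the class, method, and region names (r1 to r4) are all hypothetical, and it assumes that enabling the table brings online whatever regions the in-memory state lists.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these names exist in HBase.
public class RestoreSnapshotSketch {
    // Hypothetical stand-in for the region list stored in the META table.
    private final Set<String> meta = new TreeSet<>();
    // Hypothetical stand-in for the Master's in-memory region states.
    private final Set<String> inMemory = new TreeSet<>();

    public RestoreSnapshotSketch(String... currentRegions) {
        meta.addAll(Arrays.asList(currentRegions));
        inMemory.addAll(Arrays.asList(currentRegions));
    }

    // Behaviour described before the fix: only META is rewritten.
    public void restoreMetaOnly(String... snapshotRegions) {
        meta.clear();
        meta.addAll(Arrays.asList(snapshotRegions));
    }

    // Idea of the fix: after rewriting META, bring the in-memory view in line with it.
    public void restoreAndSyncInMemory(String... snapshotRegions) {
        restoreMetaOnly(snapshotRegions);
        inMemory.clear();
        inMemory.addAll(meta);
    }

    // Assumption: enabling the table uses the in-memory view, not META.
    public Set<String> enable() {
        return new TreeSet<>(inMemory);
    }

    public static void main(String[] args) {
        // The table currently has r1, r3, r4; the snapshot was taken with r1, r2.
        RestoreSnapshotSketch before = new RestoreSnapshotSketch("r1", "r3", "r4");
        before.restoreMetaOnly("r1", "r2");
        System.out.println("without sync: " + before.enable()); // [r1, r3, r4]

        RestoreSnapshotSketch after = new RestoreSnapshotSketch("r1", "r3", "r4");
        after.restoreAndSyncInMemory("r1", "r2");
        System.out.println("with sync:    " + after.enable());  // [r1, r2]
    }
}
```

In this model, the run without the sync brings the stale post-split regions online, and the run with the sync brings only the snapshot's regions online.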

{quote}
If the Master crashes and gets restarted just after the restore snapshot procedure is run and then you enable the table, what happens? At least at that time, do we read from META?
{quote}
Yes. I think that even when the Master crashes, it can recover the in-memory state from the META table and retry restoring the snapshot.


I have also attached a v3 patch. In the previous patch, all region replica infos in the in-memory state were removed when restoring a snapshot.
However, I don't think that is correct, and in the v3 patch I think the region replica infos in the in-memory state are handled correctly.

Could you please review this patch? [~yuzhihong@gmail.com] [~ram_krish]



> restore_snapshot is broken in master branch when region splits
> --------------------------------------------------------------
>
>                 Key: HBASE-19893
>                 URL: https://issues.apache.org/jira/browse/HBASE-19893
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>            Reporter: Toshihiro Suzuki
>            Assignee: Toshihiro Suzuki
>            Priority: Critical
>         Attachments: HBASE-19893.master.001.patch, HBASE-19893.master.002.patch, HBASE-19893.master.003.patch
>
>
> While investigating HBASE-19850, I found that restore_snapshot didn't work in the master branch.
>  
> Steps to reproduce are as follows:
> 1. Create a table
> {code:java}
> create "test", "cf"
> {code}
> 2. Load data (2000 rows) to the table
> {code:java}
> (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}
> {code}
> 3. Split the table
> {code:java}
> split "test"
> {code}
> 4. Take a snapshot
> {code:java}
> snapshot "test", "snap"
> {code}
> 5. Load more data (2000 rows) to the table and split the table again
> {code:java}
> (2000...4000).each{|i| put "test", "row#{i}", "cf:col", "val"}
> split "test"
> {code}
> 6. Restore the table from the snapshot 
> {code:java}
> disable "test"
> restore_snapshot "snap"
> enable "test"
> {code}
> 7. Scan the table
> {code:java}
> scan "test"
> {code}
> However, this scan returns only 244 rows (it should return 2000 rows) like the following:
> {code:java}
> hbase(main):038:0> scan "test"
> ROW COLUMN+CELL
>  row78 column=cf:col, timestamp=1517298307049, value=val
> ....
>   row999 column=cf:col, timestamp=1517298307608, value=val
> 244 row(s)
> Took 0.1500 seconds
> {code}
>  
> Also, the restored table should have 2 online regions but it has 3 online regions.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)