Posted to issues@hbase.apache.org by "Toshihiro Suzuki (JIRA)" <ji...@apache.org> on 2018/03/04 06:11:00 UTC

[jira] [Comment Edited] (HBASE-20006) TestRestoreSnapshotFromClientWithRegionReplicas is flakey

    [ https://issues.apache.org/jira/browse/HBASE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384979#comment-16384979 ] 

Toshihiro Suzuki edited comment on HBASE-20006 at 3/4/18 6:10 AM:
------------------------------------------------------------------

I attached the v1 patch.

The problem seems to occur when a snapshot is taken of a table in which some regions have parent reference files: when a replica region is then opened, HFileLink references aren't handled correctly.

I added handling for this case in the v1 patch.


was (Author: brfrn169):
I attached the v1 patch.

It seems like the problem occurs when taking a snapshot for a table some of whose regions has parent reference files, and when opening a replica region, a HFileLink References aren't handled correctly.

I added the handle in the v1 patch.

> TestRestoreSnapshotFromClientWithRegionReplicas is flakey
> ---------------------------------------------------------
>
>                 Key: HBASE-20006
>                 URL: https://issues.apache.org/jira/browse/HBASE-20006
>             Project: HBase
>          Issue Type: Bug
>          Components: read replicas
>            Reporter: stack
>            Assignee: Toshihiro Suzuki
>            Priority: Critical
>         Attachments: HBASE-20006.branch-2.001.patch, HBASE-20006.master.001.patch
>
>
> Fails about 10% of the time. Interestingly, it is the sequence below that causes the failure: we go to split a region, but it is already split. The split then fails with an internal assert, which messes up procedures; at a minimum we should just not split (this is in the prepare stage).
> {code}
> 2018-02-15 23:21:42,162 INFO  [PEWorker-12] procedure.MasterProcedureScheduler(571): pid=105, state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure table=testOnlineSnapshotAfterSplittingRegions-1518736887838, parent=3f850cea7d71a7ebd019f2f009efca4d, daughterA=06b5e6366efbef155d70e56cfdf58dc9, daughterB=8c175de1b33765a5683ac1e502edb0bd, table=testOnlineSnapshotAfterSplittingRegions-1518736887838, testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d.
> 2018-02-15 23:21:42,162 INFO  [PEWorker-12] assignment.SplitTableRegionProcedure(440): Split of {ENCODED => 3f850cea7d71a7ebd019f2f009efca4d, NAME => 'testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d.', STARTKEY => '', ENDKEY => '1'} skipped; state is already SPLIT
> 2018-02-15 23:21:42,163 ERROR [PEWorker-12] procedure2.ProcedureExecutor(1480): CODE-BUG: Uncaught runtime exception: pid=105, state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure table=testOnlineSnapshotAfterSplittingRegions-1518736887838, parent=3f850cea7d71a7ebd019f2f009efca4d, daughterA=06b5e6366efbef155d70e56cfdf58dc9, daughterB=8c175de1b33765a5683ac1e502edb0bd
> java.lang.AssertionError: split region should have an exception here
>   at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:228)
>   at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:89)
>   at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:180)
>   at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1455)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1224)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
> {code}
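
To illustrate the "at a minimum we should just not split" suggestion in the description above, here is a minimal sketch of a prepare step that reports a skip instead of tripping an assertion when the parent region is already split. All names here (`SplitPrepareSketch`, `RegionState`, `PrepareResult`, `prepareSplit`) are hypothetical and chosen for illustration; this is not the real SplitTableRegionProcedure code and not what the attached patch does.

```java
import java.util.EnumSet;

/** Hypothetical, simplified model of a split "prepare" step; not HBase code. */
public class SplitPrepareSketch {

    /** Hypothetical region states; the real set of states in HBase is larger. */
    public enum RegionState { OPEN, SPLITTING, SPLIT, CLOSED }

    /** Outcome of the prepare step in this sketch. */
    public enum PrepareResult { PROCEED, SKIPPED }

    /** Assumption for this sketch: only an OPEN region may start a split. */
    private static final EnumSet<RegionState> SPLITTABLE = EnumSet.of(RegionState.OPEN);

    /**
     * Decides whether the split should go ahead. A parent that is not in a
     * splittable state (for example, already SPLIT) yields SKIPPED, so the
     * caller can end the procedure cleanly rather than hit an assertion.
     */
    public static PrepareResult prepareSplit(RegionState parentState) {
        if (!SPLITTABLE.contains(parentState)) {
            return PrepareResult.SKIPPED;
        }
        return PrepareResult.PROCEED;
    }

    public static void main(String[] args) {
        // An open parent region may be split.
        System.out.println(prepareSplit(RegionState.OPEN));
        // A parent that is already split is skipped without throwing.
        System.out.println(prepareSplit(RegionState.SPLIT));
    }
}
```

The design choice in this sketch is to return a value for the "already split" case so that the caller can treat it as a normal, non-error outcome; how the real procedure framework should signal that is left open here.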


