Posted to issues@hbase.apache.org by "Sergey Soldatov (Jira)" <ji...@apache.org> on 2022/04/22 23:15:00 UTC

[jira] [Commented] (HBASE-26972) Restored table from snapshot that has MOB is inconsistent

    [ https://issues.apache.org/jira/browse/HBASE-26972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526717#comment-17526717 ] 

Sergey Soldatov commented on HBASE-26972:
-----------------------------------------

The reason for this is the following file:
/hbase2.4/mobdir/data/default/table_1/1bccf339572b9a4db7475abcf57eeb8f/data/table_1=1bccf339572b9a4db7475abcf57eeb8f-bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a
which is a link to 
/hbase2.4/archive/data/default/table_1/1bccf339572b9a4db7475abcf57eeb8f/data/bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a
But HFileLink is unable to detect that this is a link, because the underscore in the middle of the file name doesn't match the pattern:
{code}
public static final String HFILE_NAME_REGEX = "[0-9a-f]+(?:(?:_SeqId_[0-9]+_)|(?:_del))?";
{code}
So isHFileLink() returns false for such files, which leads to unpredictable results.
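For reference, here is a minimal standalone check of that regex (the class name and the plain hex sample name are made up for illustration): the hfile part of the MOB link name above fails to match, while an ordinary hex name matches.
{code}
import java.util.regex.Pattern;

public class HFileNameRegexCheck {
  // Same regex as HFILE_NAME_REGEX quoted above, anchored for a full-name match.
  static final Pattern HFILE_NAME =
      Pattern.compile("^[0-9a-f]+(?:(?:_SeqId_[0-9]+_)|(?:_del))?$");

  public static void main(String[] args) {
    // A plain hex hfile name (made-up example): matches.
    System.out.println(HFILE_NAME.matcher(
        "bee397acc400449ea3a35ed3fc87fea1").matches());        // true

    // The MOB file name referenced by the link above: the trailing
    // "_49a15ec2a84c8489965d1910a05cca3a" suffix is not accepted by the
    // regex, so the name is not recognized.
    System.out.println(HFILE_NAME.matcher(
        "bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613"
            + "_49a15ec2a84c8489965d1910a05cca3a").matches()); // false
  }
}
{code}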
An easy way to fix it is to allow an underscore in HFILE_NAME_REGEX, but I feel a bit uncomfortable about that. I would really appreciate suggestions.

> Restored table from snapshot that has MOB is inconsistent
> ---------------------------------------------------------
>
>                 Key: HBASE-26972
>                 URL: https://issues.apache.org/jira/browse/HBASE-26972
>             Project: HBase
>          Issue Type: Bug
>          Components: mob, snapshots
>    Affects Versions: 3.0.0-alpha-2
>            Reporter: Sergey Soldatov
>            Assignee: Sergey Soldatov
>            Priority: Major
>
> When we restore a table from a snapshot and it has MOB files, there are links that do not match the HFileLink pattern. I'm not sure what side effects that might lead to, but at least it's not possible to create a snapshot right after the restore:
> {quote}
> Version 3.0.0-alpha-3-SNAPSHOT, rcd45cadbc1a42db359ff4e775cbd4b55cfe28140, Fri Apr 22 03:04:25 PM PDT 2022
> Took 0.0016 seconds                                                                                                                                           
> hbase:001:0> list_snapshot
> list_snapshot_sizes   list_snapshots        
> hbase:001:0> list_snapshots
> SNAPSHOT                                 TABLE + CREATION TIME                                                                                                
>  t1                                      table_1 (2022-04-22 15:48:04 -0700)                                                                                  
> 1 row(s)
> Took 1.0881 seconds                                                                                                                                           
> => ["t1"]
> hbase:002:0> restore_snapshot 't1'
> Took 2.3942 seconds                                                                                                                                           
> hbase:003:0> snapshot
> snapshot                   snapshot_cleanup_enabled   snapshot_cleanup_switch    
> hbase:003:0> snapshot 'table_1', 't2'
> ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { ss=t2 table=table_1 type=FLUSH ttl=0 } had an error.  Procedure t2 { waiting=[] done=[] }
> 	at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:403)
> 	at org.apache.hadoop.hbase.master.MasterRpcServices.isSnapshotDone(MasterRpcServices.java:1325)
> 	at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> 	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:106)
> 	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:86)
> Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException via Failed taking snapshot { ss=t2 table=table_1 type=FLUSH ttl=0 } due to exception:Can't find hfile: table_1=1bccf339572b9a4db7475abcf57eeb8f-bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a in the real (hdfs://localhost:8020/hbase2.4/mobdir/data/table_1/1bccf339572b9a4db7475abcf57eeb8f-table_1/1bccf339572b9a4db7475abcf57eeb8f/data/bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a) or archive (hdfs://localhost:8020/hbase2.4/archive/data/table_1/1bccf339572b9a4db7475abcf57eeb8f-table_1/1bccf339572b9a4db7475abcf57eeb8f/data/bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a) directory for the primary table.:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Can't find hfile: table_1=1bccf339572b9a4db7475abcf57eeb8f-bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a in the real (hdfs://localhost:8020/hbase2.4/mobdir/data/table_1/1bccf339572b9a4db7475abcf57eeb8f-table_1/1bccf339572b9a4db7475abcf57eeb8f/data/bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a) or archive (hdfs://localhost:8020/hbase2.4/archive/data/table_1/1bccf339572b9a4db7475abcf57eeb8f-table_1/1bccf339572b9a4db7475abcf57eeb8f/data/bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a) directory for the primary table.
> 	at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:82)
> 	at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:322)
> 	at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:392)
> 	... 6 more
> Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Can't find hfile: table_1=1bccf339572b9a4db7475abcf57eeb8f-bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a in the real (hdfs://localhost:8020/hbase2.4/mobdir/data/table_1/1bccf339572b9a4db7475abcf57eeb8f-table_1/1bccf339572b9a4db7475abcf57eeb8f/data/bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a) or archive (hdfs://localhost:8020/hbase2.4/archive/data/table_1/1bccf339572b9a4db7475abcf57eeb8f-table_1/1bccf339572b9a4db7475abcf57eeb8f/data/bee397acc400449ea3a35ed3fc87fea1202204220b9b3b97b4fc42379a7b6455c3dc1613_49a15ec2a84c8489965d1910a05cca3a) directory for the primary table.
> 	at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.concurrentVisitReferencedFiles(SnapshotReferenceUtil.java:232)
> 	at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.concurrentVisitReferencedFiles(SnapshotReferenceUtil.java:195)
> 	at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.verifySnapshot(SnapshotReferenceUtil.java:172)
> 	at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:204)
> 	at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:117)
> 	at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:220)
> 	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:106)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> For usage try 'help "snapshot"'
> Took 1.7477 seconds                                                                                                                                           
> hbase:004:0> 
> {quote}


