Posted to issues@hbase.apache.org by "Martin Braun (Jira)" <ji...@apache.org> on 2020/09/24 10:43:00 UTC

[jira] [Commented] (HBASE-22976) [HBCK2] Add RecoveredEditsPlayer

    [ https://issues.apache.org/jira/browse/HBASE-22976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201441#comment-17201441 ] 

Martin Braun commented on HBASE-22976:
--------------------------------------

Is there a workaround for reading in WAL files from recovered.edits? How can I achieve this?

I have an issue with HBase 2.2.5 (and Hadoop 2.8.5): after a full-disk event there are 38 inconsistencies. When I run

hbase --internal-classpath hbck

I get a bunch of these errors: 
 
ERROR: Orphan region in HDFS: Unable to load .regioninfo from table tt_ix_bizStep_inserting in hdfs dir hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8a1acb499bf454b072daeee5960daa73! It may be an invalid format or version file. Treating as an orphaned regiondir.
ERROR: Orphan region in HDFS: Unable to load .regioninfo from table tt_ix_bizStep_inserting in hdfs dir hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8f64025b68958ebddeb812297facdfc6! It may be an invalid format or version file. Treating as an orphaned regiondir.


When looking into these directories I see that there is indeed no .regioninfo file:

hdfs dfs -ls -R hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0

drwxr-xr-x - jenkins supergroup 0 2020-09-21 11:23 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits
-rw-r--r-- 3 jenkins supergroup 74133 2020-09-21 11:11 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285
-rw-r--r-- 3 jenkins supergroup 74413 2020-09-16 19:03 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000286
-rw-r--r-- 3 jenkins supergroup 74693 2020-09-16 19:05 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000287
-rw-r--r-- 3 jenkins supergroup 79427 2020-09-16 18:27 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000305

 

So I now have a bunch of recovered.edits WAL files that I would like to replay, but how?

The WALPlayer is not able to replay recovered.edits files. The source code at http://hbase.apache.org/2.2/devapidocs/src-html/org/apache/hadoop/hbase/mapreduce/WALInputFormat.html

seems to expect a timestamp encoded in the filename (it parses a file start time from the name and compares it against the end time):

  long fileStartTime = Long.parseLong(name.substring(idx+1));
  if (fileStartTime <= endTime) {
    LOG.info("Found: " + file);
    result.add(file);
  }
} catch (NumberFormatException x) {
  idx = 0;

But the files in recovered.edits are named differently (just numbers, like 00000000000000195).
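
To make the mismatch concrete, here is a minimal standalone sketch of the parsing step in the excerpt above. It is not the HBase implementation: the class and method names are made up for illustration, and it assumes that idx is the position of the last '.' in the path string, which the excerpt does not show. The WAL-style name is a hypothetical example ending in ".<timestamp>"; the recovered.edits path is taken from the listing above.

```java
import java.util.OptionalLong;

public class WalNameCheck {
    // Mirrors the excerpt: parse whatever follows the last '.' as a long.
    // Assumption: idx is the index of the last '.' in the name.
    public static OptionalLong trailingTimestamp(String name) {
        int idx = name.lastIndexOf('.');
        if (idx <= 0) {
            // No '.' found (or name starts with one): nothing to parse.
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(name.substring(idx + 1)));
        } catch (NumberFormatException x) {
            // The suffix after the last '.' is not a number.
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        // Hypothetical WAL-style path whose file name ends in ".<timestamp>".
        String wal = "hdfs://localhost:9000/hbase/WALs/example-host/example-wal.1600253000000";
        // recovered.edits path from the directory listing.
        String edits = "hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/"
                + "ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285";

        System.out.println("WAL-style name:  " + trailingTimestamp(wal));
        System.out.println("recovered.edits: " + trailingTimestamp(edits));
    }
}
```

Under these assumptions the WAL-style name yields a timestamp, while the recovered.edits path yields nothing: the last '.' in the full path is the one inside "recovered.edits", so the suffix handed to Long.parseLong is "edits/0000000000000000285". A bare file name such as 0000000000000000285 contains no '.' at all and is rejected as well.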

Would renaming the files help? If so, which timestamp should I use?

 

 

> [HBCK2] Add RecoveredEditsPlayer
> --------------------------------
>
>                 Key: HBASE-22976
>                 URL: https://issues.apache.org/jira/browse/HBASE-22976
>             Project: HBase
>          Issue Type: Sub-task
>          Components: hbck2
>            Reporter: Michael Stack
>            Priority: Major
>
> We need a recovered edits player. Messing w/ the 'adoption service' -- tooling to adopt orphan regions and hfiles -- I've been manufacturing damaged clusters by moving stuff around under the running cluster. No reason to think that an hbase couldn't lose accounting of a whole region if a cataclysm. If so, region will have stuff like the '.regioninfo', dirs per column family w/ store files but it could too have a 'recovered_edits' directory with content in it. We have a WALPlayer for errant WALs. We have the FSHLog tool which can read recovered_edits content for debugging data loss. Missing is a RecoveredEditsPlayer.
> I took a look at extending the WALPlayer since it has a bunch of nice options and it can run at bulk. Ideally, it would just digest recovered edits content if passed an option or recovered edits directories. On first glance, it didn't seem like an easy integration.... Would be worth taking a look again. Would be good if we could avoid making a new, distinct tool, just for Recovered Edits.
> The bulkload tool expects hfiles in column family directories. Recovered edits files are not hfiles and the files are x-columnfamily so this is not the way to go though a bulkload-like tool that moved the recovered edits files under the appropriate region dir and asked the region reopen would be a possibility (Would need the bulk load complete trick of splitting input if the region boundaries in the live cluster do not align w/ those of the errant recovered edits files).


