Posted to issues@hbase.apache.org by "Yulin Niu (Jira)" <ji...@apache.org> on 2021/11/27 12:52:00 UTC
[jira] [Updated] (HBASE-25053) WAL replay should ignore 0-length files
[ https://issues.apache.org/jira/browse/HBASE-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yulin Niu updated HBASE-25053:
------------------------------
Fix Version/s: 2.3.8
> WAL replay should ignore 0-length files
> ---------------------------------------
>
> Key: HBASE-25053
> URL: https://issues.apache.org/jira/browse/HBASE-25053
> Project: HBase
> Issue Type: Bug
> Components: master, regionserver
> Affects Versions: 2.3.1
> Reporter: Nick Dimiduk
> Assignee: Yulin Niu
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.3.8
>
>
> I overdrove a small testing cluster, filling HDFS. After cleaning up data to bring HBase back up, I noticed that all masters abort. The logs complain of seeking past EOF; indeed, the last WAL file name logged is a 0-length file. WAL replay should gracefully skip and clean up such an empty file.
> {noformat}
> 2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed to become active master
> java.io.EOFException: Cannot seek after EOF
> at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
> at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
> at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
> at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
> at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
> at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
> at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
> at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
> at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
> at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
> at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
> at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
> at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
> at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
> at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
> at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
> at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
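The requested behavior (skip a 0-length file before opening a reader on it, then clean it up) can be sketched roughly as follows. This is a minimal illustration only, not the committed HBase change: it uses java.nio.file on a local filesystem instead of the Hadoop FileSystem API, and the class name EmptyEditsFilter and method skipEmptyFiles are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: filters out zero-length
// edit files before any reader is opened on them.
public final class EmptyEditsFilter {
    private EmptyEditsFilter() {}

    /**
     * Returns the files that are worth replaying. Zero-length files are
     * deleted and left out, so a reader never tries to seek in them.
     */
    public static List<Path> skipEmptyFiles(List<Path> candidates) throws IOException {
        List<Path> replayable = new ArrayList<>();
        for (Path file : candidates) {
            if (Files.size(file) == 0) {
                // Nothing to replay: clean up the empty file and move on.
                Files.deleteIfExists(file);
                continue;
            }
            replayable.add(file);
        }
        return replayable;
    }
}
```

In this sketch the length check happens before any reader is created, which is the point at which the trace above fails with "Cannot seek after EOF".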