Posted to issues@ratis.apache.org by "Shashikant Banerjee (Jira)" <ji...@apache.org> on 2021/12/03 05:22:00 UTC
[jira] [Resolved] (RATIS-1100) Make raft log gap error easier to troubleshoot
[ https://issues.apache.org/jira/browse/RATIS-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shashikant Banerjee resolved RATIS-1100.
----------------------------------------
Fix Version/s: 1.1.0
Resolution: Fixed
> Make raft log gap error easier to troubleshoot
> ----------------------------------------------
>
> Key: RATIS-1100
> URL: https://issues.apache.org/jira/browse/RATIS-1100
> Project: Ratis
> Issue Type: Improvement
> Components: server
> Affects Versions: 1.0.0
> Reporter: Wei-Chiu Chuang
> Assignee: Tsz-wo Sze
> Priority: Major
> Fix For: 1.1.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Upon restart, Ozone Manager failed to start and emitted the following error:
>
> {code:java}
> 2020-10-19 12:04:10,639 INFO org.apache.ratis.server.raftlog.segmented.LogSegment: Successfully read 7553 entries from segment file /var/lib/hadoop-ozone/fake_om/ratis/1b9ac7ae-cd52-3ab1-8089-942f8267f22a/current/log_25657965-25665517
> 2020-10-19 12:04:10,639 ERROR org.apache.hadoop.ozone.om.OzoneManagerStarter: OM start failed with exception
> java.io.IOException: java.lang.IllegalStateException
> at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
> at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
> at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70)
> at org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:289)
> at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:301)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.start(OzoneManagerRatisServer.java:367)
> at org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:1138)
> at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.start(OzoneManagerStarter.java:125)
> at org.apache.hadoop.ozone.om.OzoneManagerStarter.startOm(OzoneManagerStarter.java:79)
> at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:67)
> at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:38)
> at picocli.CommandLine.executeUserObject(CommandLine.java:1933)
> at picocli.CommandLine.access$1100(CommandLine.java:145)
> at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
> at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
> at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
> at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2152)
> at picocli.CommandLine.parseWithHandlers(CommandLine.java:2530)
> at picocli.CommandLine.parseWithHandler(CommandLine.java:2465)
> at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:96)
> at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:87)
> at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:51)
> Caused by: java.lang.IllegalStateException
> at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:36)
> at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.validateAdding(SegmentedRaftLogCache.java:400)
> at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.addSegment(SegmentedRaftLogCache.java:405)
> at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.loadSegment(SegmentedRaftLogCache.java:367)
> at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:249)
> at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:217)
> at org.apache.ratis.server.raftlog.RaftLog.open(RaftLog.java:276)
> at org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:191)
> at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:121)
> at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:123)
> at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:213){code}
>
> After reading the code and checking the Ratis log directory, I realized there is a gap in the Ratis log files (7659964 vs 25657965).
>
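The directory check described above can be done without reading the Ratis source. The following is a minimal sketch, assuming closed segment files are named log_<startIndex>-<endIndex> as in the log line above. SegmentGapFinder and findGaps are hypothetical names, not part of Ratis.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Ratis): finds index gaps between closed log segment files. */
final class SegmentGapFinder {
  // Assumed naming convention, taken from the log line in the report: log_<startIndex>-<endIndex>
  private static final Pattern CLOSED_SEGMENT = Pattern.compile("log_(\\d+)-(\\d+)");

  /** Returns one message per gap between consecutive segments, ordered by start index. */
  static List<String> findGaps(List<String> fileNames) {
    List<long[]> ranges = new ArrayList<>();
    for (String name : fileNames) {
      Matcher m = CLOSED_SEGMENT.matcher(name);
      if (m.matches()) {
        // Files that do not match the assumed pattern are ignored.
        ranges.add(new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))});
      }
    }
    ranges.sort(Comparator.comparingLong((long[] r) -> r[0]));

    List<String> gaps = new ArrayList<>();
    for (int i = 1; i < ranges.size(); i++) {
      long prevEnd = ranges.get(i - 1)[1];
      long nextStart = ranges.get(i)[0];
      // A contiguous log has nextStart == prevEnd + 1; anything larger is a gap.
      if (nextStart > prevEnd + 1) {
        gaps.add("gap between segments: previous end index = " + prevEnd
            + ", next start index = " + nextStart);
      }
    }
    return gaps;
  }
}
```

For example, with the made-up file names log_0-99 and log_200-299, findGaps returns a single message reporting previous end index 99 and next start index 200. With log_0-99 and log_100-199 it returns an empty list.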
> Filing this Jira to make the error message easier to understand without having to read the code.
>
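One way to make such a failure self-explanatory is to attach the offending indices to the exception instead of throwing a bare IllegalStateException. The following is a minimal sketch of that idea. It is not the actual RATIS-1100 patch, and the class, method, and parameter names are hypothetical.

```java
import java.util.function.Supplier;

/** Hypothetical illustration, not the Ratis implementation. */
final class LogGapCheck {
  /** Like a bare assertTrue, but builds a descriptive message lazily, only on failure. */
  static void assertTrue(boolean condition, Supplier<String> message) {
    if (!condition) {
      throw new IllegalStateException(message.get());
    }
  }

  /** Verifies that a segment starting at nextStart directly follows one ending at prevEnd. */
  static void validateAdding(long prevEnd, long nextStart, String segmentFile) {
    assertTrue(nextStart == prevEnd + 1,
        () -> "Gap in raft log: segment " + segmentFile + " starts at index " + nextStart
            + " but the previous segment ends at index " + prevEnd
            + " (expected start index " + (prevEnd + 1) + ")");
  }
}
```

Passing the message as a Supplier keeps the success path cheap, since the string is only built when the check fails. With a message like this, an operator can see which segment file and which indices are involved directly from the stack trace.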