Posted to issues@ratis.apache.org by "Tsz-wo Sze (Jira)" <ji...@apache.org> on 2021/12/01 07:08:00 UTC

[jira] [Assigned] (RATIS-1100) Make raft log gap error easier to troubleshoot

     [ https://issues.apache.org/jira/browse/RATIS-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz-wo Sze reassigned RATIS-1100:
---------------------------------

    Assignee: Tsz-wo Sze

> Make raft log gap error easier to troubleshoot
> ----------------------------------------------
>
>                 Key: RATIS-1100
>                 URL: https://issues.apache.org/jira/browse/RATIS-1100
>             Project: Ratis
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Wei-Chiu Chuang
>            Assignee: Tsz-wo Sze
>            Priority: Major
>
> Upon restart, Ozone Manager failed to start and emitted the following error:
>  
> {code:java}
> 2020-10-19 12:04:10,639 INFO org.apache.ratis.server.raftlog.segmented.LogSegment: Successfully read 7553 entries from segment file /var/lib/hadoop-ozone/fake_om/ratis/1b9ac7ae-cd52-3ab1-8089-942f8267f22a/current/log_25657965-25665517
> 2020-10-19 12:04:10,639 ERROR org.apache.hadoop.ozone.om.OzoneManagerStarter: OM start failed with exception
> java.io.IOException: java.lang.IllegalStateException
>  at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
>  at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
>  at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70)
>  at org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:289)
>  at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:301)
>  at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.start(OzoneManagerRatisServer.java:367)
>  at org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:1138)
>  at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.start(OzoneManagerStarter.java:125)
>  at org.apache.hadoop.ozone.om.OzoneManagerStarter.startOm(OzoneManagerStarter.java:79)
>  at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:67)
>  at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:38)
>  at picocli.CommandLine.executeUserObject(CommandLine.java:1933)
>  at picocli.CommandLine.access$1100(CommandLine.java:145)
>  at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
>  at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
>  at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
>  at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2152)
>  at picocli.CommandLine.parseWithHandlers(CommandLine.java:2530)
>  at picocli.CommandLine.parseWithHandler(CommandLine.java:2465)
>  at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:96)
>  at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:87)
>  at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:51)
> Caused by: java.lang.IllegalStateException
>  at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:36)
>  at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.validateAdding(SegmentedRaftLogCache.java:400)
>  at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.addSegment(SegmentedRaftLogCache.java:405)
>  at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.loadSegment(SegmentedRaftLogCache.java:367)
>  at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:249)
>  at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:217)
>  at org.apache.ratis.server.raftlog.RaftLog.open(RaftLog.java:276)
>  at org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:191)
>  at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:121)
>  at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:123)
>  at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:213){code}
>  
> After reading the code and checking the Ratis log directory, I found a gap between the Ratis log segment files (7659964 vs 25657965).
>  
> Filing this Jira to make the error message understandable without having to read the code.
>  
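A minimal sketch of the kind of check and message the report asks for, not the Ratis implementation. It assumes closed segment files are named log_<startIndex>-<endIndex>, as in the log line above. The class SegmentGapCheck, its nested Segment type, the method describeFirstGap, and the sample file names in main are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SegmentGapCheck {
  // Assumed naming of closed segment files: log_<startIndex>-<endIndex>
  private static final Pattern CLOSED_SEGMENT = Pattern.compile("log_(\\d+)-(\\d+)");

  /** Hypothetical value type for this sketch; not a Ratis class. */
  static final class Segment {
    final String name;
    final long start;
    final long end;

    Segment(String name, long start, long end) {
      this.name = name;
      this.start = start;
      this.end = end;
    }
  }

  /** Parses closed-segment file names and sorts them by start index; other names are skipped. */
  static List<Segment> parse(List<String> fileNames) {
    List<Segment> segments = new ArrayList<>();
    for (String name : fileNames) {
      Matcher m = CLOSED_SEGMENT.matcher(name);
      if (m.matches()) {
        segments.add(new Segment(name, Long.parseLong(m.group(1)), Long.parseLong(m.group(2))));
      }
    }
    segments.sort(Comparator.comparingLong((Segment s) -> s.start));
    return segments;
  }

  /** Returns a descriptive message for the first pair of non-contiguous segments, if any. */
  static Optional<String> describeFirstGap(List<String> fileNames) {
    List<Segment> segments = parse(fileNames);
    for (int i = 1; i < segments.size(); i++) {
      Segment prev = segments.get(i - 1);
      Segment next = segments.get(i);
      if (next.start != prev.end + 1) {
        return Optional.of(String.format(
            "Non-contiguous log segments: %s ends at index %d, but %s starts at index %d (expected %d)",
            prev.name, prev.end, next.name, next.start, prev.end + 1));
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Hypothetical file names; a real run would list the "current" directory instead.
    List<String> names = List.of("log_0-99", "log_500-599", "log_100-199");
    System.out.println(describeFirstGap(names).orElse("No gap found"));
  }
}
```

Putting both file names and both indices in the message lets an operator spot the missing range from the log alone, where the bare IllegalStateException above gives no such detail.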



--
This message was sent by Atlassian Jira
(v8.20.1#820001)