Posted to hdfs-dev@hadoop.apache.org by "Bharat Viswanadham (JIRA)" <ji...@apache.org> on 2019/01/30 04:04:00 UTC

[jira] [Created] (HDDS-1031) Update ratis version to fix a DN restart Bug

Bharat Viswanadham created HDDS-1031:
----------------------------------------

             Summary: Update ratis version to fix a DN restart Bug
                 Key: HDDS-1031
                 URL: https://issues.apache.org/jira/browse/HDDS-1031
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Bharat Viswanadham


This is related to RATIS-460.

When a datanode is restarted after Ratis has taken a snapshot, we see the stack trace below and the DN does not boot up. For more information, refer to RATIS-460.

 
{code:java}
java.io.IOException: java.lang.IllegalStateException: lastEntry = 72856=72856: [77969640-aad9-4678-813b-8fb35bd5f568:172.27.37.0:9858, 7c6ae4fe-7db5-4e97-a407-0a9edff70c2c:172.27.35.192:9858, add14303-ecdf-4aed-84b7-abc3152177f6:172.27.37.128:9858], old=null, lastEntry.index >= logIndex = 0
        at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
        at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
        at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70)
        at org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:283)
        at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:295)
        at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:427)
        at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:149)
        at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:165)
        at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:334)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: lastEntry = 72856=72856: [77969640-aad9-4678-813b-8fb35bd5f568:172.27.37.0:9858, 7c6ae4fe-7db5-4e97-a407-0a9edff70c2c:172.27.35.192:9858, add14303-ecdf-4aed-84b7-abc3152177f6:172.27.37.128:9858], old=null, lastEntry.index >= logIndex = 0
        at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:72)
        at org.apache.ratis.server.impl.ConfigurationManager.addConfiguration(ConfigurationManager.java:54)
        at org.apache.ratis.server.impl.ServerState.setRaftConf(ServerState.java:352)
        at org.apache.ratis.server.impl.ServerState.setRaftConf(ServerState.java:347)
        at org.apache.ratis.server.storage.RaftLog.lambda$open$6(RaftLog.java:237)
        at org.apache.ratis.server.storage.LogSegment.lambda$loadSegment$0(LogSegment.java:140)
        at org.apache.ratis.server.storage.LogSegment.readSegmentFile(LogSegment.java:121)
        at org.apache.ratis.server.storage.LogSegment.loadSegment(LogSegment.java:137)
        at org.apache.ratis.server.storage.RaftLogCache.loadSegment(RaftLogCache.java:272)
        at org.apache.ratis.server.storage.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:159)
        at org.apache.ratis.server.storage.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:129)
        at org.apache.ratis.server.storage.RaftLog.open(RaftLog.java:233)
        at org.apache.ratis.server.impl.ServerState.initLog(ServerState.java:191)
        at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:114)
        at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:103)
        at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:207)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
        at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
2019-01-29 01:43:41,137 [main] ERROR      - Exception in HddsDatanodeService.
java.lang.NullPointerException
        at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.join(DatanodeStateMachine.java:363)
        at org.apache.hadoop.ozone.HddsDatanodeService.join(HddsDatanodeService.java:270)
        at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:127)
{code}
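
The exception message ("lastEntry.index >= logIndex = 0") suggests that the configuration history only accepts entries whose log index is strictly greater than the newest one it already holds. The following is a minimal sketch of that kind of check, not the Ratis implementation: the class ConfigurationHistorySketch, its methods, and the strings passed to it are hypothetical, and the assumption that the entry at index 72856 was already present before the log segment was replayed is only one reading of the trace. See RATIS-460 for the actual cause and fix.

```java
import java.util.TreeMap;

/** Hypothetical, simplified stand-in for a Raft configuration history; not the Ratis class. */
class ConfigurationHistorySketch {
    // Configurations keyed by the log index at which they were recorded.
    private final TreeMap<Long, String> byIndex = new TreeMap<>();

    /** Records a configuration; rejects any index that is not strictly greater than the newest one. */
    void add(long logIndex, String conf) {
        if (!byIndex.isEmpty() && byIndex.lastKey() >= logIndex) {
            throw new IllegalStateException(
                "lastEntry.index = " + byIndex.lastKey() + " >= logIndex = " + logIndex);
        }
        byIndex.put(logIndex, conf);
    }

    /** Returns the newest recorded index, or -1 if nothing has been recorded. */
    long lastIndex() {
        return byIndex.isEmpty() ? -1L : byIndex.lastKey();
    }

    public static void main(String[] args) {
        ConfigurationHistorySketch history = new ConfigurationHistorySketch();
        // Assumption: an entry at the index seen in the trace already exists at startup.
        history.add(72856L, "configuration already known at startup");
        try {
            // Replaying an older entry (index 0) then violates the ordering check.
            history.add(0L, "configuration read back from a log segment");
        } catch (IllegalStateException e) {
            System.out.println("startup aborted: " + e.getMessage());
        }
    }
}
```

In this sketch the second add() fails in the same way the trace reports for ConfigurationManager.addConfiguration, which is why the server never finishes opening its log.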
 


