You are viewing a plain text version of this content. The canonical link for it is here.
Posted to hdfs-dev@hadoop.apache.org by "Arpit Agarwal (JIRA)" <ji...@apache.org> on 2018/10/25 17:52:00 UTC

[jira] [Resolved] (HDDS-543) datanode service start throwing NullPointerException

     [ https://issues.apache.org/jira/browse/HDDS-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal resolved HDDS-543.
--------------------------------
    Resolution: Done

This should be fixed as part of the restart fixes per [~msingh].

> datanode service start throwing NullPointerException
> ----------------------------------------------------
>
>                 Key: HDDS-543
>                 URL: https://issues.apache.org/jira/browse/HDDS-543
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Nilotpal Nandi
>            Assignee: Mukul Kumar Singh
>            Priority: Critical
>         Attachments: all-node-ozone-logs-1537784799.tar.gz
>
>
> steps taken :
> -------------------
>  # Stopped one of the nodes in 6 node cluster  while write operation was running.
>  # Tried to start datanode after sometime. Datanode failed to start
>  
> ozone version :
> ---------------------
>  
> {noformat}
> Source code repository git@github.com:apache/hadoop.git -r 968082ffa5d9e50ed8538f653c375edd1b8feea5
> Compiled by elek on 2018-09-19T20:57Z
> Compiled with protoc 2.5.0
> From source with checksum efbdeabb5670d69d9efde85846e4ee98
> Using HDDS 0.2.1-alpha
> Source code repository git@github.com:apache/hadoop.git -r 968082ffa5d9e50ed8538f653c375edd1b8feea5
> Compiled by elek on 2018-09-19T20:56Z
> Compiled with protoc 2.5.0
> From source with checksum 8bf78cff4b73c95d486da5b21053ef
>  
> {noformat}
>  
> console log :
> --------------------
>  
> {noformat}
> 2018-09-24 09:53:54,913 INFO impl.RaftServerImpl: 9348f4cb-0bd3-41e7-b1c6-c22463a20c09: set configuration 0: [0ba4abe0-c8d1-4d6c-8d28-ebfa08fb84c9:172.27.18.18:9858, 3d6c8b2c-f1d2-4425-a577-b9ef5dc7c634:172.27.80.23:9858, 9348f4cb-0bd3-41e7-b1c6-c22463a20c09:172.27.23.161:9858], old=null at 0
> Sep 24, 2018 9:53:55 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
> INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
> 2018-09-24 09:53:55,968 [main] INFO - Listening HDDS REST traffic on /0.0.0.0:9880
> 2018-09-24 09:53:55,969 [main] INFO - Started plug-in org.apache.hadoop.ozone.web.OzoneHddsDatanodeService@2fa3be26
> 2018-09-24 09:53:55,975 [Datanode State Machine Thread - 0] INFO - Attempting to start container services.
> 2018-09-24 09:53:56,171 [Datanode State Machine Thread - 0] INFO - Starting XceiverServerRatis 9348f4cb-0bd3-41e7-b1c6-c22463a20c09 at port 9858
> 2018-09-24 09:53:56,193 [Datanode State Machine Thread - 0] ERROR - Unable to start the DatanodeState Machine
> java.io.IOException: Premature EOF from inputStream
>  at org.apache.ratis.util.IOUtils.readFully(IOUtils.java:100)
>  at org.apache.ratis.server.storage.LogReader.decodeEntry(LogReader.java:250)
>  at org.apache.ratis.server.storage.LogReader.readEntry(LogReader.java:155)
>  at org.apache.ratis.server.storage.LogInputStream.nextEntry(LogInputStream.java:128)
>  at org.apache.ratis.server.storage.LogSegment.readSegmentFile(LogSegment.java:110)
>  at org.apache.ratis.server.storage.LogSegment.loadSegment(LogSegment.java:132)
>  at org.apache.ratis.server.storage.RaftLogCache.loadSegment(RaftLogCache.java:110)
>  at org.apache.ratis.server.storage.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:155)
>  at org.apache.ratis.server.storage.SegmentedRaftLog.open(SegmentedRaftLog.java:123)
>  at org.apache.ratis.server.impl.ServerState.initLog(ServerState.java:162)
>  at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:110)
>  at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:106)
>  at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$0(RaftServerProxy.java:191)
>  at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>  at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
>  at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>  at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>  at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>  at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> 2018-09-24 09:53:56,195 [main] ERROR - Exception in HddsDatanodeService.
> java.lang.NullPointerException
>  at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.join(DatanodeStateMachine.java:332)
>  at org.apache.hadoop.ozone.HddsDatanodeService.join(HddsDatanodeService.java:191)
>  at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:250)
> 2018-09-24 09:53:56,198 INFO util.ExitUtil: Exiting with status 1: java.lang.NullPointerException
> 2018-09-24 09:53:56,201 WARN fs.CachingGetSpaceUsed: Thread Interrupted waiting to refresh disk information: sleep interrupted
> 2018-09-24 09:53:56,201 [shutdown-hook-0] INFO - SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down HddsDatanodeService at ctr-e138-1518143905142-468367-01-000003.hwx.site/172.27.23.161
>  
> {noformat}
>  
> datanode.log
> ----------------------
> {noformat}
> 2018-09-24 09:42:41,305 ERROR org.apache.ratis.server.impl.RaftServerImpl: 9348f4cb-0bd3-41e7-b1c6-c22463a20c09: applyTransaction failed for index:2 proto:(t:2, i:2)SMLOGENTRY, client-DBCE827E62E6, cid=1
> 2018-09-24 09:42:41,307 ERROR org.apache.ratis.server.impl.StateMachineUpdater: Terminating with exit status 2: StateMachineUpdater-9348f4cb-0bd3-41e7-b1c6-c22463a20c09: the StateMachineUpdater hits Throwable
> java.lang.NullPointerException
>  at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine$StateMachineHelper.handleChunkCommit(ContainerStateMachine.java:561)
>  at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine$StateMachineHelper.executeContainerCommand(ContainerStateMachine.java:615)
>  at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:394)
>  at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1150)
>  at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:148)
>  at java.lang.Thread.run(Thread.java:745)
> 2018-09-24 09:42:41,312 WARN org.apache.hadoop.fs.CachingGetSpaceUsed: Thread Interrupted waiting to refresh disk information: sleep interrupted
> 2018-09-24 09:53:38,901 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
> 2018-09-24 09:53:39,013 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
> 2018-09-24 09:53:39,013 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: HddsDatanode metrics system started
> 2018-09-24 09:53:39,056 INFO org.apache.hadoop.conf.Configuration.deprecation: No unit for ozone.scm.heartbeat.rpc-timeout(1000) assuming MILLISECONDS
> 2018-09-24 09:53:39,069 INFO org.apache.hadoop.conf.Configuration.deprecation: No unit for hdds.write.lock.reporting.threshold.ms(5000) assuming MILLISECONDS
> 2018-09-24 09:53:39,069 INFO org.apache.hadoop.conf.Configuration.deprecation: No unit for hdds.lock.suppress.warning.interval.ms(10000) assuming MILLISECONDS
> 2018-09-24 09:53:39,751 WARN org.apache.hadoop.hdds.scm.HddsServerUtil: Storage directory for Ratis is not configured.Mapping Ratis storage under /data/disk1/ozone/meta/ratis. It is a good idea to map this to an SSD disk.
> 2018-09-24 09:53:39,783 INFO org.apache.ratis.server.impl.RaftServerProxy: raft.rpc.type = GRPC (default)
> 2018-09-24 09:53:39,793 INFO org.apache.ratis.grpc.GrpcConfigKeys$Server: raft.grpc.server.port = 9858 (custom)
> 2018-09-24 09:53:39,793 INFO org.apache.ratis.grpc.RaftGRpcService: raft.grpc.message.size.max = 167772160 (custom)
> 2018-09-24 09:53:39,795 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.appender.buffer.capacity = 134217728 (custom)
> 2018-09-24 09:53:39,795 INFO org.apache.ratis.grpc.RaftGRpcService: raft.grpc.flow.control.window = 1MB (=1048576) (default)
> 2018-09-24 09:53:39,796 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.rpc.request.timeout = 3000 ms (custom)
> 2018-09-24 09:53:39,835 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.storage.dir = /data/disk1/ozone/meta/ratis (custom)
> 2018-09-24 09:53:39,836 INFO org.apache.ratis.server.impl.RaftServerProxy: 9348f4cb-0bd3-41e7-b1c6-c22463a20c09: found a subdirectory /data/disk1/ozone/meta/ratis/6ff6479b-11f4-456f-b46e-85804b25606d
> 2018-09-24 09:53:39,838 INFO org.apache.ratis.server.impl.RaftServerProxy: 9348f4cb-0bd3-41e7-b1c6-c22463a20c09: addNew group-85804B25606D:[] returns group-85804B25606D:java.util.concurrent.CompletableFuture@74e28667[Not completed]
> 2018-09-24 09:53:39,850 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.rpc.timeout.min = 1 s (custom)
> 2018-09-24 09:53:39,850 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.rpc.timeout.max = 1200 ms (custom)
> 2018-09-24 09:53:39,850 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.rpcslowness.timeout = 120 s (custom)
> 2018-09-24 09:53:39,868 INFO org.apache.ratis.server.impl.RaftServerImpl: 9348f4cb-0bd3-41e7-b1c6-c22463a20c09: ConfigurationManager, init=-1: [], old=null, confs=<EMPTY_MAP>
> 2018-09-24 09:53:39,868 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.storage.dir = /data/disk1/ozone/meta/ratis (custom)
> 2018-09-24 09:53:39,873 INFO org.apache.ratis.server.storage.RaftStorageDirectory: Lock on /data/disk1/ozone/meta/ratis/6ff6479b-11f4-456f-b46e-85804b25606d/in_use.lock acquired by nodename 3792419@ctr-e138-1518143905142-468367-01-000003.hwx.site
> 2018-09-24 09:53:39,878 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.leader.election.timeout = 120 s (custom)
> 2018-09-24 09:53:39,881 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.use.memory = false (default)
> 2018-09-24 09:53:39,882 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.appender.buffer.capacity = 134217728 (custom)
> 2018-09-24 09:53:39,883 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.segment.size.max = 1073741824 (custom)
> 2018-09-24 09:53:39,886 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.segment.cache.num.max = 2 (custom)
> 2018-09-24 09:53:39,889 INFO org.apache.ratis.server.storage.RaftLogWorker: new 9348f4cb-0bd3-41e7-b1c6-c22463a20c09-RaftLogWorker for Storage Directory /data/disk1/ozone/meta/ratis/6ff6479b-11f4-456f-b46e-85804b25606d
> 2018-09-24 09:53:39,889 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.segment.size.max = 1073741824 (custom)
> 2018-09-24 09:53:39,889 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.preallocated.size = 134217728 (custom)
> 2018-09-24 09:53:39,890 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.write.buffer.size = 33554432 (custom)
> 2018-09-24 09:53:39,890 INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.log.force.sync.num = 128 (default)
> 2018-09-24 09:53:39,925 INFO org.apache.ratis.server.impl.RaftServerImpl: 9348f4cb-0bd3-41e7-b1c6-c22463a20c09: set configuration 0: [0ba4abe0-c8d1-4d6c-8d28-ebfa08fb84c9:172.27.18.18:9858, 3d6c8b2c-f1d2-4425-a577-b9ef5dc7c634:172.27.80.23:9858, 9348f4cb-0bd3-41e7-b1c6-c22463a20c09:172.27.23.161:9858], old=null at 0
> 2018-09-24 09:53:41,058 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.NullPointerException
> 2018-09-24 09:53:41,065 WARN org.apache.hadoop.fs.CachingGetSpaceUsed: Thread Interrupted waiting to refresh disk information: sleep interrupted{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org