You are viewing a plain text version of this content. The canonical link for it is here.
Posted to hdfs-issues@hadoop.apache.org by "Erik Krogen (Jira)" <ji...@apache.org> on 2022/12/21 18:08:00 UTC
[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324
[ https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17650959#comment-17650959 ]
Erik Krogen commented on HDFS-16853:
------------------------------------
[~omalley] any thoughts on the PR?
> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324
> -----------------------------------------------------------------------------------------------
>
> Key: HDFS-16853
> URL: https://issues.apache.org/jira/browse/HDFS-16853
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: pull-request-available
>
> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed with error message: Waiting for cluster to become active. And the blocking jstack as bellows:
> {code:java}
> "BP-1618793397-192.168.3.4-1669198559828 heartbeating to localhost/127.0.0.1:54673" #260 daemon prio=5 os_prio=31 tid=0x
> 00007fc1108fa000 nid=0x19303 waiting on condition [0x0000700017884000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000007430a9ec0> (a java.util.concurrent.SynchronousQueue$TransferQueue)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(SynchronousQueue.java:762)
> at java.util.concurrent.SynchronousQueue$TransferQueue.transfer(SynchronousQueue.java:695)
> at java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:877)
> at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1186)
> at org.apache.hadoop.ipc.Client.call(Client.java:1482)
> at org.apache.hadoop.ipc.Client.call(Client.java:1429)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
> at com.sun.proxy.$Proxy23.sendHeartbeat(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClient
> SideTranslatorPB.java:168)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:570)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:714)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:915)
> at java.lang.Thread.run(Thread.java:748) {code}
> After looking into the code and found that this bug is imported by HADOOP-18324. Because RpcRequestSender exited without cleaning up the rpcRequestQueue, then caused BPServiceActor was blocked in sending request.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org