Posted to common-issues@hadoop.apache.org by "Jianfei Jiang (JIRA)" <ji...@apache.org> on 2017/12/11 11:46:00 UTC

[jira] [Comment Edited] (HADOOP-15108) Testcase TestBalancer#testBalancerWithPinnedBlocks always fails

    [ https://issues.apache.org/jira/browse/HADOOP-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285797#comment-16285797 ] 

Jianfei Jiang edited comment on HADOOP-15108 at 12/11/17 11:45 AM:
-------------------------------------------------------------------

Based on debugging and the error log below, the problem appears to be in the following code. Both favoredNodes point to the local host, and the file /tmp.txt seems to hit a lease conflict.

    DFSTestUtil.createFile(cluster.getFileSystem(0), filePath, false, 1024,
        totalUsedSpace / numOfDatanodes, DEFAULT_BLOCK_SIZE,
        (short) numOfDatanodes, 0, false, favoredNodes);

When I reduce the cluster from two datanodes to one, as shown in my patch, the testcase runs successfully: only one favoredNode remains, so there is no conflict. In my opinion, the testcase still achieves its goal when it starts with only one node, but I am not certain about that.
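
To make the "both favoredNodes point to the local host" observation concrete, here is a minimal sketch. It does not use Hadoop and does not reproduce the failure. It only shows that two socket addresses can be distinct entries in a favoredNodes-style array while sharing a single host. The class FavoredNodeHosts, its methods, and the port numbers are hypothetical, chosen for illustration, and are not taken from the test or the patch.

```java
import java.net.InetSocketAddress;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: inspects a favoredNodes-style array.
class FavoredNodeHosts {

  // Returns the distinct host strings among the given addresses.
  static Set<String> distinctHosts(InetSocketAddress[] nodes) {
    Set<String> hosts = new LinkedHashSet<>();
    for (InetSocketAddress node : nodes) {
      // getHostString() does not trigger a reverse DNS lookup.
      hosts.add(node.getHostString());
    }
    return hosts;
  }

  // True when more than one favored node is given and they all share one host.
  static boolean allOnSameHost(InetSocketAddress[] nodes) {
    return nodes.length > 1 && distinctHosts(nodes).size() == 1;
  }

  public static void main(String[] args) {
    // Assumed ports, for illustration only.
    InetSocketAddress[] favoredNodes = {
        InetSocketAddress.createUnresolved("127.0.0.1", 50010),
        InetSocketAddress.createUnresolved("127.0.0.1", 50011)
    };
    System.out.println("favored nodes: " + favoredNodes.length);
    System.out.println("distinct hosts: " + distinctHosts(favoredNodes));
    System.out.println("all on same host: " + allOnSameHost(favoredNodes));
  }
}
```

A check like this could be dropped into a debugging session to confirm whether the favoredNodes passed to DFSTestUtil.createFile really collapse to one host. Whether that is the actual cause of the hang is still only a hypothesis.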

The following is the error log:

2017-12-11 18:45:54,063 [PacketResponder: BP-197616310-127.0.1.1-1512989063241:blk_1073741827_1003, type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[127.0.0.1:37715]] INFO  datanode.DataNode (BlockReceiver.java:run(1497)) - PacketResponder: BP-197616310-127.0.1.1-1512989063241:blk_1073741827_1003, type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[127.0.0.1:37715] terminating
2017-12-11 18:46:02,292 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1957)) - Shutting down the Mini HDFS Cluster
2017-12-11 18:46:02,293 [DataStreamer for file /tmp.txt] WARN  hdfs.DataStreamer (DataStreamer.java:run(843)) - DataStreamer Exception
java.io.InterruptedIOException: Call interrupted
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1484)
	at org.apache.hadoop.ipc.Client.call(Client.java:1436)
	at org.apache.hadoop.ipc.Client.call(Client.java:1346)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy25.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:495)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy26.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1031)
	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1882)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1685)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:733)
2017-12-11 18:46:02,298 [main] ERROR hdfs.DFSClient (DFSClient.java:closeAllFilesBeingWritten(602)) - Failed to close file: /tmp.txt with inode: 16386
java.io.InterruptedIOException: Call interrupted
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1484)
	at org.apache.hadoop.ipc.Client.call(Client.java:1436)
	at org.apache.hadoop.ipc.Client.call(Client.java:1346)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy25.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:495)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy26.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1031)
	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1882)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1685)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:733)
2017-12-11 18:46:02,299 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdownDataNode(2005)) - Shutting down DataNode 1


was (Author: jiangjianfei):
Based on debugging and the error log below, the problem appears to be in the following code. Both favoredNodes point to the local host, and the file /tmp.txt seems to hit a lease conflict.

    DFSTestUtil.createFile(cluster.getFileSystem(0), filePath, false, 1024,
        totalUsedSpace / numOfDatanodes, DEFAULT_BLOCK_SIZE,
        (short) numOfDatanodes, 0, false, favoredNodes);

When I reduce the two favoredNodes to one, as shown in my patch, the testcase runs successfully. I am not certain whether this change affects the original purpose of the test.

> Testcase TestBalancer#testBalancerWithPinnedBlocks always fails
> ---------------------------------------------------------------
>
>                 Key: HADOOP-15108
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15108
>             Project: Hadoop Common
>          Issue Type: Test
>    Affects Versions: 3.0.0-beta1
>            Reporter: Jianfei Jiang
>         Attachments: HADOOP-15108.000.patch
>
>
> When I run the testcases without any code changes, testBalancerWithPinnedBlocks in TestBalancer.java never succeeds. I tried both Ubuntu 16.04 and Red Hat 7, so the failure is probably not specific to one Linux environment. I am not sure whether this is a bug in the testcase or a problem with my environment and settings. Could anyone give some advice?
> -------------------------------------------------------------------------------
> Test set: org.apache.hadoop.hdfs.server.balancer.TestBalancer
> -------------------------------------------------------------------------------
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 100.389 sec <<< FAILURE! - in org.apache.hadoop.hdfs.server.balancer.TestBalancer
> testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer)  Time elapsed: 100.134 sec  <<< ERROR!
> java.lang.Exception: test timed out after 100000 milliseconds
> 	at java.lang.Object.wait(Native Method)
> 	at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:903)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:773)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:870)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> 	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> 	at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:441)
> 	at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:515)
