Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2013/03/20 05:27:15 UTC

[jira] [Comment Edited] (HBASE-7878) recoverFileLease does not check return value of recoverLease

    [ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607267#comment-13607267 ] 

Ted Yu edited comment on HBASE-7878 at 3/20/13 4:26 AM:
--------------------------------------------------------

From https://builds.apache.org/job/PreCommit-HBASE-Build/4900/artifact/trunk/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.regionserver.wal.TestHLogSplitCompressed-output.txt, notice the second line below:
{code}
2013-03-20 00:51:04,404 INFO  [IPC Server handler 4 on 33341] namenode.FSNamesystem(2294): Recovering lease=[Lease.  Holder: DFSClient_NONMAPREDUCE_289697813_8, pendingcreates: 1], src=/hbase/hlog/hlog.dat.1
2013-03-20 00:51:04,405 DEBUG [pool-1-thread-1] util.FSHDFSUtils(89): recoverLease returned false
2013-03-20 00:51:04,922 INFO  [org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] datanode.DataNode(2130): NameNode calls recoverBlock(block=blk_1918679507966322481_1015, targets=[127.0.0.1:55498, 127.0.0.1:50293])
2013-03-20 00:51:04,923 DEBUG [org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] datanode.FSDataset(2143): Interrupting active writer threads for block blk_1918679507966322481_1015
2013-03-20 00:51:04,924 INFO  [org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc] datanode.BlockReceiver(613): Exception in receiveBlock for block blk_1918679507966322481_1015 java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/127.0.0.1:55498 remote=/127.0.0.1:54218]. 0 millis timeout left.
2013-03-20 00:51:04,924 INFO  [org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc] datanode.DataXceiver(425): writeBlock blk_1918679507966322481_1015 received exception java.io.IOException: Interrupted receiveBlock
2013-03-20 00:51:04,924 INFO  [PacketResponder 1 for Block blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(893): PacketResponder blk_1918679507966322481_1015 1 : Thread is interrupted.
2013-03-20 00:51:04,924 INFO  [PacketResponder 1 for Block blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(958): PacketResponder 1 for block blk_1918679507966322481_1015 terminating
2013-03-20 00:51:04,924 INFO  [org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17] datanode.BlockReceiver(613): Exception in receiveBlock for block blk_1918679507966322481_1015 java.io.EOFException: while trying to read 547 bytes
2013-03-20 00:51:04,925 INFO  [PacketResponder 0 for Block blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(893): PacketResponder blk_1918679507966322481_1015 0 : Thread is interrupted.
2013-03-20 00:51:04,925 INFO  [PacketResponder 0 for Block blk_1918679507966322481_1015] datanode.BlockReceiver$PacketResponder(958): PacketResponder 0 for block blk_1918679507966322481_1015 terminating
2013-03-20 00:51:04,925 INFO  [org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17] datanode.DataXceiver(425): writeBlock blk_1918679507966322481_1015 received exception java.io.EOFException: while trying to read 547 bytes
2013-03-20 00:51:04,925 WARN  [ResponseProcessor for block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream$ResponseProcessor(3175): DFSOutputStream ResponseProcessor exception  for block blk_1918679507966322481_1015java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:180)
	at java.io.DataInputStream.readLong(DataInputStream.java:399)
	at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:3127)

2013-03-20 00:51:04,925 ERROR [org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17] datanode.DataXceiver(136): DatanodeRegistration(127.0.0.1:50293, storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946, ipcPort=59764):DataXceiver
java.io.EOFException: while trying to read 547 bytes
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:295)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:339)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:403)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:581)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:406)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:112)
	at java.lang.Thread.run(Thread.java:662)
2013-03-20 00:51:04,925 WARN  [DataStreamer for file /hbase/hlog/hlog.dat.1 block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3211): Error Recovery for block blk_1918679507966322481_1015 bad datanode[0] 127.0.0.1:55498
2013-03-20 00:51:04,927 DEBUG [org.apache.hadoop.hdfs.server.datanode.DataXceiver@252a17] datanode.DataXceiver(138): DatanodeRegistration(127.0.0.1:50293, storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946, ipcPort=59764):Number of active connections is: 2
2013-03-20 00:51:04,927 WARN  [DataStreamer for file /hbase/hlog/hlog.dat.1 block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3262): Error Recovery for block blk_1918679507966322481_1015 in pipeline 127.0.0.1:55498, 127.0.0.1:50293: bad datanode 127.0.0.1:55498
2013-03-20 00:51:04,925 ERROR [org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc] datanode.DataXceiver(136): DatanodeRegistration(127.0.0.1:55498, storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242, ipcPort=60306):DataXceiver
java.io.IOException: Interrupted receiveBlock
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:626)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:406)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:112)
	at java.lang.Thread.run(Thread.java:662)
2013-03-20 00:51:04,928 DEBUG [org.apache.hadoop.hdfs.server.datanode.DataXceiver@1f6f3dc] datanode.DataXceiver(138): DatanodeRegistration(127.0.0.1:55498, storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242, ipcPort=60306):Number of active connections is: 3
2013-03-20 00:51:04,928 DEBUG [org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] datanode.FSDataset(2159): getBlockMetaDataInfo successful block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,933 INFO  [IPC Server handler 1 on 59764] datanode.DataNode(2130): Client calls recoverBlock(block=blk_1918679507966322481_1015, targets=[127.0.0.1:50293])
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764] datanode.FSDataset(2143): Interrupting active writer threads for block blk_1918679507966322481_1015
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764] datanode.FSDataset(2159): getBlockMetaDataInfo successful block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,934 DEBUG [IPC Server handler 1 on 59764] datanode.DataNode(2046): block=blk_1918679507966322481_1015, (length=1006), syncList=[BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015 wasRecoveredOnStartup=false) node=127.0.0.1:50293)], closeFile=false
2013-03-20 00:51:04,935 INFO  [IPC Server handler 1 on 33341] namenode.FSNamesystem(5662): blk_1918679507966322481_1015is being recovered by NameNode, ignoring the request from a client
2013-03-20 00:51:04,935 ERROR [IPC Server handler 1 on 33341] security.UserGroupInformation(1152): PriviledgedActionException as:jenkins cause:java.io.IOException: blk_1918679507966322481_1015is being recovered by NameNode, ignoring the request from a client
2013-03-20 00:51:04,935 DEBUG [IPC Server handler 0 on 59764] datanode.FSDataset(2143): Interrupting active writer threads for block blk_1918679507966322481_1015
2013-03-20 00:51:04,936 DEBUG [IPC Server handler 0 on 59764] datanode.FSDataset(2159): getBlockMetaDataInfo successful block=blk_1918679507966322481_1015 length 1006 genstamp 1015
2013-03-20 00:51:04,936 ERROR [IPC Server handler 1 on 59764] security.UserGroupInformation(1152): PriviledgedActionException as:blk_1918679507966322481_1015 cause:org.apache.hadoop.ipc.RemoteException: java.io.IOException: blk_1918679507966322481_1015is being recovered by NameNode, ignoring the request from a client
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:5663)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:788)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

2013-03-20 00:51:04,936 DEBUG [org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] datanode.DataNode(2046): block=blk_1918679507966322481_1015, (length=1006), syncList=[BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015 wasRecoveredOnStartup=false) node=127.0.0.1:55498), BlockRecord(info=BlockRecoveryInfo(block=blk_1918679507966322481_1015 wasRecoveredOnStartup=false) node=127.0.0.1:50293)], closeFile=true
2013-03-20 00:51:04,937 WARN  [DataStreamer for file /hbase/hlog/hlog.dat.1 block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3287): Failed recovery attempt #0 from primary datanode 127.0.0.1:50293
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: blk_1918679507966322481_1015is being recovered by NameNode, ignoring the request from a client
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:5663)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:788)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

	at org.apache.hadoop.ipc.Client.call(Client.java:1107)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
	at $Proxy12.nextGenerationStamp(Unknown Source)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:2066)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2034)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2114)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

	at org.apache.hadoop.ipc.Client.call(Client.java:1107)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
	at $Proxy17.recoverBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3285)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2200(DFSClient.java:2749)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2953)
2013-03-20 00:51:04,937 INFO  [org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] datanode.DataNode(1806): oldblock=blk_1918679507966322481_1015(length=1006), newblock=blk_1918679507966322481_1024(length=1006), datanode=127.0.0.1:55498
2013-03-20 00:51:04,937 WARN  [DataStreamer for file /hbase/hlog/hlog.dat.1 block blk_1918679507966322481_1015] hdfs.DFSClient$DFSOutputStream(3326): Error Recovery for block blk_1918679507966322481_1015 failed  because recovery from primary datanode 127.0.0.1:50293 failed 1 times.  Pipeline was 127.0.0.1:55498, 127.0.0.1:50293. Will retry...
2013-03-20 00:51:04,938 DEBUG [org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] datanode.FSDataset$FSDir(175): addBlock: Moved /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/blocksBeingWritten/blk_1918679507966322481_1024.meta to /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/current/blk_1918679507966322481_1024.meta
2013-03-20 00:51:04,938 DEBUG [org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] datanode.FSDataset$FSDir(176): addBlock: Moved /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/blocksBeingWritten/blk_1918679507966322481 to /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1/current/blk_1918679507966322481
2013-03-20 00:51:04,938 INFO  [org.apache.hadoop.hdfs.server.datanode.DataNode$2@1544e44] datanode.DataNode(1814): Received block blk_1918679507966322481_1024 of size 1006 as part of lease recovery.
2013-03-20 00:51:04,938 DEBUG [DataNode: [/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data1,/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data2]] datanode.DataNode(1124): DataNode DatanodeRegistration(127.0.0.1:55498, storageID=DS-204805625-67.195.138.31-55498-1363740658791, infoPort=49242, ipcPort=60306) sleeping for artificial delay: 1 ms
2013-03-20 00:51:04,938 INFO  [IPC Server handler 2 on 59764] datanode.DataNode(1806): oldblock=blk_1918679507966322481_1015(length=1006), newblock=blk_1918679507966322481_1024(length=1006), datanode=127.0.0.1:50293
2013-03-20 00:51:04,939 DEBUG [IPC Server handler 2 on 59764] datanode.FSDataset$FSDir(175): addBlock: Moved /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/blocksBeingWritten/blk_1918679507966322481_1024.meta to /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/current/blk_1918679507966322481_1024.meta
2013-03-20 00:51:04,939 DEBUG [IPC Server handler 2 on 59764] datanode.FSDataset$FSDir(176): addBlock: Moved /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/blocksBeingWritten/blk_1918679507966322481 to /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3/current/blk_1918679507966322481
2013-03-20 00:51:04,939 INFO  [IPC Server handler 2 on 59764] datanode.DataNode(1814): Received block blk_1918679507966322481_1024 of size 1006 as part of lease recovery.
2013-03-20 00:51:04,939 DEBUG [DataNode: [/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data3,/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/target/test-data/4e69e985-5c95-4028-8b78-d393394d7a60/dfscluster_30065559-622d-4882-ae72-2b31c742d8bc/dfs/data/data4]] datanode.DataNode(1124): DataNode DatanodeRegistration(127.0.0.1:50293, storageID=DS-1414966149-67.195.138.31-50293-1363740659066, infoPort=34946, ipcPort=59764) sleeping for artificial delay: 3 ms
2013-03-20 00:51:04,940 INFO  [IPC Server handler 3 on 33341] namenode.FSNamesystem(2374): commitBlockSynchronization(lastblock=blk_1918679507966322481_1015, newgenerationstamp=1024, newlength=1006, newtargets=[127.0.0.1:55498, 127.0.0.1:50293], closeFile=true, deleteBlock=false)
2013-03-20 00:51:04,942 INFO  [IPC Server handler 3 on 33341] namenode.FSNamesystem(2458): commitBlockSynchronization(newblock=blk_1918679507966322481_1024, file=/hbase/hlog/hlog.dat.1, newgenerationstamp=1024, newlength=1006, newtargets=[127.0.0.1:55498, 127.0.0.1:50293]) successful
2013-03-20 00:51:04,943 WARN  [IPC Server handler 5 on 33341] namenode.FSNamesystem(3850): BLOCK* NameSystem.addStoredBlock: Redundant addStoredBlock request received for blk_1918679507966322481_1024 on 127.0.0.1:50293 size 1006
2013-03-20 00:51:05,406 INFO  [pool-1-thread-1] util.FSHDFSUtils(137): Finished lease recover attempt for hdfs://localhost:33341/hbase/hlog/hlog.dat.1
{code}
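
For reference, here is a minimal sketch (not the patch attached to this issue) of the kind of check being discussed: calling recoverLease in a loop and only treating the file as closed once it returns true. It relies on Hadoop's DistributedFileSystem#recoverLease(Path); the class name, configuration key, and timeout/sleep values below are illustrative assumptions, not anything from FSHDFSUtils itself.
{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class LeaseRecoveryExample {
  /**
   * Keep asking the NameNode to recover the lease on the given file until
   * recoverLease reports true, i.e. the file is really closed.
   */
  public static void recoverFileLease(DistributedFileSystem dfs, Path p, Configuration conf)
      throws IOException, InterruptedException {
    // recoverLease may return false while recovery is still in flight (as in the log above)
    boolean recovered = dfs.recoverLease(p);
    long start = System.currentTimeMillis();
    long timeoutMs = conf.getLong("example.lease.recovery.timeout", 300000); // illustrative key
    while (!recovered) {
      if (System.currentTimeMillis() - start > timeoutMs) {
        throw new IOException("Timed out waiting for lease recovery on " + p);
      }
      Thread.sleep(1000);              // give the NameNode time to finish block recovery
      recovered = dfs.recoverLease(p); // re-check instead of assuming success
    }
  }
}
{code}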
                
> recoverFileLease does not check return value of recoverLease
> ------------------------------------------------------------
>
>                 Key: HBASE-7878
>                 URL: https://issues.apache.org/jira/browse/HBASE-7878
>             Project: HBase
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.95.0, 0.94.6
>            Reporter: Eric Newton
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.95.0, 0.98.0, 0.94.7
>
>         Attachments: 7878.94, 7878-94.addendum, 7878-94.addendum2, 7878-trunk.addendum, 7878-trunk.addendum2, 7878-trunk-v10.txt, 7878-trunk-v2.txt, 7878-trunk-v3.txt, 7878-trunk-v4.txt, 7878-trunk-v5.txt, 7878-trunk-v6.txt, 7878-trunk-v7.txt, 7878-trunk-v8.txt, 7878-trunk-v9.txt, 7878-trunk-v9.txt
>
>
> I think this is a problem, so I'm opening a ticket so an HBase person takes a look.
> Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery.  During testing, we experienced data loss.  I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed.  In FSHDFSUtils, the return result of recoverLease is not checked. In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked.
> I think FSHDFSUtils should be modified to check the return result, and wait until it returns true.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira