Posted to mapreduce-user@hadoop.apache.org by sudhakara st <su...@gmail.com> on 2015/07/07 04:00:22 UTC

Re: Socket Timeout Exception

Hello Dejan

Check whether the configuration parameter for short-circuit local reads
(dfs.client.read.shortcircuit) is set to true.
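
For reference, a minimal hdfs-site.xml sketch for enabling short-circuit local
reads could look like the following. The socket path below is only an example,
so adjust it to your installation, and the client-side setting also needs to be
visible to the HBase region servers:

  <!-- Let clients co-located with a DataNode read blocks directly from disk -->
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <!-- Example UNIX domain socket path used for short-circuit reads;
       the parent directory must exist on every DataNode host -->
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>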

On Tue, May 26, 2015 at 7:15 PM, Dejan Menges <de...@gmail.com>
wrote:

> Hi,
>
> I'm seeing this exception on every HDFS node once in a while on one
> cluster:
>
> 2015-05-26 13:37:31,831 INFO  datanode.DataNode
> (BlockSender.java:sendPacket(566)) - Failed to send data:
> java.net.SocketTimeoutException: 10000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/172.22.5.34:50010
> remote=/172.22.5.34:31684]
>
> 2015-05-26 13:37:31,831 INFO  DataNode.clienttrace
> (BlockSender.java:sendBlock(738)) - src: /172.22.5.34:50010, dest: /
> 172.22.5.34:31684, bytes: 12451840, op: HDFS_READ, cliID:
> DFSClient_hb_rs_my-hadoop-node-fqdn,60020,1432041913240_-1351889511_35,
> offset: 47212032, srvID: 9bfc58b8-94b0-40a5-ba33-6d712fa1faa2, blockid:
> BP-1988583858-172.22.5.40-1424448407690:blk_1105314202_31576629, duration:
> 10486866121
>
> 2015-05-26 13:37:31,831 WARN  datanode.DataNode
> (DataXceiver.java:readBlock(541)) - DatanodeRegistration(172.22.5.34,
> datanodeUuid=9bfc58b8-94b0-40a5-ba33-6d712fa1faa2, infoPort=50075,
> ipcPort=8010,
> storageInfo=lv=-55;cid=CID-962af1ea-201a-4d27-ae80-e4a7b712f1ac;nsid=109597947;c=0):Got
> exception while serving
> BP-1988583858-172.22.5.40-1424448407690:blk_1105314202_31576629 to /
> 172.22.5.34:31684
>
> java.net.SocketTimeoutException: 10000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/172.22.5.34:50010
> remote=/172.22.5.34:31684]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:716)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
>         at java.lang.Thread.run(Thread.java:745)
>
> 2015-05-26 13:37:31,831 ERROR datanode.DataNode
> (DataXceiver.java:run(250)) - my-hadoop-node-fqdn:50010:DataXceiver error
> processing READ_BLOCK operation  src: /172.22.5.34:31684 dst: /
> 172.22.5.34:50010
>
> java.net.SocketTimeoutException: 10000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/172.22.5.34:50010
> remote=/172.22.5.34:31684]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:716)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
>         at java.lang.Thread.run(Thread.java:745)
>
>
> ...and it's basically only complaining about itself. On the same node there
> are HDFS, a RegionServer, and YARN.
>
> I'm struggling a little bit with how to interpret this. The funny thing is
> that this is our live cluster, the one we are writing everything to. I'm
> wondering whether the HBase flush size (256 MB) could be a problem while the
> block size is 128 MB.
>
> Any advice on where to look is welcome!
>
> Thanks,
> Dejan
>



-- 

Regards,
...sudhakara