Posted to common-user@hadoop.apache.org by Tamir Kamara <ta...@gmail.com> on 2009/04/13 09:15:57 UTC

DataXceiver Errors in 0.19.1

Hi,

I've recently upgraded to 0.19.1 and now there are some DataXceiver errors in
the datanode logs. There are also messages about an interruption while waiting
for IO. Both messages are below.
Can I do something to fix it?

Thanks,
Tamir


2009-04-13 09:57:20,334 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(192.168.14.3:50010,
storageID=DS-727246419-127.0.0.1-50010-1234873914501, infoPort=50075,
ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:264)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:308)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
	at java.lang.Thread.run(Unknown Source)


2009-04-13 09:57:20,333 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder
blk_8486030874928774495_54856 1 Exception
java.io.InterruptedIOException: Interruped while waiting for IO on
channel java.nio.channels.SocketChannel[connected
local=/192.168.14.3:50439 remote=/192.168.14.7:50010]. 58972 millis
timeout left.
	at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:277)
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
	at java.io.DataInputStream.readFully(Unknown Source)
	at java.io.DataInputStream.readLong(Unknown Source)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:853)
	at java.lang.Thread.run(Unknown Source)

Re: DataXceiver Errors in 0.19.1

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
This need not be anything to worry about. Do you see anything at the user 
level (a task, job, copy, or script) fail because of this?

On a distributed system with many nodes, there will be some errors on 
some of the nodes for various reasons (load, hardware, reboots, etc.). 
HDFS usually works around them (because of the multiple replicas).

In this particular case, the client is trying to write some data and one of 
the DataNodes writing a replica might have gone down. HDFS should 
recover from this and write to the rest of the nodes. Please check whether 
the write actually succeeded.
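One way to follow up on this, sketched under a couple of assumptions (a standard 0.19 CLI on the cluster; the log line is inlined here so the snippet is self-contained, and the `/user/tamir` path in the commented fsck step is purely hypothetical): pull the block ID out of the PacketResponder line, then ask fsck which file it belongs to and whether its replicas are healthy.

```shell
# Extract the block ID from a PacketResponder log line. The sample line is
# inlined for illustration; in practice you would grep the datanode log file.
LOG_LINE='2009-04-13 09:57:20,333 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_8486030874928774495_54856 1 Exception'
BLOCK_ID=$(echo "$LOG_LINE" | grep -o 'blk_[-0-9]*')
echo "$BLOCK_ID"   # prints blk_8486030874928774495 (generation stamp stripped)

# On a live cluster, fsck with -files -blocks lists every block per file, so
# the ID above can be matched to a file and its replica count (path is a
# hypothetical example):
#   hadoop fsck /user/tamir -files -blocks | grep "$BLOCK_ID"
```

If the file shows the expected number of replicas, the pipeline recovery worked and the log messages can be ignored.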

Raghu.

Tamir Kamara wrote:
> Hi,
> 
> I've recently upgraded to 0.19.1 and now there're some DataXceiver errors in
> the datanodes logs. There're also messages about interruption while waiting
> for IO. Both messages are below.
> Can I do something to fix it ?
> 
> Thanks,
> Tamir