Posted to hdfs-user@hadoop.apache.org by Dhanasekaran Anbalagan <bu...@gmail.com> on 2013/03/08 08:45:12 UTC

DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Hi Guys

I am frequently getting this error on my DataNodes.

Please guide me on what the exact problem is here.


dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK
operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
java.net.SocketTimeoutException: 70000 millis timeout while waiting
for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280
remote=/172.16.30.140:50010]

at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)

at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
at java.io.FilterInputStream.read(FilterInputStream.java:66)
at java.io.FilterInputStream.read(FilterInputStream.java:66)
at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)

at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)

at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
at java.lang.Thread.run(Thread.java:662)



dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK
operation src: /172.16.30.138:50531 dest: /172.16.30.138:50010
java.io.EOFException: while trying to read 65563 bytes

at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)

at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)

at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
at java.lang.Thread.run(Thread.java:662)




How can I resolve this?

-Dhanasekaran.

Did I learn something today? If not, I wasted it.

Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
I am having some GC pauses (70 secs), but I don't think this could cause a
480 secs timeout. And it's even more weird when it happens from one datanode
to ITSELF.
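
For context, here is a minimal hdfs-site.xml sketch of the two timeout knobs
involved (property names assumed from the stock Hadoop 2.x defaults; the
values are only illustrative, not a recommendation):

<property>
  <!-- assumed stock property name; the default of 480000 ms (8 min)
       matches the "480000 millis timeout" in the traces below -->
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
</property>
<property>
  <!-- assumed stock property name; 60000 ms by default, extended per
       pipeline node, which presumably explains the 70000 ms read
       timeout earlier in this thread -->
  <name>dfs.client.socket-timeout</name>
  <value>120000</value>
</property>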

 > Socket is ready for receiving, but the client closed abnormally, so you
generally get this error.

What would "abnormally" mean in this case?

 > xcievers : 4096 is enough, and I don't think you pasted a full stack 
exception.

The full stack trace follows below.
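
As an aside, on the Hadoop 2.x line dfs.datanode.max.xcievers has reportedly
been superseded by dfs.datanode.max.transfer.threads; a sketch of the
equivalent setting, assuming this release already recognizes the newer key:

<property>
  <!-- assumed 2.x replacement for dfs.datanode.max.xcievers; caps the
       number of concurrent DataXceiver threads per DataNode -->
  <name>dfs.datanode.max.transfer.threads</name>
  <value>4096</value>
</property>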

Thanks very much for the help,
Pablo Musa

2013-03-12 09:41:52,779 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:43364, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 
66393088, srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 24309480
2013-03-12 09:41:52,810 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:43364, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 
66458624, srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 24791908

...

2013-03-12 11:57:54,176 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45037, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 2755072, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 26533296

...

2013-03-12 12:12:56,524 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_6121120387190865802_12522001
2013-03-12 12:12:56,844 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_7798078179913116741_9709757
2013-03-12 12:12:57,412 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:57,412 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45063, bytes: 594432, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 2886144, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 480311786486
2013-03-12 12:12:57,412 WARN 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(172.17.2.18, 
storageID=DS-229334310-172.17.2.18-50010-1328651636364, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-40;cid=CID-26cd999e-460a-4dbc-b940-9250a76930a8;nsid=276058127;c=1362491004838):Got 
exception while serving 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577 
to /172.17.2.18:45063
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:57,412 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation src: 
/172.17.2.18:45063 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:58,043 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_8022508854015956034_21426598
2013-03-12 12:12:58,069 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-5102464265454077361_17771877
2013-03-12 12:12:58,443 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-7350069832338632205_21397596

...

2013-03-12 12:37:21,267 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_9101061522956099413_17372672
2013-03-12 12:37:21,298 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-2427596758655123110_10847650
2013-03-12 12:37:21,310 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_5607661776053432519_17155914
2013-03-12 12:37:21,323 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,323 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45213, bytes: 528384, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 9052672, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 480102794116
2013-03-12 12:37:21,323 WARN 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(172.17.2.18, 
storageID=DS-229334310-172.17.2.18-50010-1328651636364, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-40;cid=CID-26cd999e-460a-4dbc-b940-9250a76930a8;nsid=276058127;c=1362491004838):Got 
exception while serving 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577 
to /172.17.2.18:45213
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,323 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation src: 
/172.17.2.18:45213 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,326 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-6873281681928280553_12192883
2013-03-12 12:37:21,342 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-6420939594294632128_2665052
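
(Assuming the clienttrace "duration" field is in nanoseconds, the two failing
reads above took 480311786486 ns = ~480.3 s and 480102794116 ns = ~480.1 s,
i.e. they are exactly the transfers that hit the 480000 millis write timeout.)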



On 03/10/2013 11:23 PM, Azuryy Yu wrote:
> xcievers : 4096 is enough, and I don't think you pasted a full stack 
> exception.
> Socket is ready for receiving, but the client closed abnormally, so you 
> generally get this error.
>
>
> On Mon, Mar 11, 2013 at 2:33 AM, Pablo Musa <pablo@psafe.com 
> <ma...@psafe.com>> wrote:
>
>     This variable was already set:
>     <property>
>       <name>dfs.datanode.max.xcievers</name>
>       <value>4096</value>
>       <final>true</final>
>     </property>
>
>     Should I increase it more?
>
>     Same error happening every 5-8 minutes in the datanode 172.17.2.18.
>
>     2013-03-10 15:26:42,818 ERROR
>     org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PSLBHDN002:50010:DataXceiver error processing READ_BLOCK
>     operation  src: /172.17.2.18:46422
>     dest: /172.17.2.18:50010
>     java.net.SocketTimeoutException: 480000 millis timeout while
>     waiting for channel to be ready for write. ch :
>     java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010
>     remote=/172.17.2.18:46422]
>
>
>     ]$ lsof | wc -l
>     2393
>
>     ]$ lsof | grep hbase | wc -l
>     4
>
>     ]$ lsof | grep hdfs | wc -l
>     322
>
>     ]$ lsof | grep hadoop | wc -l
>     162
>
>     ]$ cat /proc/sys/fs/file-nr
>     4416    0    7327615
>
>     ]$ date
>     Sun Mar 10 15:31:47 BRT 2013
>
>
>     What can be the causes? How could I extract more info about the error?
>
>     Thanks,
>     Pablo
>
>
>     On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
>>     Hi,
>>
>>     If all of the open-files limits (for the hbase and hdfs users)
>>     are set to more than 30K, please change
>>     dfs.datanode.max.xcievers to more than the value below.
>>
>>     <property>
>>
>>        <name>dfs.datanode.max.xcievers</name>
>>
>>        <value>2096</value>
>>
>>            <description>PRIVATE CONFIG VARIABLE</description>
>>
>>                  </property>
>>
>>     Try to increase this one and tune it for the HBase usage.
>>
>>
>>     Thanks
>>
>>     -Abdelrahman
>>
>>
>>
>>
>>
>>
>>     On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <pablo@psafe.com
>>     <ma...@psafe.com>> wrote:
>>
>>         I am also having this issue and tried a lot of solutions, but
>>         could not solve it.
>>
>>         ]# ulimit -n ** running as root and hdfs (datanode user)
>>         32768
>>
>>         ]# cat /proc/sys/fs/file-nr
>>         2080    0    8047008
>>
>>         ]# lsof | wc -l
>>         5157
>>
>>         Sometimes this issue happens from one node to the same node :(
>>
>>         I also think this issue is messing with my regionservers
>>         which are crashing all day long!!
>>
>>         Thanks,
>>         Pablo
>>
>>
>>         On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>>         Hi Varun
>>>
>>>         I believe it is not a ulimit issue.
>>>
>>>
>>>         /etc/security/limits.conf
>>>         # End of file
>>>         *               -      nofile      1000000
>>>         *               -      nproc       1000000
>>>
>>>
>>>         Please guide me, guys; I want to fix this. Share your
>>>         thoughts on this DataXceiver error.
>>>
>>>         Did I learn something today? If not, I wasted it.
>>>
>>>
>>>         On Fri, Mar 8, 2013 at 3:50 AM, varun kumar
>>>         <varun.uid@gmail.com <ma...@gmail.com>> wrote:
>>>
>>>             Hi Dhana,
>>>
>>>             Increase the ulimit for all the datanodes.
>>>
>>>             If you are starting the service as the hadoop user, increase
>>>             the ulimit value for the hadoop user.
>>>
>>>             Make the changes in the following file.
>>>
>>>             */etc/security/limits.conf*
>>>
>>>             Example:-
>>>             *hadoop          soft  nofile          35000*
>>>             *hadoop          hard  nofile          35000*
>>>
>>>             Regards,
>>>             Varun Kumar.P
>>>
>>>             On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>>>             <bugcy013@gmail.com <ma...@gmail.com>> wrote:
>>>
>>>                 Hi Guys
>>>
>>>                 I am frequently getting this error on my DataNodes.
>>>
>>>                 Please guide me on what the exact problem is here.
>>>
>>>                 dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373  dest: /172.16.30.138:50010
>>>
>>>
>>>
>>>                 java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280  remote=/172.16.30.140:50010]
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>>>                 at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>                 at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>                 at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>                 at java.lang.Thread.run(Thread.java:662)
>>>
>>>                 dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531  dest: /172.16.30.138:50010
>>>
>>>
>>>
>>>                 java.io.EOFException: while trying to read 65563 bytes
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>                 at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>>
>>>                 How can I resolve this?
>>>
>>>                 -Dhanasekaran.
>>>
>>>                 Did I learn something today? If not, I wasted it.
>>>
>>>                 -- 
>>>
>>>
>>>
>>>
>>>
>>>
>>>             -- 
>>>             Regards,
>>>             Varun Kumar.P
>>>
>>>
>>
>>
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
I am having some GC pauses (70 secs) but I don't think this could cause 
480 secs
timeout. And its even more weird when it happens from one datanode to 
ITSELF.

 > Socket is ready for receiving, but client closed abnormally. so you 
generally got this error.

What would abnormally be in this case?

 > xcievers : 4096 is enough, and I don't think you pasted a full stack 
exception.

Follows.

Thanks very much for the help,
Pablo Musa

2013-03-12 09:41:52,779 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:43364, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 
66393088, srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 24309480
2013-03-12 09:41:52,810 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:43364, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 
66458624, srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 24791908

...

2013-03-12 11:57:54,176 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45037, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 2755072, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 26533296

...

2013-03-12 12:12:56,524 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_6121120387190865802_12522001
2013-03-12 12:12:56,844 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_7798078179913116741_9709757
2013-03-12 12:12:57,412 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:57,412 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45063, bytes: 594432, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 2886144, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 480311786486
2013-03-12 12:12:57,412 WARN 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(172.17.2.18, 
storageID=DS-229334310-172.17.2.18-50010-1328651636364, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-40;cid=CID-26cd999e-460a-4dbc-b940-9250a76930a8;nsid=276058127;c=1362491004838):Got 
exception while serving 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577 
to /172.17.2.18:45063
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:57,412 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation src: 
/172.17.2.18:45063 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:58,043 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_8022508854015956034_21426598
2013-03-12 12:12:58,069 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-5102464265454077361_17771877
2013-03-12 12:12:58,443 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-7350069832338632205_21397596

...

2013-03-12 12:37:21,267 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_9101061522956099413_17372672
2013-03-12 12:37:21,298 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-2427596758655123110_10847650
2013-03-12 12:37:21,310 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_5607661776053432519_17155914
2013-03-12 12:37:21,323 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,323 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45213, bytes: 528384, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 9052672, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 480102794116
2013-03-12 12:37:21,323 WARN 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(172.17.2.18, 
storageID=DS-229334310-172.17.2.18-50010-1328651636364, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-40;cid=CID-26cd999e-460a-4dbc-b940-9250a76930a8;nsid=276058127;c=1362491004838):Got 
exception while serving 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577 
to /172.17.2.18:45213
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,323 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation src: 
/172.17.2.18:45213 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,326 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-6873281681928280553_12192883
2013-03-12 12:37:21,342 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-6420939594294632128_2665052



On 03/10/2013 11:23 PM, Azuryy Yu wrote:
> xcievers : 4096 is enough, and I don't think you pasted a full stack 
> exception.
> Socket is ready for receiving, but client closed abnormally. so you 
> generally got this error.
>
>
> On Mon, Mar 11, 2013 at 2:33 AM, Pablo Musa <pablo@psafe.com 
> <ma...@psafe.com>> wrote:
>
>     This variable was already set:
>     <property>
>       <name>dfs.datanode.max.xcievers</name>
>       <value>4096</value>
>       <final>true</final>
>     </property>
>
>     Should I increase it more?
>
>     Same error happening every 5-8 minutes in the datanode 172.17.2.18.
>
>     2013-03-10 15:26:42,818 ERROR
>     org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PSLBHDN002:50010:DataXceiver error processing READ_BLOCK
>     operation  src: /172.17.2.18:46422 <http://172.17.2.18:46422>
>     dest: /172.17.2.18:50010 <http://172.17.2.18:50010>
>     java.net.SocketTimeoutException: 480000 millis timeout while
>     waiting for channel to be ready for write. ch :
>     java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010
>     <http://172.17.2.18:50010> remote=/172.17.2.18:46422
>     <http://172.17.2.18:46422>]
>
>
>     ]$ lsof | wc -l
>     2393
>
>     ]$ lsof | grep hbase | wc -l
>     4
>
>     ]$ lsof | grep hdfs | wc -l
>     322
>
>     ]$ lsof | grep hadoop | wc -l
>     162
>
>     ]$ cat /proc/sys/fs/file-nr
>     4416    0    7327615
>
>     ]$ date
>     Sun Mar 10 15:31:47 BRT 2013
>
>
>     What can be the causes? How could I extract more info about the error?
>
>     Thanks,
>     Pablo
>
>
>     On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
>>     Hi,
>>
>>     If all of the # of open files limit ( hbase , and hdfs : users )
>>     are set to more than 30 K. Please change
>>     the dfs.datanode.max.xcievers to more than the value below.
>>
>>     <property>
>>
>>        <name>dfs.datanode.max.xcievers</name>
>>
>>        <value>2096</value>
>>
>>            <description>PRIVATE CONFIG VARIABLE</description>
>>
>>                  </property>
>>
>>     Try to increase this one and tunne it to the hbase usage.
>>
>>
>>     Thanks
>>
>>     -Abdelrahman
>>
>>
>>
>>
>>
>>
>>     On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <pablo@psafe.com
>>     <ma...@psafe.com>> wrote:
>>
>>         I am also having this issue and tried a lot of solutions, but
>>         could not solve it.
>>
>>         ]# ulimit -n ** running as root and hdfs (datanode user)
>>         32768
>>
>>         ]# cat /proc/sys/fs/file-nr
>>         2080    0    8047008
>>
>>         ]# lsof | wc -l
>>         5157
>>
>>         Sometimes this issue happens from one node to the same node :(
>>
>>         I also think this issue is messing with my regionservers
>>         which are crashing all day long!!
>>
>>         Thanks,
>>         Pablo
>>
>>
>>         On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>>         Hi Varun
>>>
>>>         I believe is not ulimit issue.
>>>
>>>
>>>         /etc/security/limits.conf
>>>         # End of file
>>>         *               -      nofile      1000000
>>>         *               -      nproc       1000000
>>>
>>>
>>>         please guide me Guys, I want fix this. share your
>>>         thoughts DataXceiver error.
>>>
>>>         Did I learn something today? If not, I wasted it.
>>>
>>>
>>>         On Fri, Mar 8, 2013 at 3:50 AM, varun kumar
>>>         <varun.uid@gmail.com <ma...@gmail.com>> wrote:
>>>
>>>             Hi Dhana,
>>>
>>>             Increase the ulimit for all the datanodes.
>>>
>>>             If you are starting the service using hadoop increase
>>>             the ulimit value for hadoop user.
>>>
>>>             Do the  changes in the following file.
>>>
>>>             */etc/security/limits.conf*
>>>
>>>             Example:-
>>>             *hadoop          soft  nofile          35000*
>>>             *hadoop          hard  nofile          35000*
>>>
>>>             Regards,
>>>             Varun Kumar.P
>>>
>>>             On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>>>             <bugcy013@gmail.com <ma...@gmail.com>> wrote:
>>>
>>>                 Hi Guys
>>>
>>>                 I am frequently getting is error in my Data nodes.
>>>
>>>                 Please guide what is the exact problem this.
>>>
>>>                 dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373  <http://172.16.30.138:50373>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>>>
>>>
>>>
>>>                 java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280  <http://172.16.30.138:34280>  remote=/172.16.30.140:50010  <http://172.16.30.140:50010>]
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>>>                 at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>                 at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>                 at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>                 at java.lang.Thread.run(Thread.java:662)
>>>
>>>                 dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531  <http://172.16.30.138:50531>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>>>
>>>
>>>
>>>                 java.io.EOFException: while trying to read 65563 bytes
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>                 at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>>
>>>                 How to resolve this.
>>>
>>>                 -Dhanasekaran.
>>>
>>>                 Did I learn something today? If not, I wasted it.
>>>
>>>                 -- 
>>>
>>>
>>>
>>>
>>>
>>>
>>>             -- 
>>>             Regards,
>>>             Varun Kumar.P
>>>
>>>
>>
>>
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
I am having some GC pauses (70 secs) but I don't think this could cause 
480 secs
timeout. And its even more weird when it happens from one datanode to 
ITSELF.

 > Socket is ready for receiving, but client closed abnormally. so you 
generally got this error.

What would abnormally be in this case?

 > xcievers : 4096 is enough, and I don't think you pasted a full stack 
exception.

Follows.

Thanks very much for the help,
Pablo Musa

2013-03-12 09:41:52,779 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:43364, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 
66393088, srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 24309480
2013-03-12 09:41:52,810 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:43364, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 
66458624, srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 24791908

...

2013-03-12 11:57:54,176 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45037, bytes: 66564, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 2755072, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 26533296

...

2013-03-12 12:12:56,524 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_6121120387190865802_12522001
2013-03-12 12:12:56,844 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_7798078179913116741_9709757
2013-03-12 12:12:57,412 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:57,412 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45063, bytes: 594432, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 2886144, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 480311786486
2013-03-12 12:12:57,412 WARN 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(172.17.2.18, 
storageID=DS-229334310-172.17.2.18-50010-1328651636364, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-40;cid=CID-26cd999e-460a-4dbc-b940-9250a76930a8;nsid=276058127;c=1362491004838):Got 
exception while serving 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577 
to /172.17.2.18:45063
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:57,412 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation src: 
/172.17.2.18:45063 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45063]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:12:58,043 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_8022508854015956034_21426598
2013-03-12 12:12:58,069 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-5102464265454077361_17771877
2013-03-12 12:12:58,443 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-7350069832338632205_21397596

...

2013-03-12 12:37:21,267 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_9101061522956099413_17372672
2013-03-12 12:37:21,298 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-2427596758655123110_10847650
2013-03-12 12:37:21,310 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_5607661776053432519_17155914
2013-03-12 12:37:21,323 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,323 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/172.17.2.18:50010, dest: /172.17.2.18:45213, bytes: 528384, op: 
HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1549283955_26, offset: 9052672, 
srvID: DS-229334310-172.17.2.18-50010-1328651636364, blockid: 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577, 
duration: 480102794116
2013-03-12 12:37:21,323 WARN 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(172.17.2.18, 
storageID=DS-229334310-172.17.2.18-50010-1328651636364, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-40;cid=CID-26cd999e-460a-4dbc-b940-9250a76930a8;nsid=276058127;c=1362491004838):Got 
exception while serving 
BP-43236042-172.17.2.10-1362490844340:blk_7228654423351524558_25176577 
to /172.17.2.18:45213
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,323 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation src: 
/172.17.2.18:45213 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:45213]
         at 
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at 
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
         at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:510)
         at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:673)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:344)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
         at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
         at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
         at java.lang.Thread.run(Thread.java:722)
2013-03-12 12:37:21,326 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-6873281681928280553_12192883
2013-03-12 12:37:21,342 INFO 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: 
Verification succeeded for 
BP-43236042-172.17.2.10-1362490844340:blk_-6420939594294632128_2665052



On 03/10/2013 11:23 PM, Azuryy Yu wrote:
> xcievers : 4096 is enough, and I don't think you pasted a full stack 
> exception.
> Socket is ready for receiving, but client closed abnormally. so you 
> generally got this error.
>
>
> On Mon, Mar 11, 2013 at 2:33 AM, Pablo Musa <pablo@psafe.com 
> <ma...@psafe.com>> wrote:
>
>     This variable was already set:
>     <property>
>       <name>dfs.datanode.max.xcievers</name>
>       <value>4096</value>
>       <final>true</final>
>     </property>
>
>     Should I increase it more?
>
>     Same error happening every 5-8 minutes in the datanode 172.17.2.18.
>
>     2013-03-10 15:26:42,818 ERROR
>     org.apache.hadoop.hdfs.server.datanode.DataNode:
>     PSLBHDN002:50010:DataXceiver error processing READ_BLOCK
>     operation  src: /172.17.2.18:46422 dest: /172.17.2.18:50010
>     java.net.SocketTimeoutException: 480000 millis timeout while
>     waiting for channel to be ready for write. ch :
>     java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010
>     remote=/172.17.2.18:46422]
>
>
>     ]$ lsof | wc -l
>     2393
>
>     ]$ lsof | grep hbase | wc -l
>     4
>
>     ]$ lsof | grep hdfs | wc -l
>     322
>
>     ]$ lsof | grep hadoop | wc -l
>     162
>
>     ]$ cat /proc/sys/fs/file-nr
>     4416    0    7327615
>
>     ]$ date
>     Sun Mar 10 15:31:47 BRT 2013
>
>
>     What can be the causes? How could I extract more info about the error?
>
>     Thanks,
>     Pablo
>
>
>     On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
>>     Hi,
>>
>>     If all of the # of open files limit ( hbase , and hdfs : users )
>>     are set to more than 30 K. Please change
>>     the dfs.datanode.max.xcievers to more than the value below.
>>
>>     <property>
>>
>>        <name>dfs.datanode.max.xcievers</name>
>>
>>        <value>2096</value>
>>
>>            <description>PRIVATE CONFIG VARIABLE</description>
>>
>>                  </property>
>>
>>     Try to increase this one and tunne it to the hbase usage.
>>
>>
>>     Thanks
>>
>>     -Abdelrahman
>>
>>
>>
>>
>>
>>
>>     On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <pablo@psafe.com> wrote:
>>
>>         I am also having this issue and tried a lot of solutions, but
>>         could not solve it.
>>
>>         ]# ulimit -n ** running as root and hdfs (datanode user)
>>         32768
>>
>>         ]# cat /proc/sys/fs/file-nr
>>         2080    0    8047008
>>
>>         ]# lsof | wc -l
>>         5157
>>
>>         Sometimes this issue happens from one node to the same node :(
>>
>>         I also think this issue is messing with my regionservers
>>         which are crashing all day long!!
>>
>>         Thanks,
>>         Pablo
>>
>>
>>         On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>>         Hi Varun
>>>
>>>         I believe is not ulimit issue.
>>>
>>>
>>>         /etc/security/limits.conf
>>>         # End of file
>>>         *               -      nofile      1000000
>>>         *               -      nproc       1000000
>>>
>>>
>>>         please guide me Guys, I want fix this. share your
>>>         thoughts DataXceiver error.
>>>
>>>         Did I learn something today? If not, I wasted it.
>>>
>>>
>>>         On Fri, Mar 8, 2013 at 3:50 AM, varun kumar
>>>         <varun.uid@gmail.com> wrote:
>>>
>>>             Hi Dhana,
>>>
>>>             Increase the ulimit for all the datanodes.
>>>
>>>             If you are starting the service using hadoop increase
>>>             the ulimit value for hadoop user.
>>>
>>>             Do the  changes in the following file.
>>>
>>>             */etc/security/limits.conf*
>>>
>>>             Example:-
>>>             *hadoop          soft  nofile          35000*
>>>             *hadoop          hard  nofile          35000*
>>>
>>>             Regards,
>>>             Varun Kumar.P
>>>
>>>             On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>>>             <bugcy013@gmail.com> wrote:
>>>
>>>                 Hi Guys
>>>
>>>                 I am frequently getting is error in my Data nodes.
>>>
>>>                 Please guide what is the exact problem this.
>>>
>>>                 dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>>>
>>>
>>>
>>>                 java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>>>                 at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>                 at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>                 at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>                 at java.lang.Thread.run(Thread.java:662)
>>>
>>>                 dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531 dest: /172.16.30.138:50010
>>>
>>>
>>>
>>>                 java.io.EOFException: while trying to read 65563 bytes
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>
>>>
>>>
>>>
>>>
>>>                 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>                 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>                 at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>>
>>>                 How to resolve this.
>>>
>>>                 -Dhanasekaran.
>>>
>>>                 Did I learn something today? If not, I wasted it.
>>>
>>>                 -- 
>>>
>>>
>>>
>>>
>>>
>>>
>>>             -- 
>>>             Regards,
>>>             Varun Kumar.P
>>>
>>>
>>
>>
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Azuryy Yu <az...@gmail.com>.
xcievers: 4096 is enough, and I don't think you pasted the full stack
trace. The socket is ready for receiving, but the client closed
abnormally, so you generally get this error.
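
If you want to see which client that was: src and dest are the same host in your log, so the peer is a local process, and while the connection is still open you can map the ephemeral port from the log line to a PID. A rough sketch, using port 46422 from the trace quoted below (the exact port changes on every connection):

ss -tnp | grep 46422           # or: netstat -tnp | grep 46422
lsof -nP -i TCP:46422          # shows the owning command/PID if the socket is still open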


On Mon, Mar 11, 2013 at 2:33 AM, Pablo Musa <pa...@psafe.com> wrote:

>  This variable was already set:
> <property>
>   <name>dfs.datanode.max.xcievers</name>
>   <value>4096</value>
>   <final>true</final>
> </property>
>
> Should I increase it more?
>
> Same error happening every 5-8 minutes in the datanode 172.17.2.18.
>
> 2013-03-10 15:26:42,818 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation  src: /
> 172.17.2.18:46422 dest: /172.17.2.18:50010
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010
> remote=/172.17.2.18:46422]
>
>
> ]$ lsof | wc -l
> 2393
>
> ]$ lsof | grep hbase | wc -l
> 4
>
> ]$ lsof | grep hdfs | wc -l
> 322
>
> ]$ lsof | grep hadoop | wc -l
> 162
>
> ]$ cat /proc/sys/fs/file-nr
> 4416    0    7327615
>
> ]$ date
> Sun Mar 10 15:31:47 BRT 2013
>
>
> What can be the causes? How could I extract more info about the error?
>
> Thanks,
> Pablo
>
>
>  On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
>
> Hi,
>
>  If all of the # of open files limit ( hbase , and hdfs : users ) are set
> to more than 30 K. Please change the dfs.datanode.max.xcievers to more than
> the value below.
>
> <property>
>
>    <name>dfs.datanode.max.xcievers</name>
>
>    <value>2096</value>
>
>        <description>PRIVATE CONFIG VARIABLE</description>
>
>              </property>
>
> Try to increase this one and tunne it to the hbase usage.
>
>
>  Thanks
>
> -Abdelrahman
>
>
>
>
>
>
> On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <pa...@psafe.com> wrote:
>
>>  I am also having this issue and tried a lot of solutions, but could not
>> solve it.
>>
>> ]# ulimit -n ** running as root and hdfs (datanode user)
>> 32768
>>
>> ]# cat /proc/sys/fs/file-nr
>> 2080    0    8047008
>>
>> ]# lsof | wc -l
>> 5157
>>
>> Sometimes this issue happens from one node to the same node :(
>>
>> I also think this issue is messing with my regionservers which are
>> crashing all day long!!
>>
>> Thanks,
>> Pablo
>>
>>
>> On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>
>> Hi Varun
>>
>>  I believe is not ulimit issue.
>>
>>
>>  /etc/security/limits.conf
>>  # End of file
>> *               -      nofile          1000000
>> *               -      nproc           1000000
>>
>>
>>  please guide me Guys, I want fix this. share your thoughts DataXceiver
>> error.
>>
>> Did I learn something today? If not, I wasted it.
>>
>>
>> On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <va...@gmail.com> wrote:
>>
>>> Hi Dhana,
>>>
>>>  Increase the ulimit for all the datanodes.
>>>
>>>  If you are starting the service using hadoop increase the ulimit value
>>> for hadoop user.
>>>
>>>  Do the  changes in the following file.
>>>
>>>  */etc/security/limits.conf*
>>>
>>>  Example:-
>>> *hadoop          soft    nofile          35000*
>>> *hadoop          hard    nofile          35000*
>>>
>>>  Regards,
>>> Varun Kumar.P
>>>
>>>  On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan <
>>> bugcy013@gmail.com> wrote:
>>>
>>>>   Hi Guys
>>>>
>>>>  I am frequently getting is error in my Data nodes.
>>>>
>>>>  Please guide what is the exact problem this.
>>>>
>>>>  dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>>>>
>>>>
>>>>
>>>> java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>>>> at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>> at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>> at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>> at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>  dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531 dest: /172.16.30.138:50010
>>>>
>>>>
>>>>
>>>> java.io.EOFException: while trying to read 65563 bytes
>>>>
>>>>
>>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>> at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>
>>>>
>>>>  How to resolve this.
>>>>
>>>>  -Dhanasekaran.
>>>>
>>>>  Did I learn something today? If not, I wasted it.
>>>>
>>>>    --
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>  --
>>> Regards,
>>> Varun Kumar.P
>>>
>>
>>
>>
>
>

Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
This variable was already set:
<property>
   <name>dfs.datanode.max.xcievers</name>
   <value>4096</value>
   <final>true</final>
</property>

Should I increase it more?
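
Before raising it further, one thing I want to double-check: on Hadoop 2.x the canonical name of this setting is dfs.datanode.max.transfer.threads (dfs.datanode.max.xcievers is the old, misspelled alias); the deprecated name should still be honored, but it is worth confirming what the running datanode actually picked up. A quick check, assuming the datanode web UI serves the standard /conf servlet on port 50075:

curl -s http://172.17.2.18:50075/conf | grep -B1 -A2 'max.transfer.threads\|max.xcievers'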

The same error happens every 5-8 minutes on datanode 172.17.2.18.

2013-03-10 15:26:42,818 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation  src: 
/172.17.2.18:46422 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:46422]


]$ lsof | wc -l
2393

]$ lsof | grep hbase | wc -l
4

]$ lsof | grep hdfs | wc -l
322

]$ lsof | grep hadoop | wc -l
162

]$ cat /proc/sys/fs/file-nr
4416    0    7327615

]$ date
Sun Mar 10 15:31:47 BRT 2013
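
Those counts are system-wide, though; what matters for this error is the DataNode process itself, that is, how many descriptors it holds versus its own effective limit. A per-process check (a sketch; it assumes a single DataNode JVM on the box, so adjust the pgrep pattern to your setup):

DN_PID=$(pgrep -f 'org.apache.hadoop.hdfs.server.datanode.DataNode' | head -1)
grep 'open files' /proc/$DN_PID/limits     # effective nofile limit of the running JVM
ls /proc/$DN_PID/fd | wc -l                # descriptors currently in use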


What can be the causes? How could I extract more info about the error?
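
In the meantime I will probably raise the datanode's log level on the data-transfer path to get more context around each timeout (a sketch; the logger name is assumed from the class in the stack trace, and the change does not survive a restart):

hadoop daemonlog -setlevel 172.17.2.18:50075 org.apache.hadoop.hdfs.server.datanode.DataNode DEBUG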

Thanks,
Pablo


On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
> Hi,
>
> If all of the # of open files limit ( hbase , and hdfs : users ) are 
> set to more than 30 K. Please change the dfs.datanode.max.xcievers to 
> more than the value below.
>
> <property>
>
>    <name>dfs.datanode.max.xcievers</name>
>
>    <value>2096</value>
>
>        <description>PRIVATE CONFIG VARIABLE</description>
>
>              </property>
>
> Try to increase this one and tunne it to the hbase usage.
>
>
> Thanks
>
> -Abdelrahman
>
>
>
>
>
>
> On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <pablo@psafe.com> wrote:
>
>     I am also having this issue and tried a lot of solutions, but
>     could not solve it.
>
>     ]# ulimit -n ** running as root and hdfs (datanode user)
>     32768
>
>     ]# cat /proc/sys/fs/file-nr
>     2080    0    8047008
>
>     ]# lsof | wc -l
>     5157
>
>     Sometimes this issue happens from one node to the same node :(
>
>     I also think this issue is messing with my regionservers which are
>     crashing all day long!!
>
>     Thanks,
>     Pablo
>
>
>     On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>     Hi Varun
>>
>>     I believe is not ulimit issue.
>>
>>
>>     /etc/security/limits.conf
>>     # End of file
>>     *               -      nofile  1000000
>>     *               -      nproc 1000000
>>
>>
>>     please guide me Guys, I want fix this. share your
>>     thoughts DataXceiver error.
>>
>>     Did I learn something today? If not, I wasted it.
>>
>>
>>     On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <varun.uid@gmail.com> wrote:
>>
>>         Hi Dhana,
>>
>>         Increase the ulimit for all the datanodes.
>>
>>         If you are starting the service using hadoop increase the
>>         ulimit value for hadoop user.
>>
>>         Do the  changes in the following file.
>>
>>         */etc/security/limits.conf*
>>
>>         Example:-
>>         *hadoop          soft    nofile    35000*
>>         *hadoop          hard    nofile    35000*
>>
>>         Regards,
>>         Varun Kumar.P
>>
>>         On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>>         <bugcy013@gmail.com> wrote:
>>
>>             Hi Guys
>>
>>             I am frequently getting is error in my Data nodes.
>>
>>             Please guide what is the exact problem this.
>>
>>             dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>>
>>
>>
>>             java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>>             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>>             at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>             at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>             at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>>             at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>             at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>             at java.lang.Thread.run(Thread.java:662)
>>
>>             dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531  <http://172.16.30.138:50531>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>>
>>
>>
>>             java.io.EOFException: while trying to read 65563 bytes
>>
>>
>>             at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>>             at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>>             at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>>             at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>>             at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>             at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>             at java.lang.Thread.run(Thread.java:662)
>>
>>
>>
>>             How to resolve this.
>>
>>             -Dhanasekaran.
>>
>>             Did I learn something today? If not, I wasted it.
>>
>>             -- 
>>
>>
>>
>>
>>
>>
>>         -- 
>>         Regards,
>>         Varun Kumar.P
>>
>>
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
This variable was already set:
<property>
   <name>dfs.datanode.max.xcievers</name>
   <value>4096</value>
   <final>true</final>
</property>

Should I increase it more?

The same error is happening every 5-8 minutes on datanode 172.17.2.18.

2013-03-10 15:26:42,818 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation  src: 
/172.17.2.18:46422 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for 
channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 
remote=/172.17.2.18:46422]


]$ lsof | wc -l
2393

]$ lsof | grep hbase | wc -l
4

]$ lsof | grep hdfs | wc -l
322

]$ lsof | grep hadoop | wc -l
162

]$ cat /proc/sys/fs/file-nr
4416    0    7327615

]$ date
Sun Mar 10 15:31:47 BRT 2013


What can be the causes? How could I extract more info about the error?

Thanks,
Pablo
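
For what it's worth, the 480000 ms in that trace matches the DataNode's
default write timeout, so besides the xceiver count another knob worth
checking is the pair of socket timeouts below. This is only a minimal
hdfs-site.xml sketch with illustrative values, assuming a CDH4-era
release; property names and defaults can differ between versions.

<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
  <description>Default is 480000 ms (8 minutes); raising it only papers
  over whatever is making the peer slow, it does not fix it.</description>
</property>
<property>
  <name>dfs.client.socket-timeout</name>
  <value>120000</value>
  <description>Read timeout used by clients and inter-datanode transfers;
  older releases spell it dfs.socket.timeout.</description>
</property>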


On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
> Hi,
>
> If all of the # of open files limit ( hbase , and hdfs : users ) are 
> set to more than 30 K. Please change the dfs.datanode.max.xcievers to 
> more than the value below.
>
> <property>
>
>    <name>dfs.datanode.max.xcievers</name>
>
>    <value>2096</value>
>
>        <description>PRIVATE CONFIG VARIABLE</description>
>
>              </property>
>
> Try to increase this one and tunne it to the hbase usage.
>
>
> Thanks
>
> -Abdelrahman
>
>
>
>
>
>
> On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <pablo@psafe.com 
> <ma...@psafe.com>> wrote:
>
>     I am also having this issue and tried a lot of solutions, but
>     could not solve it.
>
>     ]# ulimit -n ** running as root and hdfs (datanode user)
>     32768
>
>     ]# cat /proc/sys/fs/file-nr
>     2080    0    8047008
>
>     ]# lsof | wc -l
>     5157
>
>     Sometimes this issue happens from one node to the same node :(
>
>     I also think this issue is messing with my regionservers which are
>     crashing all day long!!
>
>     Thanks,
>     Pablo
>
>
>     On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>     Hi Varun
>>
>>     I believe is not ulimit issue.
>>
>>
>>     /etc/security/limits.conf
>>     # End of file
>>     *               -      nofile  1000000
>>     *               -      nproc 1000000
>>
>>
>>     please guide me Guys, I want fix this. share your
>>     thoughts DataXceiver error.
>>
>>     Did I learn something today? If not, I wasted it.
>>
>>
>>     On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <varun.uid@gmail.com
>>     <ma...@gmail.com>> wrote:
>>
>>         Hi Dhana,
>>
>>         Increase the ulimit for all the datanodes.
>>
>>         If you are starting the service using hadoop increase the
>>         ulimit value for hadoop user.
>>
>>         Do the  changes in the following file.
>>
>>         */etc/security/limits.conf*
>>
>>         Example:-
>>         *hadoop          soft    nofile    35000*
>>         *hadoop          hard    nofile    35000*
>>
>>         Regards,
>>         Varun Kumar.P
>>
>>         On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>>         <bugcy013@gmail.com <ma...@gmail.com>> wrote:
>>
>>             Hi Guys
>>
>>             I am frequently getting is error in my Data nodes.
>>
>>             Please guide what is the exact problem this.
>>
>>             dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373  <http://172.16.30.138:50373>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>>
>>
>>
>>             java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280  <http://172.16.30.138:34280>  remote=/172.16.30.140:50010  <http://172.16.30.140:50010>]
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>>             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>>             at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>             at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>             at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>>             at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>             at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>             at java.lang.Thread.run(Thread.java:662)
>>
>>             dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531  <http://172.16.30.138:50531>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>>
>>
>>
>>             java.io.EOFException: while trying to read 65563 bytes
>>
>>
>>             at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>>             at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>>             at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>>             at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>>             at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>
>>
>>
>>
>>
>>             at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>             at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>             at java.lang.Thread.run(Thread.java:662)
>>
>>
>>
>>             How to resolve this.
>>
>>             -Dhanasekaran.
>>
>>             Did I learn something today? If not, I wasted it.
>>
>>             -- 
>>
>>
>>
>>
>>
>>
>>         -- 
>>         Regards,
>>         Varun Kumar.P
>>
>>
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Abdelrahman Shettia <as...@hortonworks.com>.
Hi,

If the open-files limits for the hbase and hdfs users are already set to
more than 30K, then increase dfs.datanode.max.xcievers beyond the value
below.

<property>

   <name>dfs.datanode.max.xcievers</name>

   <value>2096</value>

       <description>PRIVATE CONFIG VARIABLE</description>

             </property>

Try increasing this one and tuning it to your HBase usage.


Thanks

-Abdelrahman
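
As a side note, newer Hadoop releases spell this setting
dfs.datanode.max.transfer.threads, keeping the old xcievers name only as
a deprecated alias. A minimal hdfs-site.xml sketch, assuming such a
release and using 4096 purely as an illustrative starting point for an
HBase-backed cluster:

<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>4096</value>
  <description>Successor to dfs.datanode.max.xcievers; bounds the number
  of concurrent DataXceiver threads per DataNode.</description>
</property>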






On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <pa...@psafe.com> wrote:

>  I am also having this issue and tried a lot of solutions, but could not
> solve it.
>
> ]# ulimit -n ** running as root and hdfs (datanode user)
> 32768
>
> ]# cat /proc/sys/fs/file-nr
> 2080    0    8047008
>
> ]# lsof | wc -l
> 5157
>
> Sometimes this issue happens from one node to the same node :(
>
> I also think this issue is messing with my regionservers which are
> crashing all day long!!
>
> Thanks,
> Pablo
>
>
> On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>
> Hi Varun
>
>  I believe is not ulimit issue.
>
>
>  /etc/security/limits.conf
>  # End of file
> *               -      nofile          1000000
> *               -      nproc           1000000
>
>
>  please guide me Guys, I want fix this. share your thoughts DataXceiver
> error.
>
> Did I learn something today? If not, I wasted it.
>
>
> On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <va...@gmail.com> wrote:
>
>> Hi Dhana,
>>
>>  Increase the ulimit for all the datanodes.
>>
>>  If you are starting the service using hadoop increase the ulimit value
>> for hadoop user.
>>
>>  Do the  changes in the following file.
>>
>>  */etc/security/limits.conf*
>>
>>  Example:-
>> *hadoop          soft    nofile          35000*
>> *hadoop          hard    nofile          35000*
>>
>>  Regards,
>> Varun Kumar.P
>>
>>  On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan <
>> bugcy013@gmail.com> wrote:
>>
>>>   Hi Guys
>>>
>>>  I am frequently getting is error in my Data nodes.
>>>
>>>  Please guide what is the exact problem this.
>>>
>>>  dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>>>
>>>
>>>
>>> java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>>>
>>>
>>>
>>>
>>>
>>> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>>
>>>
>>>
>>>
>>>
>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>>> at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>> at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>> at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>>
>>>
>>>
>>>
>>>
>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>
>>>
>>>
>>>
>>>
>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>> at java.lang.Thread.run(Thread.java:662)
>>>
>>>  dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531 dest: /172.16.30.138:50010
>>>
>>>
>>>
>>> java.io.EOFException: while trying to read 65563 bytes
>>>
>>>
>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>>
>>>
>>>
>>>
>>>
>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>
>>>
>>>
>>>
>>>
>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>> at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>>
>>>  How to resolve this.
>>>
>>>  -Dhanasekaran.
>>>
>>>  Did I learn something today? If not, I wasted it.
>>>
>>>    --
>>>
>>>
>>>
>>>
>>
>>
>>
>>  --
>> Regards,
>> Varun Kumar.P
>>
>
>
>

Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
I am also having this issue and tried a lot of solutions, but could not 
solve it.

]# ulimit -n ** running as root and hdfs (datanode user)
32768

]# cat /proc/sys/fs/file-nr
2080    0    8047008

]# lsof | wc -l
5157

Sometimes this issue happens from one node to the same node :(

I also think this issue is messing with my regionservers, which are
crashing all day long!!

Thanks,
Pablo
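
Since the global file-descriptor numbers above look healthy, it may help
to look at the limits and threads of the DataNode process itself. A rough
sketch of such checks, assuming the DataNode runs as the hdfs user on the
default port 50010:

DN_PID=$(pgrep -u hdfs -f 'datanode.DataNode' | head -n1)
grep 'open files' /proc/$DN_PID/limits                    # limit the running daemon actually has
lsof -p $DN_PID | wc -l                                   # descriptors held by the DataNode alone
sudo -u hdfs jstack $DN_PID | grep -c DataXceiver         # active xceiver threads vs. the configured maximum
ss -tan state established '( sport = :50010 )' | wc -l    # established connections on the data transfer port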

On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
> Hi Varun
>
> I believe is not ulimit issue.
>
>
> /etc/security/limits.conf
> # End of file
> *               -      nofile          1000000
> *               -      nproc           1000000
>
>
> please guide me Guys, I want fix this. share your thoughts DataXceiver 
> error.
>
> Did I learn something today? If not, I wasted it.
>
>
> On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <varun.uid@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     Hi Dhana,
>
>     Increase the ulimit for all the datanodes.
>
>     If you are starting the service using hadoop increase the ulimit
>     value for hadoop user.
>
>     Do the  changes in the following file.
>
>     */etc/security/limits.conf*
>
>     Example:-
>     *hadoop          soft    nofile          35000*
>     *hadoop          hard    nofile          35000*
>
>     Regards,
>     Varun Kumar.P
>
>     On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>     <bugcy013@gmail.com <ma...@gmail.com>> wrote:
>
>         Hi Guys
>
>         I am frequently getting is error in my Data nodes.
>
>         Please guide what is the exact problem this.
>
>         dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373  <http://172.16.30.138:50373>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>
>
>
>         java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280  <http://172.16.30.138:34280>  remote=/172.16.30.140:50010  <http://172.16.30.140:50010>]
>
>
>
>
>
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>
>
>
>
>
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>         at java.io.FilterInputStream.read(FilterInputStream.java:66)
>         at java.io.FilterInputStream.read(FilterInputStream.java:66)
>         at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>         dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531  <http://172.16.30.138:50531>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>
>
>
>         java.io.EOFException: while trying to read 65563 bytes
>
>
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>
>
>         How to resolve this.
>
>         -Dhanasekaran.
>
>         Did I learn something today? If not, I wasted it.
>
>         -- 
>
>
>
>
>
>
>     -- 
>     Regards,
>     Varun Kumar.P
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
I am also having this issue and tried a lot of solutions, but could not 
solve it.

]# ulimit -n ** running as root and hdfs (datanode user)
32768

]# cat /proc/sys/fs/file-nr
2080    0    8047008

]# lsof | wc -l
5157

Sometimes this issue happens from one node to the same node :(

I also think this issue is messing with my regionservers which are 
crashing all day long!!

Thanks,
Pablo

On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
> Hi Varun
>
> I believe is not ulimit issue.
>
>
> /etc/security/limits.conf
> # End of file
> *               -      nofile          1000000
> *               -      nproc           1000000
>
>
> please guide me Guys, I want fix this. share your thoughts DataXceiver 
> error.
>
> Did I learn something today? If not, I wasted it.
>
>
> On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <varun.uid@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     Hi Dhana,
>
>     Increase the ulimit for all the datanodes.
>
>     If you are starting the service using hadoop increase the ulimit
>     value for hadoop user.
>
>     Do the  changes in the following file.
>
>     */etc/security/limits.conf*
>
>     Example:-
>     *hadoop          soft    nofile          35000*
>     *hadoop          hard    nofile          35000*
>
>     Regards,
>     Varun Kumar.P
>
>     On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>     <bugcy013@gmail.com <ma...@gmail.com>> wrote:
>
>         Hi Guys
>
>         I am frequently getting is error in my Data nodes.
>
>         Please guide what is the exact problem this.
>
>         dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373  <http://172.16.30.138:50373>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>
>
>
>         java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280  <http://172.16.30.138:34280>  remote=/172.16.30.140:50010  <http://172.16.30.140:50010>]
>
>
>
>
>
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>
>
>
>
>
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>         at java.io.FilterInputStream.read(FilterInputStream.java:66)
>         at java.io.FilterInputStream.read(FilterInputStream.java:66)
>         at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>         dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531  <http://172.16.30.138:50531>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>
>
>
>         java.io.EOFException: while trying to read 65563 bytes
>
>
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>
>
>         How to resolve this.
>
>         -Dhanasekaran.
>
>         Did I learn something today? If not, I wasted it.
>
>         -- 
>
>
>
>
>
>
>     -- 
>     Regards,
>     Varun Kumar.P
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
I am also having this issue and tried a lot of solutions, but could not 
solve it.

]# ulimit -n ** running as root and hdfs (datanode user)
32768

]# cat /proc/sys/fs/file-nr
2080    0    8047008

]# lsof | wc -l
5157

Sometimes this issue happens from one node to the same node :(

I also think this issue is messing with my regionservers which are 
crashing all day long!!

Thanks,
Pablo

On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
> Hi Varun
>
> I believe is not ulimit issue.
>
>
> /etc/security/limits.conf
> # End of file
> *               -      nofile          1000000
> *               -      nproc           1000000
>
>
> please guide me Guys, I want fix this. share your thoughts DataXceiver 
> error.
>
> Did I learn something today? If not, I wasted it.
>
>
> On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <varun.uid@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     Hi Dhana,
>
>     Increase the ulimit for all the datanodes.
>
>     If you are starting the service using hadoop increase the ulimit
>     value for hadoop user.
>
>     Do the  changes in the following file.
>
>     */etc/security/limits.conf*
>
>     Example:-
>     *hadoop          soft    nofile          35000*
>     *hadoop          hard    nofile          35000*
>
>     Regards,
>     Varun Kumar.P
>
>     On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>     <bugcy013@gmail.com <ma...@gmail.com>> wrote:
>
>         Hi Guys
>
>         I am frequently getting is error in my Data nodes.
>
>         Please guide what is the exact problem this.
>
>         dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373  <http://172.16.30.138:50373>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>
>
>
>         java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280  <http://172.16.30.138:34280>  remote=/172.16.30.140:50010  <http://172.16.30.140:50010>]
>
>
>
>
>
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>
>
>
>
>
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>         at java.io.FilterInputStream.read(FilterInputStream.java:66)
>         at java.io.FilterInputStream.read(FilterInputStream.java:66)
>         at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>         dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531  <http://172.16.30.138:50531>  dest: /172.16.30.138:50010  <http://172.16.30.138:50010>
>
>
>
>         java.io.EOFException: while trying to read 65563 bytes
>
>
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>
>
>
>
>
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>
>
>         How to resolve this.
>
>         -Dhanasekaran.
>
>         Did I learn something today? If not, I wasted it.
>
>         -- 
>
>
>
>
>
>
>     -- 
>     Regards,
>     Varun Kumar.P
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Pablo Musa <pa...@psafe.com>.
I am also having this issue and tried a lot of solutions, but could not 
solve it.

]# ulimit -n ** running as root and hdfs (datanode user)
32768

]# cat /proc/sys/fs/file-nr
2080    0    8047008

]# lsof | wc -l
5157

Sometimes this issue happens from one node to the same node :(

I also think this issue is messing with my regionservers which are 
crashing all day long!!

Thanks,
Pablo

On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
> Hi Varun
>
> I believe is not ulimit issue.
>
>
> /etc/security/limits.conf
> # End of file
> *               -      nofile          1000000
> *               -      nproc           1000000
>
>
> please guide me Guys, I want fix this. share your thoughts DataXceiver 
> error.
>
> Did I learn something today? If not, I wasted it.
>
>
> On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <varun.uid@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     Hi Dhana,
>
>     Increase the ulimit for all the datanodes.
>
>     If you are starting the service using hadoop increase the ulimit
>     value for hadoop user.
>
>     Do the  changes in the following file.
>
>     */etc/security/limits.conf*
>
>     Example:-
>     *hadoop          soft    nofile          35000*
>     *hadoop          hard    nofile          35000*
>
>     Regards,
>     Varun Kumar.P
>
>     On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>     <bugcy013@gmail.com <ma...@gmail.com>> wrote:
>
>         Hi Guys
>
>         I am frequently getting is error in my Data nodes.
>
>         Please guide what is the exact problem this.
>
>         dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>         java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>         at java.io.FilterInputStream.read(FilterInputStream.java:66)
>         at java.io.FilterInputStream.read(FilterInputStream.java:66)
>         at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>         at java.lang.Thread.run(Thread.java:662)
>
>         dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531 dest: /172.16.30.138:50010
>         java.io.EOFException: while trying to read 65563 bytes
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>         at java.lang.Thread.run(Thread.java:662)
>
>         How to resolve this.
>
>         -Dhanasekaran.
>
>         Did I learn something today? If not, I wasted it.
>
>         -- 
>
>
>
>
>
>
>     -- 
>     Regards,
>     Varun Kumar.P
>
>


Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by Dhanasekaran Anbalagan <bu...@gmail.com>.
Hi Varun

I believe this is not a ulimit issue.


/etc/security/limits.conf
# End of file
*               -      nofile          1000000
*               -      nproc           1000000
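
If the file descriptor limit really is not the problem, one thing I plan to
check next is the socket timeouts behind the "70000 millis" message. A rough
sketch of the hdfs-site.xml properties involved (values are only illustrative,
and raising them hides a slow disk or network rather than fixing it; the
defaults are 60000 and 480000 ms, if I recall correctly):

<property>
  <name>dfs.client.socket-timeout</name>
  <value>120000</value>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
</property>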


Please guide me, guys, I want to fix this. Please share your thoughts on this
DataXceiver error.

Did I learn something today? If not, I wasted it.


On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <va...@gmail.com> wrote:

> Hi Dhana,
>
> Increase the ulimit for all the datanodes.
>
> If you are starting the service using hadoop increase the ulimit value for
> hadoop user.
>
> Do the  changes in the following file.
>
> */etc/security/limits.conf*
>
> Example:-
> *hadoop          soft    nofile          35000*
> *hadoop          hard    nofile          35000*
>
> Regards,
> Varun Kumar.P
>
> On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan <bugcy013@gmail.com
> > wrote:
>
>> Hi Guys
>>
>> I am frequently getting is error in my Data nodes.
>>
>> Please guide what is the exact problem this.
>>
>>
>> dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>>
>>
>> java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>>
>>
>>
>>
>> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>
>>
>>
>>
>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>> at java.io.FilterInputStream.read(FilterInputStream.java:66)
>> at java.io.FilterInputStream.read(FilterInputStream.java:66)
>> at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>
>>
>>
>>
>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>
>>
>>
>>
>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>> at java.lang.Thread.run(Thread.java:662)
>>
>>
>>
>> dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531 dest: /172.16.30.138:50010
>>
>>
>> java.io.EOFException: while trying to read 65563 bytes
>>
>>
>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>
>>
>>
>>
>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>
>>
>>
>>
>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>> at java.lang.Thread.run(Thread.java:662)
>>
>>
>>
>>
>> How to resolve this.
>>
>> -Dhanasekaran.
>>
>> Did I learn something today? If not, I wasted it.
>>
>>  --
>>
>>
>>
>>
>
>
>
> --
> Regards,
> Varun Kumar.P
>

Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010

Posted by varun kumar <va...@gmail.com>.
Hi Dhana,

Increase the ulimit for all the datanodes.

If you are starting the service as the hadoop user, increase the ulimit value
for the hadoop user.

Make the changes in the following file.

*/etc/security/limits.conf*

Example:-
*hadoop          soft    nofile          35000*
*hadoop          hard    nofile          35000*
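
As a quick sanity check (a sketch only, assuming pam_limits is enabled for su
and that the datanode is restarted from a fresh session after the change):

# the new soft and hard nofile limits should show up in a fresh login shell
su - hadoop -c 'ulimit -Sn; ulimit -Hn'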

Regards,
Varun Kumar.P

On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
<bu...@gmail.com> wrote:

> Hi Guys
>
> I am frequently getting is error in my Data nodes.
>
> Please guide what is the exact problem this.
>
> dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
> java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>
>
> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>
>
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
> at java.io.FilterInputStream.read(FilterInputStream.java:66)
> at java.io.FilterInputStream.read(FilterInputStream.java:66)
> at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>
>
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>
>
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
> at java.lang.Thread.run(Thread.java:662)
>
>
> dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531 dest: /172.16.30.138:50010
> java.io.EOFException: while trying to read 65563 bytes
>
>
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>
>
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>
>
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
> at java.lang.Thread.run(Thread.java:662)
>
>
>
>
> How to resolve this.
>
> -Dhanasekaran.
>
> Did I learn something today? If not, I wasted it.
>
>  --
>
>
>
>



-- 
Regards,
Varun Kumar.P
