Posted to common-user@hadoop.apache.org by "S.L" <si...@gmail.com> on 2015/05/27 02:00:04 UTC

DataNode Timeout exceptions.

Hi All,

I am on Apache YARN 2.3.0, and lately I have been seeing these exceptions
frequently. Can someone tell me the root cause of this issue?

I have set the following property in mapred-site.xml; is there any other
property that I need to set as well?

    <property>
      <name>mapreduce.task.timeout</name>
      <value>1800000</value>
      <description>
      The timeout value for tasks. I set this because the JVMs might be
      busy with GC, which is causing timeouts in Hadoop tasks.
      </description>
    </property>
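
From what I have read, mapreduce.task.timeout only controls how long a task
may run without reporting progress, so I suspect the socket timeouts in the
trace below are governed by HDFS settings instead. Would something along
these lines be needed in hdfs-site.xml? (The property names are my reading
of the Hadoop 2.x defaults; the values are only a sketch and untested.)

    <property>
      <name>dfs.client.socket-timeout</name>
      <value>120000</value>
      <description>
      Client-side socket read timeout in milliseconds (default 60000). The
      client adds a 5000 ms extension per pipeline node on top of this,
      which would explain the 65000 millis figure in the trace below.
      </description>
    </property>

    <property>
      <name>dfs.datanode.socket.write.timeout</name>
      <value>960000</value>
      <description>
      DataNode socket write timeout in milliseconds (default 480000, i.e.
      8 minutes). Raising it masks slow readers rather than fixing them,
      so it is a stopgap.
      </description>
    </property>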



15/05/26 02:06:53 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor
exception  for block
BP-1751673171-112.123.123.123-1431824104307:blk_1073749395_8571
java.net.SocketTimeoutException: 65000 millis timeout while waiting for
channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/112.123.123.123:35398
remote=/112.123.123.123:50010]
at
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1881)
at
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:116)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:726)
15/05/26 02:06:53 INFO mapreduce.JobSubmitter: Cleaning up the staging area
/tmp/hadoop-yarn/staging/df/.staging/job_1431824165463_0221
15/05/26 02:06:54 WARN security.UserGroupInformation:
PriviledgedActionException as:df (auth:SIMPLE) cause:java.io.IOException:
All datanodes 112.123.123.123:50010 are bad. Aborting...
15/05/26 02:06:54 WARN security.UserGroupInformation:
PriviledgedActionException as:df (auth:SIMPLE) cause:java.io.IOException:
All datanodes 112.123.123.123:50010 are bad. Aborting...
Exception in thread "main" java.io.IOException: All datanodes
112.123.123.123:50010 are bad. Aborting...
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1023)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:838)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:483)

Re: DataNode Timeout exceptions.

Posted by "S.L" <si...@gmail.com>.
Hi Ted, I have only 3 DataNodes.

When I check the logs, I see the following exception in the DataNode log
and no exceptions in the NameNode log.

Stack trace from the DataNode log:

2015-05-27 10:52:34,741 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56653]
at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:340)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:101)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:745)
2015-05-27 10:52:34,772 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
123.32.23.234:50010, dest: /123.32.23.234:56653, bytes: 1453056, op:
HDFS_READ, cliID:
DFSClient_attempt_1431824165463_0265_m_000002_0_-805582199_1, offset: 0,
srvID: 3eb119a1-b922-4b38-9adf-35074dc88c94, blockid:
BP-1751673171-123.32.23.234-1431824104307:blk_1073750543_9719, duration:
481096638884
2015-05-27 10:52:34,772 WARN
org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(123.32.23.234,
datanodeUuid=3eb119a1-b922-4b38-9adf-35074dc88c94, infoPort=50075,
ipcPort=50020,
storageInfo=lv=-51;cid=CID-f3f9b2dc-893a-45f3-8bac-54fe5d77acfc;nsid=1583960326;c=0):Got
exception while serving
BP-1751673171-123.32.23.234-1431824104307:blk_1073750543_9719 to /
123.32.23.234:56653
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56653]
at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:340)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:101)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:745)
2015-05-27 10:52:34,772 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
server1.dealyaft.com:50010:DataXceiver
error processing READ_BLOCK operation  src: /123.32.23.234:56653 dest: /
123.32.23.234:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56653]
at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:340)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:101)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:745)
2015-05-27 10:52:35,890 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56655]
at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:340)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:101)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
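
One detail that stands out to me: the clienttrace line above reports
duration: 481096638884, which is in nanoseconds, so 481096638884 ns is
roughly 481 s, almost exactly the 480000 millis (480 s) write timeout. In
other words, the DataNode waited the full eight minutes for the client to
drain the read before giving up, which makes me suspect the client side
(long GC pauses?) rather than the DataNode itself.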

On Tue, May 26, 2015 at 8:29 PM, Ted Yu <yu...@gmail.com> wrote:

> bq. All datanodes 112.123.123.123:50010 are bad. Aborting...
>
> How many DataNodes do you have?
>
> Can you check the DataNode and NameNode logs?
>
> Cheers
>
> On Tue, May 26, 2015 at 5:00 PM, S.L <si...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am on Apache YARN 2.3.0, and lately I have been seeing these exceptions
>> frequently. Can someone tell me the root cause of this issue?
>>
>> I have set the following property in mapred-site.xml; is there any other
>> property that I need to set as well?
>>
>>     <property>
>>       <name>mapreduce.task.timeout</name>
>>       <value>1800000</value>
>>       <description>
>>       The timeout value for tasks. I set this because the JVMs might be
>> busy with GC, which is causing timeouts in Hadoop tasks.
>>       </description>
>>     </property>
>>
>>
>>
>> 15/05/26 02:06:53 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor
>> exception  for block
>> BP-1751673171-112.123.123.123-1431824104307:blk_1073749395_8571
>> java.net.SocketTimeoutException: 65000 millis timeout while waiting for
>> channel to be ready for read. ch :
>> java.nio.channels.SocketChannel[connected local=/112.123.123.123:35398
>> remote=/112.123.123.123:50010]
>> at
>> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>> at
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>> at
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>> at
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
>> at java.io.FilterInputStream.read(FilterInputStream.java:83)
>> at java.io.FilterInputStream.read(FilterInputStream.java:83)
>> at
>> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1881)
>> at
>> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:116)
>> at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:726)
>> 15/05/26 02:06:53 INFO mapreduce.JobSubmitter: Cleaning up the staging
>> area /tmp/hadoop-yarn/staging/df/.staging/job_1431824165463_0221
>> 15/05/26 02:06:54 WARN security.UserGroupInformation:
>> PriviledgedActionException as:df (auth:SIMPLE) cause:java.io.IOException:
>> All datanodes 112.123.123.123:50010 are bad. Aborting...
>> 15/05/26 02:06:54 WARN security.UserGroupInformation:
>> PriviledgedActionException as:df (auth:SIMPLE) cause:java.io.IOException:
>> All datanodes 112.123.123.123:50010 are bad. Aborting...
>> Exception in thread "main" java.io.IOException: All datanodes
>> 112.123.123.123:50010 are bad. Aborting...
>> at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1023)
>> at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:838)
>> at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:483)
>>
>>
>>
>>
>

Re: DataNode Timeout exceptions.

Posted by Ted Yu <yu...@gmail.com>.
bq. All datanodes 112.123.123.123:50010 are bad. Aborting...

How many DataNodes do you have?

Can you check the DataNode and NameNode logs?
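
As a quick check, hdfs dfsadmin -report will list the live DataNodes and
flag any dead ones.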

Cheers

On Tue, May 26, 2015 at 5:00 PM, S.L <si...@gmail.com> wrote:

> Hi All,
>
> I am on Apache YARN 2.3.0, and lately I have been seeing these exceptions
> frequently. Can someone tell me the root cause of this issue?
>
> I have set the following property in mapred-site.xml; is there any other
> property that I need to set as well?
>
>     <property>
>       <name>mapreduce.task.timeout</name>
>       <value>1800000</value>
>       <description>
>       The timeout value for tasks. I set this because the JVMs might be
> busy with GC, which is causing timeouts in Hadoop tasks.
>       </description>
>     </property>
>
>
>
> 15/05/26 02:06:53 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor
> exception  for block
> BP-1751673171-112.123.123.123-1431824104307:blk_1073749395_8571
> java.net.SocketTimeoutException: 65000 millis timeout while waiting for
> channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected local=/112.123.123.123:35398
> remote=/112.123.123.123:50010]
> at
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1881)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:116)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:726)
> 15/05/26 02:06:53 INFO mapreduce.JobSubmitter: Cleaning up the staging
> area /tmp/hadoop-yarn/staging/df/.staging/job_1431824165463_0221
> 15/05/26 02:06:54 WARN security.UserGroupInformation:
> PriviledgedActionException as:df (auth:SIMPLE) cause:java.io.IOException:
> All datanodes 112.123.123.123:50010 are bad. Aborting...
> 15/05/26 02:06:54 WARN security.UserGroupInformation:
> PriviledgedActionException as:df (auth:SIMPLE) cause:java.io.IOException:
> All datanodes 112.123.123.123:50010 are bad. Aborting...
> Exception in thread "main" java.io.IOException: All datanodes
> 112.123.123.123:50010 are bad. Aborting...
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1023)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:838)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:483)
>
>
>
>
