You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Christian Kunz (JIRA)" <ji...@apache.org> on 2008/03/01 01:43:51 UTC

[jira] Commented: (HADOOP-2883) Extensive write failures

    [ https://issues.apache.org/jira/browse/HADOOP-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574036#action_12574036 ] 

Christian Kunz commented on HADOOP-2883:
----------------------------------------

The job ran much better with the higher timeout value of  180,000.
Unless we are the only ones having issues with 'all datanodes are bad' I would suggest this as the default value.

> Extensive write failures
> ------------------------
>
>                 Key: HADOOP-2883
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2883
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Christian Kunz
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.16.1
>
>
> With the new release 0.16.0 we experience extensive write failures under heavy load.
> The job shuffles 300TB on 1400 nodes and runs 3 waves of 2500 reducers. Each reducer uses libhdfs to write to around 70 dfs files simultaneously. We did not experience particular write problems up to nightly build #835 on hadoopqa (Jan 28),
> but now with released 0.16.0 (candidate 2) we see a lot of exceptions related to 'all datanodes are bad':
> typical exception(s):
> 08/02/22 10:34:47 WARN fs.DFSClient: Error Recovery for block blk_434406883423887779 in pipeline xxx.xxx.xxx.146:50010, xxx.xxx.xxx.224:50010: bad datanode xxx.xxx.xxx.146:50010
> 08/02/22 10:34:51 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:34:51 WARN fs.DFSClient: Error Recovery for block blk_-1957866292089920792 in pipeline xxx.xxx.xxx.147:50010, xxx.xxx.xxx.10:50010: bad datanode xxx.xxx.xxx.147:50010
> 08/02/22 10:34:54 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:34:54 WARN fs.DFSClient: Error Recovery for block blk_-5265240773298481019 in pipeline xxx.xxx.xxx.152:50010, xxx.xxx.xxx.71:50010: bad datanode xxx.xxx.xxx.152:50010
> 08/02/22 10:34:54 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:34:54 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed outxxx.xxx.xxx.166:50010
> 08/02/22 10:34:55 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:00 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:00 WARN fs.DFSClient: Error Recovery for block blk_8456718220685890569 in pipeline xxx.xxx.xxx.158:50010, xxx.xxx.xxx.225:50010: bad datanode xxx.xxx.xxx.158:50010
> 08/02/22 10:35:00 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:00 WARN fs.DFSClient: Error Recovery for block blk_1420965154382429572 in pipeline xxx.xxx.xxx.169:50010, xxx.xxx.xxx.221:50010: bad datanode xxx.xxx.xxx.169:50010
> 08/02/22 10:35:00 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:00 WARN fs.DFSClient: Error Recovery for block blk_-519424763987472708 in pipeline xxx.xxx.xxx.154:50010, xxx.xxx.xxx.37:50010: bad datanode xxx.xxx.xxx.154:50010
> 08/02/22 10:35:00 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:00 WARN fs.DFSClient: Error Recovery for block blk_-8376556524788296783 in pipeline xxx.xxx.xxx.154:50010, xxx.xxx.xxx.212:50010: bad datanode xxx.xxx.xxx.154:50010
> 08/02/22 10:35:00 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:00 WARN fs.DFSClient: Error Recovery for block blk_-2429564741658530079 in pipeline xxx.xxx.xxx.160:50010, xxx.xxx.xxx.105:50010: bad datanode xxx.xxx.xxx.160:50010
> 08/02/22 10:35:00 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:00 WARN fs.DFSClient: Error Recovery for block blk_-6653210787685458124 in pipeline xxx.xxx.xxx.143:50010, xxx.xxx.xxx.37:50010: bad datanode xxx.xxx.xxx.143:50010
> 08/02/22 10:35:01 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:01 WARN fs.DFSClient: Error Recovery for block blk_7515160028005424426 in pipeline xxx.xxx.xxx.167:50010, xxx.xxx.xxx.152:50010: bad datanode xxx.xxx.xxx.167:50010
> 08/02/22 10:35:03 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:03 WARN fs.DFSClient: Error Recovery for block blk_-7191475898558388503 in pipeline xxx.xxx.xxx.139:50010, xxx.xxx.xxx.6:50010: bad datanode xxx.xxx.xxx.139:50010
> 08/02/22 10:35:03 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:03 WARN fs.DFSClient: Error Recovery for block blk_-340745015348833165 in pipeline xxx.xxx.xxx.141:50010, xxx.xxx.xxx.153:50010: bad datanode xxx.xxx.xxx.141:50010
> 08/02/22 10:35:04 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:04 WARN fs.DFSClient: Error Recovery for block blk_-6861254790596076341 in pipeline xxx.xxx.xxx.157:50010, xxx.xxx.xxx.224:50010: bad datanode xxx.xxx.xxx.157:50010
> 08/02/22 10:35:14 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:14 INFO fs.DFSClient: Abandoning block blk_6188945400680100475
> 08/02/22 10:35:14 INFO fs.DFSClient: Waiting to find target node: xxx.xxx.xxx.161:50010
> 08/02/22 10:35:43 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:47 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:48 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:49 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:49 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:50 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:50 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:50 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:53 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:54 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:57 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:35:57 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:03 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:03 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:03 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:03 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:03 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:03 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:04 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:06 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:06 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> 08/02/22 10:36:07 INFO fs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.io.IOException: All datanodes xxx.xxx.xxx.83:50010 are bad. Aborting...
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:1839)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1100(DFSClient.java:1479)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1571)
> Call to org.apache.hadoop.fs.FSDataOutputStream::write failed!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.