Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2009/04/07 20:06:13 UTC

[jira] Resolved: (HADOOP-1938) NameNode.create failed

     [ https://issues.apache.org/jira/browse/HADOOP-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang resolved HADOOP-1938.
-----------------------------------

    Resolution: Duplicate

HADOOP-3810 is resolved.

> NameNode.create failed 
> -----------------------
>
>                 Key: HADOOP-1938
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1938
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.13.1
>            Reporter: Runping Qi
>
> Under heavy load, the DFS namenode fails to create a file:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: Failed to create file /xxx/xxx/_task_0001_r_000001_0/part-00001 on client xxx.xxx.xxx.xxx because there were not enough datanodes available. Found 0 datanodes but MIN_REPLICATION for the cluster is configured to be 1.
> 	at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:651)
> 	at org.apache.hadoop.dfs.NameNode.create(NameNode.java:294)
> 	at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:341)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:573)
> The above problem occurred when I ran a well-tuned map/reduce program on a hood node cluster.
> The program is well tuned in the sense that the map output data are evenly partitioned among 180 reducers.
> Shuffling and sorting completed at about the same time on all the reducers.
> The reducers started their reduce work at about the same time and were expected to produce about the same amount of output (2GB).
> This "synchronized" behavior caused the reducers to try to create their output DFS files at about the same time.
> The namenode seemed to have difficulty handling that situation, leaving the reducers waiting on file creation for long periods of time.
> Eventually, they failed with the above exception.
>  
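
The report describes many reducers requesting file creation at the same instant. One common client-side way to soften that kind of synchronized burst is to retry the failing call with randomized ("jittered") exponential backoff, so that retries from different clients spread out over time. The sketch below is only an illustration under that assumption. It is not the fix applied in HADOOP-3810, and JitteredRetry, IoAction, retry and backoffCeiling are hypothetical names that are not part of Hadoop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper (not a Hadoop class): retries an I/O action with jittered backoff. */
final class JitteredRetry {

    /** An action that may fail with an IOException, e.g. creating an output file. */
    @FunctionalInterface
    interface IoAction<T> {
        T run() throws IOException;
    }

    private JitteredRetry() {}

    /**
     * Upper bound of the sleep before the retry that follows failed attempt
     * number {@code attempt} (0-based): baseMillis * 2^attempt, capped at maxMillis.
     * Assumes baseMillis and maxMillis are non-negative and of moderate size.
     */
    static long backoffCeiling(int attempt, long baseMillis, long maxMillis) {
        long ceiling = baseMillis;
        for (int i = 0; i < attempt && ceiling < maxMillis; i++) {
            ceiling *= 2;
        }
        return Math.min(ceiling, maxMillis);
    }

    /**
     * Runs the action up to maxAttempts times. After each failure except the last,
     * sleeps for a random time in [0, ceiling] so that clients which failed
     * together do not all retry together.
     */
    static <T> T retry(IoAction<T> action, int maxAttempts, long baseMillis, long maxMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.run();
            } catch (IOException e) {
                last = e; // remember the failure and fall through to the backoff
            }
            if (attempt + 1 < maxAttempts) {
                long ceiling = backoffCeiling(attempt, baseMillis, maxMillis);
                long sleepMillis = ThreadLocalRandom.current().nextLong(ceiling + 1);
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
            }
        }
        throw new UncheckedIOException("gave up after " + maxAttempts + " attempts", last);
    }
}
```

A reducer-side caller could wrap its own file-creation call in retry(...). The attempt count and delay bounds would need tuning per cluster, and backoff of this kind only spreads out the load. It does not help if the namenode really has no datanodes available.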
