Posted to hdfs-dev@hadoop.apache.org by "zhaoyunjiong (JIRA)" <ji...@apache.org> on 2014/07/01 09:09:24 UTC

[jira] [Created] (HDFS-6616) bestNode shouldn't always return the first DataNode

zhaoyunjiong created HDFS-6616:
----------------------------------

             Summary: bestNode shouldn't always return the first DataNode
                 Key: HDFS-6616
                 URL: https://issues.apache.org/jira/browse/HDFS-6616
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: zhaoyunjiong
            Assignee: zhaoyunjiong
            Priority: Minor


When running distcp between clusters, the job failed:
2014-06-30 20:56:28,430 INFO org.apache.hadoop.tools.DistCp: FAIL part-r-00101.avro : java.net.NoRouteToHostException: No route to host
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
	at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:322)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:419)
	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

The root cause is that one of the DataNodes cannot be reached from outside the cluster, although from inside the cluster it is healthy.
In NamenodeWebHdfsMethods.java, bestNode always returns the first DataNode, so the client is redirected to the same unreachable node on every attempt and the copy still fails even after distcp retries.
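
To illustrate why retries do not help, here is a minimal sketch of the two selection strategies. It is not the actual Hadoop code: the class BestNodeSketch, the methods firstNode and randomNode, and the host names dn1..dn3 are all hypothetical, and random selection is only one possible fix, assumed here for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the node selection described in this report.
// DataNodes are modeled as plain host-name strings.
public class BestNodeSketch {

    // Behavior described in the report: always return the first candidate,
    // so every retry is sent to the same node.
    public static String firstNode(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("No active nodes");
        }
        return nodes.get(0);
    }

    // Assumed alternative: pick a candidate uniformly at random, so a retry
    // has a chance of landing on a different, reachable node.
    public static String randomNode(List<String> nodes, Random rng) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("No active nodes");
        }
        return nodes.get(rng.nextInt(nodes.size()));
    }

    public static void main(String[] args) {
        // Hypothetical replica locations; suppose dn1 is unreachable from outside.
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3");
        Random rng = new Random();
        for (int attempt = 1; attempt <= 5; attempt++) {
            System.out.println("attempt " + attempt
                + ": first=" + firstNode(nodes)
                + " random=" + randomNode(nodes, rng));
        }
    }
}
```

With firstNode every attempt is redirected to dn1, while randomNode spreads attempts across all replicas, so a retry can succeed as long as at least one replica is reachable from the client.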


