Posted to hdfs-dev@hadoop.apache.org by "Hong Tang (JIRA)" <ji...@apache.org> on 2009/11/19 15:16:39 UTC

[jira] Created: (HDFS-778) DistributedFileSystem.getFileBlockLocations() may occasionally return numeric ips as hostnames.

DistributedFileSystem.getFileBlockLocations() may occasionally return numeric ips as hostnames.
-----------------------------------------------------------------------------------------------

                 Key: HDFS-778
                 URL: https://issues.apache.org/jira/browse/HDFS-778
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Hong Tang


DistributedFileSystem.getFileBlockLocations() may occasionally return numeric IP addresses where hostnames are expected. This seems to be a breach of the FileSystem.getFileBlockLocations() contract:
{noformat}
  /**
   * Return an array containing hostnames, offset and size of 
   * portions of the given file.  For a nonexistent 
   * file or regions, null will be returned.
   *
   * This call is most helpful with DFS, where it returns 
   * hostnames of machines that contain the given file.
   *
   * The FileSystem will simply return an elt containing 'localhost'.
   */
  public BlockLocation[] getFileBlockLocations(FileStatus file, 
      long start, long len) throws IOException
{noformat}

One (maybe minor) consequence of this issue: when a job's split locations include such numeric IPs, the JobTracker cannot assign the job's map tasks to nodes local to the file blocks.
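
To illustrate the consequence, here is a minimal sketch that assumes locality is decided by comparing a tracker's hostname verbatim against the split's reported locations. The class name, method name, hostnames and addresses are all hypothetical, and this is not the actual JobTracker code:

```java
import java.util.Arrays;
import java.util.List;

public class LocalityMismatchSketch {
    // Hypothetical stand-in for a scheduler's locality check: a split counts
    // as local to a tracker only if the tracker's hostname appears verbatim
    // among the locations reported for that split.
    public static boolean isLocal(String trackerHost, List<String> splitLocations) {
        return splitLocations.contains(trackerHost);
    }

    public static void main(String[] args) {
        String tracker = "node5.example.com"; // assumed hostname, for illustration only

        // Locations reported as hostnames: the tracker is recognized as local.
        System.out.println(isLocal(tracker,
                Arrays.asList("node5.example.com", "node7.example.com")));

        // Same machine reported by its (assumed) numeric IP: the string
        // comparison fails, so the locality information is lost.
        System.out.println(isLocal(tracker,
                Arrays.asList("10.0.0.5", "node7.example.com")));
    }
}
```

Under that assumption, the first check succeeds and the second fails even though both lists refer to the same machine.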

We should either fix the implementation or change the contract. In the latter case, the JobTracker needs to be fixed to maintain both the hostnames and the IPs of the TaskTrackers.
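
As a rough sketch of the second option, a tracker index could be keyed by both forms of address, so that a split location resolves to the same TaskTracker whether it arrives as a hostname or as a numeric IP. The class, its methods, and the sample names are hypothetical and not part of Hadoop:

```java
import java.util.HashMap;
import java.util.Map;

public class TrackerIndexSketch {
    // Maps every known address form (hostname or numeric IP) to a tracker name.
    private final Map<String, String> trackerByAddress = new HashMap<>();

    // Register a tracker under both its hostname and its IP address.
    public void register(String trackerName, String hostname, String ip) {
        trackerByAddress.put(hostname, trackerName);
        trackerByAddress.put(ip, trackerName);
    }

    // Returns the tracker for a split location, or null if none is known.
    public String lookup(String location) {
        return trackerByAddress.get(location);
    }

    public static void main(String[] args) {
        TrackerIndexSketch index = new TrackerIndexSketch();
        // Assumed tracker name and addresses, for illustration only.
        index.register("tracker_node5", "node5.example.com", "10.0.0.5");

        System.out.println(index.lookup("node5.example.com"));
        System.out.println(index.lookup("10.0.0.5"));
    }
}
```

Both lookups resolve to the same tracker, which is the property the JobTracker would need if the contract were relaxed to allow numeric IPs.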
