Posted to hdfs-dev@hadoop.apache.org by "Juan Yu (JIRA)" <ji...@apache.org> on 2016/05/25 22:31:12 UTC

[jira] [Created] (HDFS-10466) DistributedFileSystem.listLocatedStatus() should return HdfsBlockLocation instead of BlockLocation

Juan Yu created HDFS-10466:
------------------------------

             Summary: DistributedFileSystem.listLocatedStatus() should return HdfsBlockLocation instead of BlockLocation
                 Key: HDFS-10466
                 URL: https://issues.apache.org/jira/browse/HDFS-10466
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs
            Reporter: Juan Yu
            Assignee: Juan Yu
            Priority: Minor


HDFS-202 (https://issues.apache.org/jira/browse/HDFS-202) added a new API, listLocatedStatus(), which returns the status of every file in a directory together with its block locations. This is great because we no longer need to call FileSystem.getFileBlockLocations() for each file, and it is much faster (about 8-10 times).
However, the returned LocatedFileStatus only contains basic BlockLocation objects instead of HdfsBlockLocation, so the LocatedBlock details are stripped out.

It should behave like DFSClient.getBlockLocations() and return HdfsBlockLocation, which provides the full block location details.

The implementation of DistributedFileSystem.listLocatedStatus() retrieves an HdfsLocatedFileStatus, which contains all of this information, but the LocatedBlock data is not kept when it is converted to a LocatedFileStatus. Keeping the LocatedBlock details is a simple (and compatible) change.
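
To illustrate why such a change could stay compatible, here is a minimal, self-contained sketch. The classes below are simplified stand-ins that only borrow the names used in this report; they are not the real Hadoop classes, and their fields and constructors are assumptions made for the example. The helper names toBasicLocations() and toHdfsLocations() are hypothetical as well. The point is only that a conversion can hand back the subclass while still returning a BlockLocation[], so existing callers keep working and new callers can downcast to reach the LocatedBlock.

```java
import java.util.List;

// Simplified stand-ins for illustration only. They borrow the names used in
// this report but are NOT the real Hadoop classes.
class LocatedBlock {
    final long blockId;
    final long offset;
    final long length;
    final String[] hosts;

    LocatedBlock(long blockId, long offset, long length, String[] hosts) {
        this.blockId = blockId;
        this.offset = offset;
        this.length = length;
        this.hosts = hosts;
    }
}

class BlockLocation {
    final String[] hosts;
    final long offset;
    final long length;

    BlockLocation(String[] hosts, long offset, long length) {
        this.hosts = hosts;
        this.offset = offset;
        this.length = length;
    }
}

// Subclass that keeps a reference to the block it was built from.
class HdfsBlockLocation extends BlockLocation {
    private final LocatedBlock block;

    HdfsBlockLocation(LocatedBlock block) {
        super(block.hosts, block.offset, block.length);
        this.block = block;
    }

    LocatedBlock getLocatedBlock() {
        return block;
    }
}

class LocatedStatusSketch {
    // Conversion that copies only the basic fields; the block details are lost.
    static BlockLocation[] toBasicLocations(List<LocatedBlock> blocks) {
        BlockLocation[] out = new BlockLocation[blocks.size()];
        for (int i = 0; i < out.length; i++) {
            LocatedBlock b = blocks.get(i);
            out[i] = new BlockLocation(b.hosts, b.offset, b.length);
        }
        return out;
    }

    // Same return type, but each element is the subclass, so the
    // originating LocatedBlock stays reachable.
    static BlockLocation[] toHdfsLocations(List<LocatedBlock> blocks) {
        BlockLocation[] out = new BlockLocation[blocks.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = new HdfsBlockLocation(blocks.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<LocatedBlock> blocks =
            List.of(new LocatedBlock(42L, 0L, 128L, new String[] {"dn1"}));
        for (BlockLocation loc : toHdfsLocations(blocks)) {
            // Existing callers only touch the base-class fields.
            System.out.println(loc.offset + "+" + loc.length + " on " + loc.hosts[0]);
            // New callers can opt in to the extra details.
            if (loc instanceof HdfsBlockLocation) {
                LocatedBlock lb = ((HdfsBlockLocation) loc).getLocatedBlock();
                System.out.println("block id " + lb.blockId);
            }
        }
    }
}
```

Because the declared return type does not change in this sketch, code written against the base class compiles and runs unmodified; only callers that want the extra details need the instanceof check.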



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
