Posted to hdfs-dev@hadoop.apache.org by "Juan Yu (JIRA)" <ji...@apache.org> on 2016/05/25 22:31:12 UTC
[jira] [Created] (HDFS-10466) DistributedFileSystem.listLocatedStatus() should return HdfsBlockLocation instead of BlockLocation
Juan Yu created HDFS-10466:
------------------------------
Summary: DistributedFileSystem.listLocatedStatus() should return HdfsBlockLocation instead of BlockLocation
Key: HDFS-10466
URL: https://issues.apache.org/jira/browse/HDFS-10466
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
https://issues.apache.org/jira/browse/HDFS-202 added a new API, listLocatedStatus(), which returns the status of every file in a directory together with its block locations. This is a great improvement: we no longer need to call FileSystem.getFileBlockLocations() for each file, and it is much faster (about 8-10 times).
However, the returned LocatedFileStatus contains only basic BlockLocation objects instead of HdfsBlockLocation, so the LocatedBlock details are stripped out.
It should behave like DFSClient.getBlockLocations() and return HdfsBlockLocation, which provides full block location details.
The implementation of DistributedFileSystem.listLocatedStatus() retrieves an HdfsLocatedFileStatus, which contains all of this information, but the conversion to LocatedFileStatus does not keep the LocatedBlock data. Keeping the LocatedBlock details is a simple (and compatible) change.
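To illustrate the idea, here is a minimal, self-contained Java sketch. It does not use the real Hadoop classes: LocatedBlock, BlockLocation and HdfsBlockLocation below are simplified stand-ins that only mirror the relationship described above (HdfsBlockLocation is a BlockLocation that also carries its LocatedBlock). The class name LocatedStatusSketch, the field names, and the helper methods toBasicLocations() and toHdfsLocations() are hypothetical and exist only for this example; the actual patch may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

public class LocatedStatusSketch {

    /** Simplified stand-in for HDFS's LocatedBlock; the fields are illustrative only. */
    static final class LocatedBlock {
        final long blockId;
        final String[] hosts;
        final long offset;
        final long length;

        LocatedBlock(long blockId, String[] hosts, long offset, long length) {
            this.blockId = blockId;
            this.hosts = hosts;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Simplified stand-in for the basic BlockLocation: hosts, offset and length only. */
    static class BlockLocation {
        final String[] hosts;
        final long offset;
        final long length;

        BlockLocation(String[] hosts, long offset, long length) {
            this.hosts = hosts;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Simplified stand-in for HdfsBlockLocation: a BlockLocation that keeps its LocatedBlock. */
    static final class HdfsBlockLocation extends BlockLocation {
        private final LocatedBlock block;

        HdfsBlockLocation(LocatedBlock block) {
            super(block.hosts, block.offset, block.length);
            this.block = block;
        }

        LocatedBlock getLocatedBlock() {
            return block;
        }
    }

    /** Behavior described in the report: only the basic fields are copied. */
    static List<BlockLocation> toBasicLocations(List<LocatedBlock> blocks) {
        List<BlockLocation> out = new ArrayList<>();
        for (LocatedBlock b : blocks) {
            out.add(new BlockLocation(b.hosts, b.offset, b.length));
        }
        return out;
    }

    /** Proposed behavior: the declared element type is unchanged, but the LocatedBlock is kept. */
    static List<BlockLocation> toHdfsLocations(List<LocatedBlock> blocks) {
        List<BlockLocation> out = new ArrayList<>();
        for (LocatedBlock b : blocks) {
            out.add(new HdfsBlockLocation(b));
        }
        return out;
    }

    public static void main(String[] args) {
        List<LocatedBlock> blocks = new ArrayList<>();
        blocks.add(new LocatedBlock(1001L, new String[] {"dn1", "dn2"}, 0L, 128L));

        // Callers that only know BlockLocation keep working with either conversion.
        for (BlockLocation loc : toHdfsLocations(blocks)) {
            System.out.println("offset=" + loc.offset + " length=" + loc.length);
            // Callers that need the full details can downcast when it is available.
            if (loc instanceof HdfsBlockLocation) {
                LocatedBlock lb = ((HdfsBlockLocation) loc).getLocatedBlock();
                System.out.println("blockId=" + lb.blockId);
            }
        }
    }
}
```

Because the subclass is returned through the same BlockLocation type, existing callers are unaffected, which is why the change can be described as compatible; only callers that explicitly check for HdfsBlockLocation see the extra LocatedBlock details.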
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)