Posted to user@hadoop.apache.org by Amith sha <am...@gmail.com> on 2020/01/01 15:55:52 UTC

Re: Create a block - file map

Enable DEBUG logging for the org.apache.hadoop.hdfs.server.blockmanagement
package on the NameNode.
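
A minimal sketch of one way to do this, assuming the NameNode uses the
log4j 1.x log4j.properties file that ships with Hadoop 2.x. The file
location, and whether a NameNode restart is needed for the change to take
effect, depend on the deployment and are not stated in this thread:

```properties
# Assumption: added to the log4j.properties read by the NameNode process.
# Raises every logger under the blockmanagement package to DEBUG.
log4j.logger.org.apache.hadoop.hdfs.server.blockmanagement=DEBUG
```

DEBUG output from this package can be very verbose on a busy cluster, so
it is probably worth reverting the level once the needed information has
been collected.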

Thanks & Regards
Amithsha


On Wed, Jan 1, 2020 at 4:55 AM Arpit Agarwal <aa...@cloudera.com.invalid>
wrote:

> That is the only way to do it using the client API.
>
> Just curious why you need the mapping.
>
>
> On Tue, Dec 31, 2019, 00:41 Davide Vergari <ve...@gmail.com>
> wrote:
>
>> Hi all,
>> I need to create a block map for all files in a specific directory (and
>> its subdirectories) in HDFS.
>>
>> I'm using the fs.listFiles API, then looping over the returned
>> RemoteIterator[LocatedFileStatus]; for each LocatedFileStatus I call the
>> getFileBlockLocations API to get all the block IDs of that file. This
>> takes a long time because I have millions of files in the HDFS directory.
>> I also tried to use Spark to parallelize the execution, but the HDFS API
>> classes are not serializable.
>>
>> Is there a better way? I know there is the "hdfs oiv" command, but I can't
>> directly access the NameNode directory. Also, the fsimage file could be
>> outdated, and I can't force safe mode in order to run the saveNamespace
>> command.
>>
>> I'm using Scala 2.11 with Hadoop 2.7.1 (HDP 2.6.3)
>>
>> Thank you
>>
>