Posted to mapreduce-user@hadoop.apache.org by Adam Kawa <ka...@gmail.com> on 2014/07/09 16:16:33 UTC

Re: listing a 530k files directory

You can try snakebite: https://github.com/spotify/snakebite.

$ snakebite ls -R <path>

I just ran it to list 705K files and it went fine.
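
Snakebite also has a Python client API; its ls() returns a generator, so
entries are yielded as they come back from the NameNode rather than being
buffered in one big list. A minimal sketch (the NameNode host, port, and
path below are placeholders you'd replace with your own):

from snakebite.client import Client

# Connect to the active NameNode's RPC port (not the HTTP/web UI port).
client = Client('namenode.example.com', 8020)

# ls() takes a list of paths and yields one dict per entry,
# so you can stop after the first few without listing everything.
for i, entry in enumerate(client.ls(['/path/to/big/folder'])):
    print(entry['path'], entry['length'])
    if i >= 9:
        break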



2014-05-30 20:42 GMT+02:00 Harsh J <ha...@cloudera.com>:

> HADOOP_OPTS gets overridden by HADOOP_CLIENT_OPTS for FsShell
> utilities. The right way to extend the heap is to set HADOOP_CLIENT_OPTS
> instead, for FsShell and other client applications such as "hadoop
> fs"/"hdfs dfs"/"hadoop jar", etc.
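>
> For example (the 3g heap below is just an illustrative value; size it
> to the listing you need):
>
> export HADOOP_CLIENT_OPTS="-Xmx3g"
> hadoop fs -ls /path/to/folder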
>
> On Fri, May 30, 2014 at 6:13 PM, bharath vissapragada
> <bh...@gmail.com> wrote:
> > Hi Guido,
> >
> > You can set the client-side heap in the HADOOP_OPTS variable before
> > running the ls command.
> >
> > export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
> >
> > - Bharath
> >
> >
> > On Fri, May 30, 2014 at 5:22 PM, Guido Serra <ze...@fsfe.org> wrote:
> >>
> >> Hi,
> >> do you have an idea of how to look at the contents of a 530k-file HDFS
> >> folder?
> >> (yes, I know it is a bad idea to have such a setup, but that's the
> >> current status and I'd like to debug it)
> >> The only tool that doesn't go out of memory is "hdfs dfs -count
> >> folder/".
> >>
> >> -ls goes out of memory, and -count with folder/* goes out of memory as
> >> well. I'd like to see at least the first 10 file names and their sizes,
> >> and maybe open one.
> >>
> >> thanks,
> >> G.
> >
> >
>
>
>
> --
> Harsh J
>