Posted to mapreduce-user@hadoop.apache.org by Guido Serra <ze...@fsfe.org> on 2014/05/30 13:52:01 UTC

listing a 530k files directory

Hi,
do you have an idea of how to look at the content of a 530k-file HDFS folder?
(yes, I know it is a bad idea to have such a setup, but that’s the status and I’d like to debug it)
The only tool that doesn’t go out of memory is "hdfs dfs -count folder/".

-ls goes out of memory, and -count with folder/* goes out of memory too …
I’d like to at least see the first 10 file names and their sizes, and maybe open one.

thanks,
G.

Re: listing a 530k files directory

Posted by Adam Kawa <ka...@gmail.com>.
You can try snakebite https://github.com/spotify/snakebite.

$ snakebite ls -R <path>

I just ran it to list 705K files and it went fine.
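
Since snakebite is a pure-Python client that talks to the NameNode directly and prints entries as it receives them, you can also pipe it through head to grab just the first few names and sizes. A minimal sketch, assuming the path from the original post:

$ snakebite ls /logs/2014-05-28/ | head -n 10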



2014-05-30 20:42 GMT+02:00 Harsh J <ha...@cloudera.com>:

> HADOOP_OPTS gets overridden by HADOOP_CLIENT_OPTS for FsShell
> utilities. The right way to extend the client heap is to set
> HADOOP_CLIENT_OPTS instead, for FsShell and other client applications
> such as "hadoop fs"/"hdfs dfs"/"hadoop jar", etc.
>
> On Fri, May 30, 2014 at 6:13 PM, bharath vissapragada
> <bh...@gmail.com> wrote:
> > Hi Guido,
> >
> > You can set the client-side heap in the HADOOP_OPTS variable before
> > running the ls command.
> >
> > export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
> >
> > - Bharath
> >
> >
> > On Fri, May 30, 2014 at 5:22 PM, Guido Serra <ze...@fsfe.org> wrote:
> >>
> >> Hi,
> >> do you have an idea of how to look at the content of a 530k-file HDFS
> >> folder?
> >> (yes, I know it is a bad idea to have such a setup, but that’s the status
> >> and I’d like to debug it)
> >> The only tool that doesn’t go out of memory is "hdfs dfs -count
> >> folder/".
> >>
> >> -ls goes out of memory, and -count with folder/* goes out of memory too …
> >> I’d like to at least see the first 10 file names and their sizes, and
> >> maybe open one.
> >>
> >> thanks,
> >> G.
> >
> >
>
>
>
> --
> Harsh J
>

Re: listing a 530k files directory

Posted by Harsh J <ha...@cloudera.com>.
HADOOP_OPTS gets overridden by HADOOP_CLIENT_OPTS for FsShell
utilities. The right way to extend the client heap is to set
HADOOP_CLIENT_OPTS instead, for FsShell and other client applications
such as "hadoop fs"/"hdfs dfs"/"hadoop jar", etc.

On Fri, May 30, 2014 at 6:13 PM, bharath vissapragada
<bh...@gmail.com> wrote:
> Hi Guido,
>
> You can set the client-side heap in the HADOOP_OPTS variable before running
> the ls command.
>
> export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
>
> - Bharath
>
>
> On Fri, May 30, 2014 at 5:22 PM, Guido Serra <ze...@fsfe.org> wrote:
>>
>> Hi,
>> do you have an idea of how to look at the content of a 530k-file HDFS
>> folder?
>> (yes, I know it is a bad idea to have such a setup, but that’s the status
>> and I’d like to debug it)
>> The only tool that doesn’t go out of memory is "hdfs dfs -count
>> folder/".
>>
>> -ls goes out of memory, and -count with folder/* goes out of memory too …
>> I’d like to at least see the first 10 file names and their sizes, and maybe open one.
>>
>> thanks,
>> G.
>
>



-- 
Harsh J

Re: listing a 530k files directory

Posted by Guido Serra <ze...@fsfe.org>.
forgot to mention… it is CDH 4.6.0

On 30 May 2014, at 15:08, Guido Serra <ze...@fsfe.org> wrote:

> guido@hd11 ~ $ export HADOOP_OPTS=-Xmx3g;hdfs dfs -ls /logs/2014-05-28/                                                                                        
> 14/05/30 13:05:44 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB. Trying to fail over immediately.
> 14/05/30 13:05:45 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 1 fail over attempts. Trying to fail over after sleeping for 935ms.
> 14/05/30 13:05:48 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 2 fail over attempts. Trying to fail over immediately.
> 14/05/30 13:05:48 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 3 fail over attempts. Trying to fail over after sleeping for 5408ms.
> 14/05/30 13:05:55 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 4 fail over attempts. Trying to fail over immediately.
> 14/05/30 13:05:55 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 5 fail over attempts. Trying to fail over after sleeping for 14316ms.
> 14/05/30 13:06:12 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 6 fail over attempts. Trying to fail over immediately.
> 14/05/30 13:06:12 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 7 fail over attempts. Trying to fail over after sleeping for 8216ms.
> 14/05/30 13:06:22 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 8 fail over attempts. Trying to fail over immediately.
> 14/05/30 13:06:23 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 9 fail over attempts. Trying to fail over after sleeping for 18917ms.
> 14/05/30 13:06:44 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 10 fail over attempts. Trying to fail over immediately.
> 14/05/30 13:06:44 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 11 fail over attempts. Trying to fail over after sleeping for 16386ms.
> 14/05/30 13:07:03 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 12 fail over attempts. Trying to fail over immediately.
> 14/05/30 13:07:03 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 13 fail over attempts. Trying to fail over after sleeping for 20387ms.
> 14/05/30 13:07:26 WARN retry.RetryInvocationHandler: Exception while invoking getListing of class ClientNamenodeProtocolTranslatorPB after 14 fail over attempts. Trying to fail over immediately.
> 14/05/30 13:07:26 WARN retry.RetryInvocationHandler: Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing. Not retrying because failovers (15) exceeded maximum allowed (15)
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
>         at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1416)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:969)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:3542)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:3530)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:682)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:433)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44972)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
> 
>         at org.apache.hadoop.ipc.Client.call(Client.java:1238)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>         at com.sun.proxy.$Proxy9.getListing(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:441)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>         at com.sun.proxy.$Proxy10.getListing(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1526)
>         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1509)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:437)
>         at org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:213)
>         at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:337)
>         at org.apache.hadoop.fs.shell.Ls.processPathArgument(Ls.java:89)
>         at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
>         at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
>         at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
>         at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
>         at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>         at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
> ls: Operation category READ is not supported in state standby
> 
> On 05/30/2014 03:03 PM, Suresh Srinivas wrote:
>> Listing such a directory should not be a big problem. Can you cut and paste the command output?
>> 
>> Which release are you using?
>> 
>> Sent from phone
>> 
>> On May 30, 2014, at 5:49 AM, Guido Serra <ze...@fsfe.org> wrote:
>> 
>>> already tried, it didn't work (24 cores at 100% and a lot of memory, still ... "GC overhead limit exceeded")
>>> 
>>> thanks anyhow
>>> 
>>> On 05/30/2014 02:43 PM, bharath vissapragada wrote:
>>>> Hi Guido,
>>>> 
>>>> You can set the client-side heap in the HADOOP_OPTS variable before running the ls command.
>>>> 
>>>> export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
>>>> 
>>>> - Bharath
>>>> 
>>>> 
>>>> On Fri, May 30, 2014 at 5:22 PM, Guido Serra <ze...@fsfe.org> wrote:
>>>> Hi,
>>>> do you have an idea of how to look at the content of a 530k-file HDFS folder?
>>>> (yes, I know it is a bad idea to have such a setup, but that’s the status and I’d like to debug it)
>>>> The only tool that doesn’t go out of memory is "hdfs dfs -count folder/".
>>>> 
>>>> -ls goes out of memory, and -count with folder/* goes out of memory too …
>>>> I’d like to at least see the first 10 file names and their sizes, and maybe open one.
>>>> 
>>>> thanks,
>>>> G.
>>>> 
>>> 
>> 
> 


Re: listing a 530k files directory

Posted by Guido Serra <ze...@fsfe.org>.
guido@hd11 ~ $ export HADOOP_OPTS=-Xmx3g;hdfs dfs -ls /logs/2014-05-28/
14/05/30 13:05:44 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB. Trying 
to fail over immediately.
14/05/30 13:05:45 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 1 
fail over attempts. Trying to fail over after sleeping for 935ms.
14/05/30 13:05:48 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 2 
fail over attempts. Trying to fail over immediately.
14/05/30 13:05:48 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 3 
fail over attempts. Trying to fail over after sleeping for 5408ms.
14/05/30 13:05:55 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 4 
fail over attempts. Trying to fail over immediately.
14/05/30 13:05:55 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 5 
fail over attempts. Trying to fail over after sleeping for 14316ms.
14/05/30 13:06:12 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 6 
fail over attempts. Trying to fail over immediately.
14/05/30 13:06:12 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 7 
fail over attempts. Trying to fail over after sleeping for 8216ms.
14/05/30 13:06:22 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 8 
fail over attempts. Trying to fail over immediately.
14/05/30 13:06:23 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 9 
fail over attempts. Trying to fail over after sleeping for 18917ms.
14/05/30 13:06:44 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 10 
fail over attempts. Trying to fail over immediately.
14/05/30 13:06:44 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 11 
fail over attempts. Trying to fail over after sleeping for 16386ms.
14/05/30 13:07:03 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 12 
fail over attempts. Trying to fail over immediately.
14/05/30 13:07:03 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 13 
fail over attempts. Trying to fail over after sleeping for 20387ms.
14/05/30 13:07:26 WARN retry.RetryInvocationHandler: 
Exception while invoking getListing of class 
ClientNamenodeProtocolTranslatorPB after 14 fail over attempts. Trying 
to fail over immediately.
14/05/30 13:07:26 WARN retry.RetryInvocationHandler: Exception while 
invoking class 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing. 
Not retrying because failovers (15) exceeded maximum allowed (15)
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby
         at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
         at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1416)
         at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:969)
         at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:3542)
         at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:3530)
         at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:682)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:433)
         at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44972)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:415)
         at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)

         at org.apache.hadoop.ipc.Client.call(Client.java:1238)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
         at com.sun.proxy.$Proxy9.getListing(Unknown Source)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:441)
         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
         at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:606)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
         at com.sun.proxy.$Proxy10.getListing(Unknown Source)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1526)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1509)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:437)
         at 
org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:213)
         at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:337)
         at org.apache.hadoop.fs.shell.Ls.processPathArgument(Ls.java:89)
         at 
org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
         at 
org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
         at 
org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
         at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
         at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
         at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
ls: Operation category READ is not supported in state standby
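
The trace above means every retry landed on a NameNode in standby state, so this particular run failed on HA failover rather than on client heap. A quick way to check which NameNode is currently active, assuming HA NameNode IDs such as nn1 and nn2 (the real IDs come from dfs.ha.namenodes.* in hdfs-site.xml):

$ hdfs haadmin -getServiceState nn1
$ hdfs haadmin -getServiceState nn2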

On 05/30/2014 03:03 PM, Suresh Srinivas wrote:
> Listing such a directory should not be a big problem. Can you cut and 
> paste the command output?
>
> Which release are you using?
>
> Sent from phone
>
> On May 30, 2014, at 5:49 AM, Guido Serra <zeph@fsfe.org 
> <ma...@fsfe.org>> wrote:
>
>> already tried, it didn't work (24 cores at 100% and a lot of memory, still 
>> ... "GC overhead limit exceeded")
>>
>> thanks anyhow
>>
>> On 05/30/2014 02:43 PM, bharath vissapragada wrote:
>>> Hi Guido,
>>>
>>> You can set the client-side heap in the HADOOP_OPTS variable before running 
>>> the ls command.
>>>
>>> export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
>>>
>>> - Bharath
>>>
>>>
>>> On Fri, May 30, 2014 at 5:22 PM, Guido Serra <zeph@fsfe.org 
>>> <ma...@fsfe.org>> wrote:
>>>
>>>     Hi,
>>>     do you have an idea of how to look at the content of a
>>>     530k-file HDFS folder?
>>>     (yes, I know it is a bad idea to have such a setup, but that’s the
>>>     status and I’d like to debug it)
>>>     The only tool that doesn’t go out of memory is "hdfs dfs
>>>     -count folder/".
>>>
>>>     -ls goes out of memory, and -count with folder/* goes out of
>>>     memory too …
>>>     I’d like to at least see the first 10 file names and their sizes,
>>>     and maybe open one.
>>>
>>>     thanks,
>>>     G.
>>>
>>>
>>
>


Re: listing a 530k files directory

Posted by Guido Serra <ze...@fsfe.org>.
guido@hd11 ~ $ export HADOOP_OPTS=-Xmx3g;hdfs dfs -ls /logs/2014-05-28/
14/05/30 13:05:44 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB. Trying 
to fail over immediately.
14/05/30 13:05:45 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 1 
fail over attempts. Trying to fail over after sleeping for 935ms.
14/05/30 13:05:48 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 2 
fail over attempts. Trying to fail over immediately.
14/05/30 13:05:48 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 3 
fail over attempts. Trying to fail over after sleeping for 5408ms.
14/05/30 13:05:55 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 4 
fail over attempts. Trying to fail over immediately.
14/05/30 13:05:55 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 5 
fail over attempts. Trying to fail over after sleeping for 14316ms.
14/05/30 13:06:12 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 6 
fail over attempts. Trying to fail over immediately.
14/05/30 13:06:12 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 7 
fail over attempts. Trying to fail over after sleeping for 8216ms.
14/05/30 13:06:22 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 8 
fail over attempts. Trying to fail over immediately.
14/05/30 13:06:23 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 9 
fail over attempts. Trying to fail over after sleeping for 18917ms.
14/05/30 13:06:44 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 10 
fail over attempts. Trying to fail over immediately.
14/05/30 13:06:44 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 11 
fail over attempts. Trying to fail over after sleeping for 16386ms.
14/05/30 13:07:03 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 12 
fail over attempts. Trying to fail over immediately.
14/05/30 13:07:03 WARN retry.RetryInvocationHandler: Exception while 
invoking getListing of class ClientNamenodeProtocolTranslatorPB after 13 
fail over attempts. Trying to fail over after sleeping for 20387ms.
^[[B^[[B^[[B^[[B14/05/30 13:07:26 WARN retry.RetryInvocationHandler: 
Exception while invoking getListing of class 
ClientNamenodeProtocolTranslatorPB after 14 fail over attempts. Trying 
to fail over immediately.
14/05/30 13:07:26 WARN retry.RetryInvocationHandler: Exception while 
invoking class 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing. 
Not retrying because failovers (15) exceeded maximum allowed (15)
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby
         at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
         at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1416)
         at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:969)
         at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:3542)
         at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:3530)
         at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:682)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:433)
         at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44972)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:415)
         at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)

         at org.apache.hadoop.ipc.Client.call(Client.java:1238)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
         at com.sun.proxy.$Proxy9.getListing(Unknown Source)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:441)
         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
         at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:606)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
         at com.sun.proxy.$Proxy10.getListing(Unknown Source)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1526)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1509)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:437)
         at 
org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:213)
         at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:337)
         at org.apache.hadoop.fs.shell.Ls.processPathArgument(Ls.java:89)
         at 
org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
         at 
org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
         at 
org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
         at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
         at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
         at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
ls: Operation category READ is not supported in state standby
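
Every retry above ended on a NameNode reporting standby state, so the client never reached an active NameNode; the directory size is not what failed here. As a rough check of the HA state from the shell (the nameservice and NameNode IDs below are placeholders, not values taken from this cluster), something like this can be used:

# list the configured nameservice and its NameNode IDs
hdfs getconf -confKey dfs.nameservices
hdfs getconf -confKey dfs.ha.namenodes.<nameservice>
# ask each NameNode for its current HA state (expect one "active")
hdfs haadmin -getServiceState <namenode-id>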

On 05/30/2014 03:03 PM, Suresh Srinivas wrote:
> Listing such a directory should not be a big problem. Can you cut and 
> paste the command output.
>
> Which release are you using?
>
> Sent from phone
>
> On May 30, 2014, at 5:49 AM, Guido Serra <zeph@fsfe.org 
> <ma...@fsfe.org>> wrote:
>
>> already tried, didn't work (24 cores at 100% and a lot of memory, still 
>> ... "GC overhead limit exceeded")
>>
>> thanks anyhow
>>
>> On 05/30/2014 02:43 PM, bharath vissapragada wrote:
>>> Hi Guido,
>>>
>>> You can set client side heap in HADOOP_OPTS variable before running 
>>> the ls command.
>>>
>>> export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
>>>
>>> - Bharath
>>>
>>>
>>> On Fri, May 30, 2014 at 5:22 PM, Guido Serra <zeph@fsfe.org 
>>> <ma...@fsfe.org>> wrote:
>>>
>>>     Hi,
>>>     do you have an idea on how to look at the content of a
>>>     530k-files HDFS folder?
>>>     (yes, I know it is a bad idea to have such setup, but that’s the
>>>     status and I’d like to debug it)
>>>     and the only tool that doesn’t go out of memory is "hdfs dfs
>>>     -count folder/“
>>>
>>>     -ls goes out of memory, -count with the folder/* goes out of
>>>     memory …
>>>     I’d like at least at the first 10 file names, see the size,
>>>     maybe open one
>>>
>>>     thanks,
>>>     G.
>>>
>>>
>>
>


Re: listing a 530k files directory

Posted by Suresh Srinivas <su...@hortonworks.com>.
Listing such a directory should not be a big problem. Can you cut and paste the command output. 

Which release are you using?
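
For reference, the running client's release string can be printed with the standard version command:

hadoop version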

Sent from phone

> On May 30, 2014, at 5:49 AM, Guido Serra <ze...@fsfe.org> wrote:
> 
> already tried, didn't work (24 cores at 100% and a lot of memory, still ... "GC overhead limit exceeded")
> 
> thanks anyhow
> 
>> On 05/30/2014 02:43 PM, bharath vissapragada wrote:
>> Hi Guido,
>> 
>> You can set client side heap in HADOOP_OPTS variable before running the ls command.
>> 
>> export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
>> 
>> - Bharath
>> 
>> 
>>> On Fri, May 30, 2014 at 5:22 PM, Guido Serra <ze...@fsfe.org> wrote:
>>> Hi,
>>> do you have an idea on how to look at the content of a 530k-files HDFS folder?
>>> (yes, I know it is a bad idea to have such setup, but that’s the status and I’d like to debug it)
>>> and the only tool that doesn’t go out of memory is "hdfs dfs -count folder/“
>>> 
>>> -ls goes out of memory, -count with the folder/* goes out of memory …
>>> I’d like at least at the first 10 file names, see the size, maybe open one
>>> 
>>> thanks,
>>> G.
>> 
> 


Re: listing a 530k files directory

Posted by Guido Serra <ze...@fsfe.org>.
already tried, didn't work (24 cores at 100% and a lot of memory, still ... 
"GC overhead limit exceeded")

thanks anyhow
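
As a possible workaround when even a 3g client heap hits the GC overhead limit: the listing can be fetched over WebHDFS with a plain HTTP client and truncated, which at least keeps the work out of the FsShell JVM. This is only a sketch and assumes WebHDFS is enabled and that <active-namenode> is the HTTP address of the currently active NameNode on the default 2.x port; the listing still comes back as one large JSON document, so the NameNode does the same amount of work either way:

# print only the first few kilobytes of the JSON listing
curl -s "http://<active-namenode>:50070/webhdfs/v1/logs/2014-05-28/?op=LISTSTATUS" | head -c 4096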

On 05/30/2014 02:43 PM, bharath vissapragada wrote:
> Hi Guido,
>
> You can set client side heap in HADOOP_OPTS variable before running 
> the ls command.
>
> export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
>
> - Bharath
>
>
> On Fri, May 30, 2014 at 5:22 PM, Guido Serra <zeph@fsfe.org 
> <ma...@fsfe.org>> wrote:
>
>     Hi,
>     do you have an idea on how to look at the content of a 530k-files
>     HDFS folder?
>     (yes, I know it is a bad idea to have such setup, but that’s the
>     status and I’d like to debug it)
>     and the only tool that doesn’t go out of memory is "hdfs dfs
>     -count folder/“
>
>     -ls goes out of memory, -count with the folder/* goes out of memory …
>     I’d like at least at the first 10 file names, see the size, maybe
>     open one
>
>     thanks,
>     G.
>
>


Re: listing a 530k files directory

Posted by Harsh J <ha...@cloudera.com>.
HADOOP_OPTS gets overridden by HADOOP_CLIENT_OPTS for FsShell
utilities. The right way to extend the client heap is to set
HADOOP_CLIENT_OPTS instead, for FsShell and other client applications
such as "hadoop fs"/"hdfs dfs"/"hadoop jar", etc.

On Fri, May 30, 2014 at 6:13 PM, bharath vissapragada
<bh...@gmail.com> wrote:
> Hi Guido,
>
> You can set client side heap in HADOOP_OPTS variable before running the ls
> command.
>
> export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
>
> - Bharath
>
>
> On Fri, May 30, 2014 at 5:22 PM, Guido Serra <ze...@fsfe.org> wrote:
>>
>> Hi,
>> do you have an idea on how to look at the content of a 530k-files HDFS
>> folder?
>> (yes, I know it is a bad idea to have such setup, but that’s the status
>> and I’d like to debug it)
>> and the only tool that doesn’t go out of memory is "hdfs dfs -count
>> folder/“
>>
>> -ls goes out of memory, -count with the folder/* goes out of memory …
>> I’d like at least at the first 10 file names, see the size, maybe open one
>>
>> thanks,
>> G.
>
>



-- 
Harsh J


Re: listing a 530k files directory

Posted by bharath vissapragada <bh...@gmail.com>.
Hi Guido,

You can set client side heap in HADOOP_OPTS variable before running the ls
command.

export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /

- Bharath


On Fri, May 30, 2014 at 5:22 PM, Guido Serra <ze...@fsfe.org> wrote:

> Hi,
> do you have an idea on how to look at the content of a 530k-files HDFS
> folder?
> (yes, I know it is a bad idea to have such setup, but that’s the status
> and I’d like to debug it)
> and the only tool that doesn’t go out of memory is "hdfs dfs -count
> folder/“
>
> -ls goes out of memory, -count with the folder/* goes out of memory …
> I’d like at least at the first 10 file names, see the size, maybe open one
>
> thanks,
> G.
