Posted to hdfs-user@hadoop.apache.org by Marko Dinic <ma...@nissatech.com> on 2015/05/27 15:02:54 UTC
Files in distributed cache
Hello,
I'm new to Hadoop and a bit confused by one thing about the distributed
cache - when do files added to the distributed cache get deleted?
I'm specifically interested in Hadoop 0.20.2.
I read the following in "Hadoop: The Definitive Guide": "Files are deleted to
make room for a new file when the cache exceeds a certain size---10 GB
by default." But I'm still confused - are files in the distributed cache
deleted only once the 10 GB threshold is exceeded, or are they also
deleted when the job terminates?
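(For context, my understanding - and this is an assumption on my part, so please correct me - is that this limit applies to the TaskTracker's *local* cache directories, not to HDFS, and that in 0.20.x it is controlled by the local.cache.size property. A sketch of what I believe the stock default looks like:

<property>
  <!-- maximum size of the TaskTracker's local distributed cache,
       in bytes; 10737418240 = 10 GB, the assumed 0.20.x default -->
  <name>local.cache.size</name>
  <value>10737418240</value>
</property>

If that is right, the local cache should be trimmed by the TaskTracker itself and should not count against my HDFS quota - which only adds to my confusion below.)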
The thing is, I'm working on a multi-tenant cluster, and only 5 GB of
HDFS space has been allotted to me.
I'm getting the following exception:
15/05/27 11:16:57 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/fitman.whirlpool is exceeded: quota=5368709120 diskspace consumed=5.2g
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3778)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3640)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2400(DFSClient.java:2846)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3041)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/fitman.whirlpool is exceeded: quota=5368709120 diskspace consumed=5.2g
    at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:149)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1085)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:903)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:288)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1752)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1597)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:771)
    at sun.reflect.GeneratedMethodAccessor2199.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1439)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1435)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1433)
    at org.apache.hadoop.ipc.Client.call(Client.java:1150)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
    at $Proxy0.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy0.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3773)
    ... 3 more
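As a sanity check on the numbers in the exception (a quick sketch, nothing Hadoop-specific): the reported quota of 5368709120 bytes is exactly 5 GiB, so a reported usage of 5.2g is indeed over the limit:

```python
# Sanity-check the figures from the DSQuotaExceededException above.
quota_bytes = 5368709120          # "quota=5368709120" from the exception
gib = 1024 ** 3                   # bytes per GiB

quota_gib = quota_bytes / gib
print(quota_gib)                  # 5.0 -> the quota is exactly 5 GiB

consumed_gib = 5.2                # "diskspace consumed=5.2g" from the log
print(consumed_gib > quota_gib)   # True -> usage exceeds the quota
```

So the quota itself is being enforced correctly; the question is where the extra data is coming from.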
Since there isn't much data on HDFS (as far as I can tell from HDFS and
the client that was assigned to me), I'm guessing that files in the
distributed cache accumulate until they overflow the 5 GB, and that this
is why I'm getting the exception.
Can someone please explain what happens to these files? And if that is
not the problem, does anyone have an idea why I'm getting this exception?
Best regards,
Marko