Posted to user@accumulo.apache.org by "Lu.Qin" <lu...@gmail.com> on 2015/01/22 07:45:12 UTC

why an error about replication

Hi, I have an Accumulo cluster that has been running for 10 days, but it is showing me many errors now.


2015-01-22 13:04:21,161 [hdfs.DFSClient] WARN : Error while syncing
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/+9997/226dce4f-4e14-4704-b811-532afe0b0fb3 could only be replicated to 0 nodes instead
of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)


    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)


I used hadoop fs to put a file into Hadoop and it worked fine, and the file has 2 replicas. Why can Accumulo not work?


And I see there are so many files of only 0 B in /accumulo/wal/***/. Why?


Thanks.

Re: why an error about replication

Posted by John Vines <vi...@apache.org>.
Even if it's not interceptable, we should put in a ticket to hdfs to see
about improving either the message itself or the catchability of it.

On Thu, Jan 22, 2015 at 11:09 AM, Josh Elser <jo...@gmail.com> wrote:

> I know I've run into it before (hence why I brought it up) -- I also don't
> remember 100% if there's another "common" reason for having non-zero
> datanodes participating with none excluded.
>
> I'm also not sure how this manifests itself in code, but, assuming it's
> something identifiable, we could try to catch this case and give a better
> error message (and maybe clean up the loads of empty files we create while
> we spin trying to "make" the file).
>
> Mike Drob wrote:
>
>> Has this error come up before? Is there room for us to intercept that
>> stack trace and provide a "check that HDFS has space left" message? This
>> might be especially relevant after we;ve removed the hadoop info box on
>> the monitor.
>>
>> On Thu, Jan 22, 2015 at 8:30 AM, Josh Elser <josh.elser@gmail.com
>> <ma...@gmail.com>> wrote:
>>
>>     How much free space do you still have in HDFS? If hdfs doesn't have
>>     enough free space to make the file, I believe you'll see the car
>>     that you have outlined. The way we create the file will also end up
>>     requiring at least one GB with the default configuration.
>>
>>     Also make sure to take into account any reserved percent of hdfs
>>     when considering the hdfs usage.
>>
>>     On Jan 22, 2015 1:46 AM, "Lu.Qin" <luq.java@gmail.com
>>     <ma...@gmail.com>> wrote:
>>
>>
>>         Hi,I have a Accumulo clusters and it run 10 days ,but it show me
>>         many errors now.
>>
>>         2015-01-22 13:04:21,161 [hdfs.DFSClient] WARN : Error while
>> syncing
>>         org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
>>         /accumulo/wal/+9997/226dce4f-4e14-4704-b811-532afe0b0fb3 could
>>         only be replicated to 0 nodes instead
>>           of minReplication (=1).  There are 3 datanode(s) running and
>>         no node(s) are excluded in this operation.
>>                  at
>>         org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.
>> chooseTarget(BlockManager.java:1471)
>>                  at
>>         org.apache.hadoop.hdfs.server.namenode.FSNamesystem.
>> getAdditionalBlock(FSNamesystem.java:2791)
>>                  at
>>         org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.
>> addBlock(NameNodeRpcServer.java:606)
>>                  at
>>         org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSi
>> deTranslatorPB.addBlock(ClientNamenodeProtocolServerSi
>> deTranslatorPB.java:455)
>>                  at
>>         org.apache.hadoop.hdfs.protocol.proto.
>> ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(
>> ClientNamenodeProtocolProtos.java)
>>                  at
>>         org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>> ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>>                  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>                  at
>>         org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>>                  at
>>         org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>>                  at java.security.AccessController.doPrivileged(Native
>>         Method)
>>                  at javax.security.auth.Subject.doAs(Subject.java:415)
>>                  at
>>         org.apache.hadoop.security.UserGroupInformation.doAs(
>> UserGroupInformation.java:1614)
>>                  at
>>         org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>>
>>                  at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>>                  at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>>                  at
>>         org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.
>> invoke(ProtobufRpcEngine.java:206)
>>                  at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>>                  at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown
>>         Source)
>>                  at
>>         sun.reflect.DelegatingMethodAccessorImpl.invoke(
>> DelegatingMethodAccessorImpl.java:43)
>>                  at java.lang.reflect.Method.invoke(Method.java:606)
>>                  at
>>         org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(
>> RetryInvocationHandler.java:187)
>>                  at
>>         org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(
>> RetryInvocationHandler.java:102)
>>                  at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>>                  at
>>         org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslat
>> orPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
>>                  at
>>         org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
>> locateFollowingBlock(DFSOutputStream.java:1449)
>>                  at
>>         org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
>> nextBlockOutputStream(DFSOutputStream.java:1270)
>>                  at
>>         org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
>> run(DFSOutputStream.java:526)
>>
>>         I use hadoop fs to put a file into hadoop ,and it works good,and
>>         the file has 2 replicates.Why accumulo can not work ?
>>
>>         And I see there are so many file only 0B in
>> /accumulo/wal/***/,why?
>>
>>         Thanks.
>>
>>
>>

Re: why an error about replication

Posted by Josh Elser <jo...@gmail.com>.
I know I've run into it before (hence why I brought it up) -- I also 
don't remember 100% if there's another "common" reason for having 
non-zero datanodes participating with none excluded.

I'm also not sure how this manifests itself in code, but, assuming it's 
something identifiable, we could try to catch this case and give a 
better error message (and maybe clean up the loads of empty files we 
create while we spin trying to "make" the file).
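
For the sake of discussion, a rough sketch of what that could look like -- purely
hypothetical, not an existing Accumulo hook: catch the RemoteException, match on the
message text (HDFS only surfaces this condition as IOException text today), print a
friendlier hint, and sweep the zero-length files out of the WAL directory. The wrapper
name, the WAL path handling, and the string match are all illustrative.

    import java.io.IOException;

    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.ipc.RemoteException;

    public class WalCreateSketch {

      // Hypothetical wrapper around whatever actually creates the WAL file.
      static void createWal(FileSystem fs, Path wal) throws IOException {
        try {
          FSDataOutputStream out = fs.create(wal);
          out.close();
        } catch (RemoteException e) {
          // The namenode only reports "no block targets" as exception text,
          // so a string match is about the best a client can do today.
          String msg = e.getMessage();
          if (msg != null && msg.contains("could only be replicated to 0 nodes")) {
            System.err.println("Could not allocate a block for " + wal
                + "; check free HDFS space (and dfs.datanode.du.reserved)"
                + " before retrying.");
            removeEmptyWals(fs, wal.getParent());
          }
          throw e;
        }
      }

      // Sweep out the 0-byte files that failed create attempts leave behind.
      static void removeEmptyWals(FileSystem fs, Path walDir) throws IOException {
        for (FileStatus stat : fs.listStatus(walDir)) {
          if (stat.isFile() && stat.getLen() == 0) {
            fs.delete(stat.getPath(), false);
          }
        }
      }
    }

Deleting files out from under a live tserver is obviously not something to do for
real; the point is only that the condition is identifiable from the message text.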

Mike Drob wrote:
> Has this error come up before? Is there room for us to intercept that
> stack trace and provide a "check that HDFS has space left" message? This
> might be especially relevant after we;ve removed the hadoop info box on
> the monitor.
>
> On Thu, Jan 22, 2015 at 8:30 AM, Josh Elser <josh.elser@gmail.com
> <ma...@gmail.com>> wrote:
>
>     How much free space do you still have in HDFS? If hdfs doesn't have
>     enough free space to make the file, I believe you'll see the car
>     that you have outlined. The way we create the file will also end up
>     requiring at least one GB with the default configuration.
>
>     Also make sure to take into account any reserved percent of hdfs
>     when considering the hdfs usage.
>
>     On Jan 22, 2015 1:46 AM, "Lu.Qin" <luq.java@gmail.com
>     <ma...@gmail.com>> wrote:
>
>
>         Hi,I have a Accumulo clusters and it run 10 days ,but it show me
>         many errors now.
>
>         2015-01-22 13:04:21,161 [hdfs.DFSClient] WARN : Error while syncing
>         org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
>         /accumulo/wal/+9997/226dce4f-4e14-4704-b811-532afe0b0fb3 could
>         only be replicated to 0 nodes instead
>           of minReplication (=1).  There are 3 datanode(s) running and
>         no node(s) are excluded in this operation.
>                  at
>         org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
>                  at
>         org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
>                  at
>         org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
>                  at
>         org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
>                  at
>         org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>                  at
>         org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>                  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>                  at
>         org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>                  at
>         org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>                  at java.security.AccessController.doPrivileged(Native
>         Method)
>                  at javax.security.auth.Subject.doAs(Subject.java:415)
>                  at
>         org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>                  at
>         org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>                  at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>                  at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>                  at
>         org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>                  at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>                  at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown
>         Source)
>                  at
>         sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>                  at java.lang.reflect.Method.invoke(Method.java:606)
>                  at
>         org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>                  at
>         org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>                  at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>                  at
>         org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
>                  at
>         org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
>                  at
>         org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
>                  at
>         org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
>
>         I use hadoop fs to put a file into hadoop ,and it works good,and
>         the file has 2 replicates.Why accumulo can not work ?
>
>         And I see there are so many file only 0B in /accumulo/wal/***/,why?
>
>         Thanks.
>
>

Re: why an error about replication

Posted by Mike Drob <ma...@cloudera.com>.
Has this error come up before? Is there room for us to intercept that stack
trace and provide a "check that HDFS has space left" message? This might be
especially relevant after we've removed the hadoop info box on the monitor.
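
For what it's worth, such a hint could even be backed by real numbers. A minimal
sketch, assuming the caller is allowed to fetch the datanode report (it is the same
superuser-level data that `hdfs dfsadmin -report` prints):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

    public class DatanodeSpaceCheck {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        if (!(fs instanceof DistributedFileSystem)) {
          System.err.println("Default filesystem is not HDFS: " + fs.getUri());
          return;
        }
        // Same per-datanode numbers the namenode consults when it decides
        // it has zero viable targets for a new block.
        for (DatanodeInfo dn : ((DistributedFileSystem) fs).getDataNodeStats()) {
          System.out.printf("%s: %.1f GB remaining of %.1f GB%n",
              dn.getHostName(), dn.getRemaining() / 1e9, dn.getCapacity() / 1e9);
        }
      }
    }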

On Thu, Jan 22, 2015 at 8:30 AM, Josh Elser <jo...@gmail.com> wrote:

> How much free space do you still have in HDFS? If hdfs doesn't have enough
> free space to make the file, I believe you'll see the car that you have
> outlined. The way we create the file will also end up requiring at least
> one GB with the default configuration.
>
> Also make sure to take into account any reserved percent of hdfs when
> considering the hdfs usage.
> On Jan 22, 2015 1:46 AM, "Lu.Qin" <lu...@gmail.com> wrote:
>
>>
>> Hi,I have a Accumulo clusters and it run 10 days ,but it show me many
>> errors now.
>>
>> 2015-01-22 13:04:21,161 [hdfs.DFSClient] WARN : Error while syncing
>> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
>> /accumulo/wal/+9997/226dce4f-4e14-4704-b811-532afe0b0fb3 could only be
>> replicated to 0 nodes instead
>>  of minReplication (=1).  There are 3 datanode(s) running and no node(s)
>> are excluded in this operation.
>>         at
>> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
>>         at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
>>         at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>         at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>>
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>>         at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>         at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>>         at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>         at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>>         at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
>>         at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
>>         at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
>>         at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
>>
>> I use hadoop fs to put a file into hadoop ,and it works good,and the file
>> has 2 replicates.Why accumulo can not work ?
>>
>> And I see there are so many file only 0B in /accumulo/wal/***/,why?
>>
>> Thanks.
>>
>

Re: why an error about replication

Posted by Josh Elser <jo...@gmail.com>.
How much free space do you still have in HDFS? If hdfs doesn't have enough
free space to make the file, I believe you'll see the error that you have
outlined. The way we create the file will also end up requiring at least
one GB with the default configuration.

Also make sure to take into account any reserved percent of hdfs when
considering the hdfs usage.
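
Something like this is a quick way to check from the client side (just a sketch using
the standard FileSystem API; note the client config may not reflect the datanodes'
actual dfs.datanode.du.reserved setting):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsStatus;

    public class HdfsFreeSpace {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        FsStatus status = fs.getStatus();
        System.out.printf("capacity=%.1f GB  used=%.1f GB  remaining=%.1f GB%n",
            status.getCapacity() / 1e9,
            status.getUsed() / 1e9,
            status.getRemaining() / 1e9);
        // Per-datanode space reserved for non-HDFS use; worth checking
        // alongside the numbers above (value shown comes from the client config).
        System.out.println("dfs.datanode.du.reserved = "
            + conf.get("dfs.datanode.du.reserved", "0"));
      }
    }
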
On Jan 22, 2015 1:46 AM, "Lu.Qin" <lu...@gmail.com> wrote:

>
> Hi,I have a Accumulo clusters and it run 10 days ,but it show me many
> errors now.
>
> 2015-01-22 13:04:21,161 [hdfs.DFSClient] WARN : Error while syncing
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
> /accumulo/wal/+9997/226dce4f-4e14-4704-b811-532afe0b0fb3 could only be
> replicated to 0 nodes instead
>  of minReplication (=1).  There are 3 datanode(s) running and no node(s)
> are excluded in this operation.
>         at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
>
> I use hadoop fs to put a file into hadoop ,and it works good,and the file
> has 2 replicates.Why accumulo can not work ?
>
> And I see there are so many file only 0B in /accumulo/wal/***/,why?
>
> Thanks.
>