Posted to notifications@accumulo.apache.org by "Eric Newton (JIRA)" <ji...@apache.org> on 2015/05/06 16:42:00 UTC

[jira] [Updated] (ACCUMULO-3775) Root tablet had 6,974 walogs

     [ https://issues.apache.org/jira/browse/ACCUMULO-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Newton updated ACCUMULO-3775:
----------------------------------
    Attachment: ACCUMULO_3775-01.patch

> Root tablet had 6,974 walogs
> ----------------------------
>
>                 Key: ACCUMULO-3775
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3775
>             Project: Accumulo
>          Issue Type: Bug
>         Environment: Same as ACCUMULO-3774
>            Reporter: Keith Turner
>            Priority: Blocker
>             Fix For: 1.7.0
>
>         Attachments: ACCUMULO_3775-01.patch
>
>
> Before the deadlock described in ACCUMULO-3774, the root tablet recovered 6,974 walogs. Almost all of these were empty. Before the tserver was killed, there were thousands of messages like the following (I think this was caused by datanode agitation).
> {noformat}
> 2015-05-05 18:02:43,236 [log.TabletServerLogger] INFO : Using next log hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,236 [log.TabletServerLogger] DEBUG: Creating next WAL
> 2015-05-05 18:02:43,236 [tserver.TabletServer] INFO : Writing log marker for level ROOT hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,236 [log.DfsLogger] DEBUG: Address is worker10:9997
> 2015-05-05 18:02:43,236 [log.DfsLogger] DEBUG: DfsLogger.open() begin
> 2015-05-05 18:02:43,236 [util.MetadataTableUtil] DEBUG: Adding log entry hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,237 [fs.VolumeManagerImpl] DEBUG: creating hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1 with CreateFlag set: [CREATE, SYNC_BLOCK]
> 2015-05-05 18:02:43,246 [tserver.TabletServer] INFO : Writing log marker for level NORMAL hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,247 [util.MetadataTableUtil] DEBUG: Adding log entry hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c
> 2015-05-05 18:02:43,247 [log.DfsLogger] DEBUG: No enciphering, using raw output stream
> 2015-05-05 18:02:43,247 [log.DfsLogger] DEBUG: Got new write-ahead log: worker10:9997/hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1
> 2015-05-05 18:02:43,250 [hdfs.DFSClient] WARN : DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/worker10+9997/a13aee79-c313-4298-b55a-8ec58ffb977c could only be replicated to 2 nodes instead of minReplication (=3).  There are 16 datanode(s) running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3067)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:722)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1430)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
> {noformat}
> {noformat}
> 2015-05-05 18:02:43,352 [log.TabletServerLogger] INFO : Using next log hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1
> 2015-05-05 18:02:43,352 [log.TabletServerLogger] DEBUG: Creating next WAL
> 2015-05-05 18:02:43,352 [tserver.TabletServer] INFO : Writing log marker for level ROOT hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1
> 2015-05-05 18:02:43,352 [log.DfsLogger] DEBUG: Address is worker10:9997
> 2015-05-05 18:02:43,352 [log.DfsLogger] DEBUG: DfsLogger.open() begin
> 2015-05-05 18:02:43,353 [util.MetadataTableUtil] DEBUG: Adding log entry hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1
> 2015-05-05 18:02:43,353 [fs.VolumeManagerImpl] DEBUG: creating hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/1810b018-26e3-4728-bbab-e3d901e3edd3 with CreateFlag set: [CREATE, SYNC_BLOCK]
> 2015-05-05 18:02:43,362 [log.DfsLogger] DEBUG: No enciphering, using raw output stream
> 2015-05-05 18:02:43,362 [log.DfsLogger] DEBUG: Got new write-ahead log: worker10:9997/hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/1810b018-26e3-4728-bbab-e3d901e3edd3
> 2015-05-05 18:02:43,366 [log.TabletServerLogger] DEBUG: Created next WAL hdfs://10.1.5.21:10000/accumulo/wal/worker10+9997/1810b018-26e3-4728-bbab-e3d901e3edd3
> 2015-05-05 18:02:43,366 [hdfs.DFSClient] WARN : DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /accumulo/wal/worker10+9997/295244ee-c9e3-404f-a3d8-9569e41ba8e1 could only be replicated to 2 nodes instead of minReplication (=3).  There are 16 datanode(s) running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3067)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:722)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1430)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
> {noformat}
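
The rollover pattern in the excerpts above (a "Using next log" line followed shortly by a minReplication failure) can be tallied from a tserver log with a short script. This is a minimal sketch, not part of the attached patch: the function name summarize_wal_churn and the file name tserver.log are hypothetical, and the regular expressions assume the exact message wording shown in the excerpts.

```python
import re

# Assumption: matches the "Using next log <uri>" INFO lines as they appear
# in the excerpts above; other Accumulo versions may word this differently.
NEXT_LOG = re.compile(
    r"\[log\.TabletServerLogger\] INFO : Using next log (\S+)"
)
# Assumption: matches the HDFS client message seen with each failed block
# allocation in the excerpts above.
REPL_FAIL = re.compile(
    r"could only be replicated to \d+ nodes instead of minReplication"
)


def summarize_wal_churn(lines):
    """Count WAL switches and replication failures in tserver log lines."""
    wals = []
    failures = 0
    for line in lines:
        match = NEXT_LOG.search(line)
        if match:
            wals.append(match.group(1))
        elif REPL_FAIL.search(line):
            failures += 1
    return {
        "switches": len(wals),
        "distinct_wals": len(set(wals)),
        "replication_failures": failures,
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python wal_churn.py tserver.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        print(summarize_wal_churn(handle))
```

A switch count far above what the write volume would explain, together with a similar number of replication failures, would be consistent with the behavior described in this issue.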


