Posted to issues@ozone.apache.org by "Sammi Chen (Jira)" <ji...@apache.org> on 2020/06/28 12:59:00 UTC

[jira] [Updated] (HDDS-3889) Replication container failed for chunks dir not found

     [ https://issues.apache.org/jira/browse/HDDS-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammi Chen updated HDDS-3889:
-----------------------------
    Description: 
Container 90 does not have a chunks directory. All it has is deleted keys in its metadata DB.

DB dump:
#BYTESUSED : ÿÿÿÿÿep
#PENDINGDELETEBLOCKCOUNT : 
#delTX# : 
#deleted#103509887816499255 : 103509887816499255
#deleted#103509908566048958 : 103509908566048958
#deleted#103509932455690539 : 103509932455690539
#deleted#103509966401831405 : 103509966401831405


LOG:
2020-06-23 21:51:02,165 [grpc-default-executor-968] INFO org.apache.hadoop.ozone.container.replication.GrpcOutputStream: Sent 1568934 bytes for container 90
2020-06-23 21:51:02,165 [grpc-default-executor-968] ERROR org.apache.hadoop.ozone.container.replication.GrpcReplicationService: Error streaming container 90
java.nio.file.NoSuchFileException: /data7/hdds/hdds/326a5fe1-e63c-44b6-a57e-2f858fe4eaa7/current/containerDir0/90/chunks
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:427)
        at java.nio.file.Files.newDirectoryStream(Files.java:457)
        at java.nio.file.Files.list(Files.java:3451)
        at org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.includePath(TarContainerPacker.java:212)
        at org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.pack(TarContainerPacker.java:158)
        at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.exportContainerData(KeyValueContainer.java:535)
        at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.exportContainer(KeyValueHandler.java:929)
        at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.exportContainer(ContainerController.java:145)
        at org.apache.hadoop.ozone.container.replication.OnDemandContainerReplicationSource.copyData(OnDemandContainerReplicationSource.java:59)
        at org.apache.hadoop.ozone.container.replication.GrpcReplicationService.download(GrpcReplicationService.java:56)
        at org.apache.hadoop.hdds.protocol.datanode.proto.IntraDatanodeProtocolServiceGrpc$MethodHandlers.invoke(IntraDatanodeProtocolServiceGrpc.java:219)
        at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:172)
        at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:331)
        at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:817)
        at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
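
The exception itself is easy to reproduce outside Ozone: java.nio.file.Files.list throws NoSuchFileException when the directory it is asked to list does not exist, which is what TarContainerPacker.includePath hits above. A minimal standalone sketch (the path is illustrative only):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class MissingChunksDirRepro {
  public static void main(String[] args) throws IOException {
    // Illustrative stand-in for <containerDir>/<containerID>/chunks, which
    // was removed after the last block in the container was deleted.
    Path chunksDir = Paths.get("/tmp/container-90/chunks");
    try (Stream<Path> entries = Files.list(chunksDir)) {
      entries.forEach(System.out::println);
    } catch (NoSuchFileException e) {
      // Same exception type as in the replication log above.
      System.err.println("chunks directory is missing: " + e.getFile());
    }
  }
}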


Quick solution: keep the chunks directory even after all block and chunk files have been deleted.
Long term solution: garbage collect containers which no longer have user data.
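
Whichever fix is chosen, the export path could also tolerate a missing chunks directory instead of failing the whole container copy. A minimal sketch of such a defensive listing helper, treating a missing directory as empty (class and method names below are illustrative, not existing Ozone APIs):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class DefensiveDirListing {

  // Returns the directory entries, or an empty list if the directory was
  // removed (e.g. after every block in the container has been deleted).
  static List<Path> listIfPresent(Path dir) throws IOException {
    if (!Files.isDirectory(dir)) {
      return Collections.emptyList();
    }
    try (Stream<Path> entries = Files.list(dir)) {
      return entries.collect(Collectors.toList());
    }
  }
}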


  was:
Container 90 does not have a chunks directory. All it has is deleted keys in its metadata DB.

DB dump:
#BYTESUSED : ÿÿÿÿÿep
#PENDINGDELETEBLOCKCOUNT : 
#delTX# : 
#deleted#103509887816499255 : 103509887816499255
#deleted#103509908566048958 : 103509908566048958
#deleted#103509932455690539 : 103509932455690539
#deleted#103509966401831405 : 103509966401831405


LOG:
2020-06-23 21:51:02,165 [grpc-default-executor-968] INFO org.apache.hadoop.ozone.container.replication.GrpcOutputStream: Sent 1568934 bytes for container 90
2020-06-23 21:51:02,165 [grpc-default-executor-968] ERROR org.apache.hadoop.ozone.container.replication.GrpcReplicationService: Error streaming container 90
java.nio.file.NoSuchFileException: /data7/hdds/hdds/326a5fe1-e63c-44b6-a57e-2f858fe4eaa7/current/containerDir0/90/chunks
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:427)
        at java.nio.file.Files.newDirectoryStream(Files.java:457)
        at java.nio.file.Files.list(Files.java:3451)
        at org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.includePath(TarContainerPacker.java:212)
        at org.apache.hadoop.ozone.container.keyvalue.TarContainerPacker.pack(TarContainerPacker.java:158)
        at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.exportContainerData(KeyValueContainer.java:535)
        at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.exportContainer(KeyValueHandler.java:929)
        at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.exportContainer(ContainerController.java:145)
        at org.apache.hadoop.ozone.container.replication.OnDemandContainerReplicationSource.copyData(OnDemandContainerReplicationSource.java:59)
        at org.apache.hadoop.ozone.container.replication.GrpcReplicationService.download(GrpcReplicationService.java:56)
        at org.apache.hadoop.hdds.protocol.datanode.proto.IntraDatanodeProtocolServiceGrpc$MethodHandlers.invoke(IntraDatanodeProtocolServiceGrpc.java:219)
        at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:172)
        at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:331)
        at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:817)
        at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)



> Replication container failed for chunks dir not found 
> ------------------------------------------------------
>
>                 Key: HDDS-3889
>                 URL: https://issues.apache.org/jira/browse/HDDS-3889
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org