Posted to issues@ozone.apache.org by "Uma Maheswara Rao G (Jira)" <ji...@apache.org> on 2023/02/15 18:45:00 UTC

[jira] [Commented] (HDDS-7948) ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask: Failed ECReconstructionCommand due to StorageContainerException: ContainerID # does not exist

    [ https://issues.apache.org/jira/browse/HDDS-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689304#comment-17689304 ] 

Uma Maheswara Rao G commented on HDDS-7948:
-------------------------------------------

[~ghuangups] A container can be deleted while reconstruction is in progress, because both are asynchronous operations. Eventually, things should return to normal. Have you checked the cluster for anything abnormal after this?

cc: [~sodonnell] 
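
To illustrate the race described above, here is a minimal sketch in which a delete and a reconstruction run as independent async tasks, and the reconstruction treats a missing container as a benign outcome rather than a hard failure. Every name in it (ReconstructionRaceSketch, ContainerMissingException, Outcome, CONTAINERS, deleteAsync, reconstructAsync) is hypothetical and chosen for illustration only. It is not Ozone's API and does not reflect how ECReconstructionCoordinator is actually implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class ReconstructionRaceSketch {

  /** Hypothetical stand-in for a "container does not exist" error. */
  static class ContainerMissingException extends RuntimeException {
    ContainerMissingException(long id) {
      super("ContainerID " + id + " does not exist");
    }
  }

  enum Outcome { RECONSTRUCTED, SKIPPED_CONTAINER_DELETED }

  // Toy replacement for a datanode's container set: containerID -> block names.
  static final Map<Long, List<String>> CONTAINERS = new ConcurrentHashMap<>();

  // Lists the blocks of a container, failing if the container is gone.
  static List<String> listBlock(long id) {
    List<String> blocks = CONTAINERS.get(id);
    if (blocks == null) {
      throw new ContainerMissingException(id);
    }
    return blocks;
  }

  // Asynchronous delete, independent of any reconstruction in flight.
  static CompletableFuture<Void> deleteAsync(long id) {
    return CompletableFuture.runAsync(() -> { CONTAINERS.remove(id); });
  }

  // Asynchronous reconstruction; a missing source container is reported
  // as a skip instead of propagating the exception.
  static CompletableFuture<Outcome> reconstructAsync(long id) {
    return CompletableFuture.supplyAsync(() -> {
      try {
        listBlock(id); // a real task would go on to rebuild the missing index
        return Outcome.RECONSTRUCTED;
      } catch (ContainerMissingException e) {
        return Outcome.SKIPPED_CONTAINER_DELETED;
      }
    });
  }

  public static void main(String[] args) {
    CONTAINERS.put(51849L, Arrays.asList("block-1", "block-2"));
    // Force the ordering "delete first" so the result is deterministic.
    deleteAsync(51849L).join();
    System.out.println(reconstructAsync(51849L).join());
    // prints SKIPPED_CONTAINER_DELETED
  }
}
```

In the sketch the delete is joined before the reconstruction starts only to make the result reproducible. In a real cluster the two operations can interleave in either order, which is why the warning in the log can appear without indicating lasting damage.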

> ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask: Failed ECReconstructionCommand due to StorageContainerException: ContainerID # does not exist
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-7948
>                 URL: https://issues.apache.org/jira/browse/HDDS-7948
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: SCM
>            Reporter: George Huang
>            Priority: Major
>
>  
>  
> {code:java}
> 2023-02-08 22:24:23,526 ERROR org.apache.hadoop.hdds.scm.XceiverClientGrpc: Failed to execute command ListBlock on the pipeline Pipeline[ Id: 28fc5b89-9b2e-4000-8a98-4fd30ca7268a, Nodes: 28fc5b89-9b2e-4000-8a98-4fd30ca7268a(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx), ReplicationConfig: EC/ECReplicationConfig{data=6, parity=3, ecChunkSize=1048576, codec=rs}, State:CLOSED, leaderId:, CreationTimestamp2023-02-08T22:24:22.119-08:00[America/Los_Angeles]].
> 2023-02-08 22:24:23,527 WARN org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask: Failed ECReconstructionCommand{containerID=51849, replication=rs-6-3-1048576, missingIndexes=[2], sources={1=5a441e9b-9190-40b6-b06a-6630994a95be(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx), 3=c8b694b6-a240-4371-b58d-f4a77fe74967(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx), 4=8332408c-48d5-49b2-88ab-e8dc8b3e3c44(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx), 5=173c02ec-f4a0-4edb-8761-a87ce06ab33c(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx), 6=26caa768-cd1f-4c58-a37a-7f286c948a1c(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx), 7=28fc5b89-9b2e-4000-8a98-4fd30ca7268a(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx), 8=5177742b-a41d-4226-bc10-2e6e6eb43f01(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx), 9=11fd47b8-1249-4c04-b652-12fc383eb7c6(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx)}, targets={2=905af0a7-4069-4c58-97a1-4ba308ac1aac(xxxxxx.xxx.xxxxxxxx.com/xx.xx.xxx.xx)}} after 6005 ms
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 51849 does not exist
>         at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:631)
>         at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$getValidatorList$0(ContainerProtocolCalls.java:638)
>         at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:410)
>         at org.apache.hadoop.hdds.scm.XceiverClientGrpc.lambda$sendCommandWithTraceIDAndRetry$0(XceiverClientGrpc.java:349)
>         at org.apache.hadoop.hdds.tracing.TracingUtil.executeInSpan(TracingUtil.java:177)
>         at org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:151)
>         at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithTraceIDAndRetry(XceiverClientGrpc.java:343)
>         at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:324)
>         at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.listBlock(ContainerProtocolCalls.java:120)
>         at org.apache.hadoop.ozone.container.ec.reconstruction.ECContainerOperationClient.listBlock(ECContainerOperationClient.java:87)
>         at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.getBlockDataMap(ECReconstructionCoordinator.java:432)
>         at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECContainerGroup(ECReconstructionCoordinator.java:144)
>         at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask.run(ECReconstructionCoordinatorTask.java:91)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
>  
>  


