Posted to issues@ozone.apache.org by "Sammi Chen (Jira)" <ji...@apache.org> on 2023/11/29 02:14:00 UTC

[jira] [Commented] (HDDS-3945) ContainerReplicaNotFoundException when remove a replica in ContainerReportHandler

    [ https://issues.apache.org/jira/browse/HDDS-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790866#comment-17790866 ] 

Sammi Chen commented on HDDS-3945:
----------------------------------

Right, [~dteng]. It has not been observed since, and the code has changed a lot since then. Let's close it now.

> ContainerReplicaNotFoundException when remove a replica in ContainerReportHandler
> ---------------------------------------------------------------------------------
>
>                 Key: HDDS-3945
>                 URL: https://issues.apache.org/jira/browse/HDDS-3945
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Assignee: Dave Teng
>            Priority: Major
>
> It's not easy to reproduce.
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Container #54339 is over replicated. Expected replica count is 3, but found 16.
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 826dda09-1259-4c5c-9a80-56b985665dc4{ip: 9.180.6.157, host: host-9-180-6-157, networkLocation: /rack10, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 6f87886a-745b-4eb6-9b4b-54e1f909f20c{ip: 9.180.13.218, host: host-9-180-13-218, networkLocation: /rack2, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode d3336357-8920-4a4e-a12f-e57da1640c4d{ip: 9.180.20.94, host: host-9-180-20-94, networkLocation: /rack1, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 7b4edd6e-5787-4574-9928-810514a05d2b{ip: 9.179.142.222, host: host222, networkLocation: /rack2, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 5b36ed4f-4a6b-4014-b181-235789956d34{ip: 9.180.8.67, host: host-9-180-8-67, networkLocation: /rack10, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode d35f7754-3914-4e3a-ac91-4ae26e08e8a7{ip: 9.180.19.144, host: host-9-180-19-144, networkLocation: /rack3, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode db854037-4846-4093-89de-e492e0f14239{ip: 9.179.142.198, host: host198, networkLocation: /rack3, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 228dacd3-36cf-4473-93ec-c06a739a8a2d{ip: 9.180.8.87, host: host-9-180-8-87, networkLocation: /rack10, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 2e1b2fdd-f8fb-4252-bfc1-31d5339681be{ip: 9.179.144.104, host: host-9-179-144-104, networkLocation: /rack2, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 1904b912-998d-43ba-9e54-f7e7c40c1759{ip: 9.180.21.100, host: host-9-180-21-100, networkLocation: /rack2, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode dd64e953-bdef-4dae-a4c5-51aa7114ea0a{ip: 9.180.8.40, host: host-9-180-8-40, networkLocation: /rack10, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 47cdfded-e88f-44f3-81b9-4f95e65e364f{ip: 9.180.8.78, host: host-9-180-8-78, networkLocation: /rack10, certSerialId: null}
> 2020-07-04 16:14:19,820 [ReplicationMonitor] INFO org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete container command for container #54339 to datanode 11974d80-c4ff-4963-81fa-873888feaa24{ip: 9.180.8.58, host: host-9-180-8-58, networkLocation: /rack10, certSerialId: null}
> 2020-07-04 16:18:29,709 [EventQueue-ContainerReportForContainerReportHandler] ERROR org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Exception while processing container report for container 54339 from datanode 7b4edd6e-5787-4574-9928-810514a05d2b{ip: 9.179.142.222, host: host222, networkLocation: /rack2, certSerialId: null}.
> org.apache.hadoop.hdds.scm.container.ContainerReplicaNotFoundException: Container #54339, replica: ContainerReplica{containerID=#54339, datanodeDetails=7b4edd6e-5787-4574-9928-810514a05d2b{ip: 9.179.142.222, host: host222, networkLocation: /rack2, certSerialId: null}, placeOfBirth=ca0dedd0-f586-4f99-986b-3a953dfc2dde, sequenceId=4249}
>         at org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.removeContainerReplica(ContainerStateMap.java:256)
>         at org.apache.hadoop.hdds.scm.container.ContainerStateManager.removeContainerReplica(ContainerStateManager.java:534)
>         at org.apache.hadoop.hdds.scm.container.SCMContainerManager.removeContainerReplica(SCMContainerManager.java:560)
>         at org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerReplica(AbstractContainerReportHandler.java:234)
>         at org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:81)
>         at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:163)
>         at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:131)
>         at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:51)
>         at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)


