Posted to issues@ozone.apache.org by "Siddharth Wagle (Jira)" <ji...@apache.org> on 2020/01/29 05:57:00 UTC

[jira] [Commented] (HDDS-2794) Failed to close QUASI_CLOSED container

    [ https://issues.apache.org/jira/browse/HDDS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025627#comment-17025627 ] 

Siddharth Wagle commented on HDDS-2794:
---------------------------------------

Hi [~Sammi], I think the above error can probably be ignored. cc: [~nanda]

Explanation of QUASI_CLOSED state:
{noformat}
For Raft replicas that achieve consistency via quorum consensus, the very act of closing a container requires a quorum of nodes. In the presence of node failures and network partitions, consensus may not always be achievable. However, we cannot leave containers in an open state indefinitely. Containers must be closed so they can be safely re-replicated and brought to full replication for fault tolerance.

The quasi-closed state was introduced to solve this problem. An open container may be marked as quasi-closed without quorum consensus, thus making it immutable and available for re-replication. A separate quasi-closed state makes it explicit that the replica was closed without consensus and may not have the most up-to-date state for the container.

If the missing replicas become available in the future, e.g. if a network partition is healed and we achieve a quorum of replicas, we can unambiguously close the replicas with the highest BCS (block commit sequence ID) and delete the rest.
{noformat}
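
To make the last paragraph concrete, here is a minimal Python sketch of that resolution rule. It is not the SCM implementation: the names Replica, ReplicaState and plan_quasi_closed_resolution are hypothetical, the quorum is assumed to be a simple majority of the replication factor, and the bcs field is assumed to correspond to the sequenceId shown in the replica reports.

```python
from dataclasses import dataclass
from enum import Enum


class ReplicaState(Enum):
    # Subset of replica states, named after the values seen in the reports.
    OPEN = "OPEN"
    QUASI_CLOSED = "QUASI_CLOSED"
    CLOSED = "CLOSED"
    UNHEALTHY = "UNHEALTHY"


@dataclass(frozen=True)
class Replica:
    datanode: str        # hypothetical datanode label
    state: ReplicaState
    bcs: int             # assumed to be the replica's reported sequence ID


def plan_quasi_closed_resolution(replicas, replication_factor=3):
    """Return (to_close, to_delete) lists of datanode labels.

    Assumed rule: act only when a majority of the expected replicas report
    QUASI_CLOSED; otherwise leave every replica untouched.
    """
    quorum = replication_factor // 2 + 1
    quasi = [r for r in replicas if r.state is ReplicaState.QUASI_CLOSED]
    if len(quasi) < quorum:
        return [], []
    highest = max(r.bcs for r in quasi)
    # Replicas at the highest BCS can be closed; stale ones can be deleted.
    to_close = [r.datanode for r in quasi if r.bcs == highest]
    to_delete = [r.datanode for r in quasi if r.bcs < highest]
    return to_close, to_delete


# Illustrative input: all three replicas are reachable again.
healed = [
    Replica("dn1", ReplicaState.QUASI_CLOSED, 2342),
    Replica("dn2", ReplicaState.QUASI_CLOSED, 2342),
    Replica("dn3", ReplicaState.QUASI_CLOSED, 2209),
]
print(plan_quasi_closed_resolution(healed))

# Illustrative input: only one quasi-closed replica, so no quorum.
partitioned = [
    Replica("dn1", ReplicaState.QUASI_CLOSED, 2342),
    Replica("dn2", ReplicaState.UNHEALTHY, 2209),
    Replica("dn3", ReplicaState.UNHEALTHY, 2209),
]
print(plan_quasi_closed_resolution(partitioned))
```

Under these assumptions the first call plans to close dn1 and dn2 and delete dn3, while the second returns two empty lists, i.e. the container simply stays QUASI_CLOSED until more replicas are available.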





> Failed to close QUASI_CLOSED container
> --------------------------------------
>
>                 Key: HDDS-2794
>                 URL: https://issues.apache.org/jira/browse/HDDS-2794
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Priority: Critical
>
> 2019-12-24 20:19:55,154 INFO org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Process replica:ContainerReplica{containerID=#283, datanodeDetails=ed90869c-317e-4303-8922-9fa83a3983cb{ip: 10.120.113.172, host: host172, networkLocation: /rack2, certSerialId: null}, placeOfBirth=ed90869c-317e-4303-8922-9fa83a3983cb, sequenceId=2342, state=QUASI_CLOSED}
> 2019-12-24 20:20:02,258 INFO org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Process replica:ContainerReplica{containerID=#283, datanodeDetails=1da74a1d-f64d-4ad4-b04c-85f26687e683{ip: 10.121.124.44, host: host044, networkLocation: /rack2, certSerialId: null}, placeOfBirth=1da74a1d-f64d-4ad4-b04c-85f26687e683, sequenceId=2209, state=UNHEALTHY}
> 2019-12-24 20:20:03,167 INFO org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Process replica:ContainerReplica{containerID=#283, datanodeDetails=b65b0b6c-b0bb-429f-a23d-467c72d4b85c{ip: 10.120.139.111, host: host111, networkLocation: /rack1, certSerialId: null}, placeOfBirth=b65b0b6c-b0bb-429f-a23d-467c72d4b85c, sequenceId=2209, state=UNHEALTHY}
> 2019-12-24 20:20:03,168 INFO org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close container Event triggered for container : #283
> 2019-12-24 20:20:03,169 WARN org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Cannot close container #283, which is in QUASI_CLOSED state.
> ozone scmcli container list -s=283
> {
>   "state" : "QUASI_CLOSED",
>   "replicationFactor" : "THREE",
>   "replicationType" : "RATIS",
>   "usedBytes" : 872715244,
>   "numberOfKeys" : 9,
>   "lastUsed" : 14385015083,
>   "stateEnterTime" : 14313955037,
>   "owner" : "d0e31665-ba27-45ad-b576-67cd1bccc50b",
>   "containerID" : 283,
>   "deleteTransactionId" : 0,
>   "sequenceId" : 0,
>   "open" : false
> }
> ozone scmcli container info 283
> Loaded properties from hadoop-metrics2.properties
> Scheduled Metric snapshot period at 10 second(s).
> XceiverClientMetrics metrics system started
> Container id: 283
> Container State: CLOSED
> Container Path: /data5/hdds/df508c61-3ae7-413f-ab9d-e00d9125de70/current/containerDir0/283/metadata
> Container Metadata: 
> Datanodes: [host172]


