Posted to issues@ozone.apache.org by "Duong (Jira)" <ji...@apache.org> on 2023/01/06 17:49:00 UTC
[jira] [Assigned] (HDDS-7738) SCM terminates when adding container to a closed pipeline
[ https://issues.apache.org/jira/browse/HDDS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duong reassigned HDDS-7738:
---------------------------
Assignee: Duong
> SCM terminates when adding container to a closed pipeline
> ---------------------------------------------------------
>
> Key: HDDS-7738
> URL: https://issues.apache.org/jira/browse/HDDS-7738
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Duong
> Assignee: Duong
> Priority: Critical
>
> This is similar to HDDS-5843, but in a different scenario.
>
> An Ozone customer encountered this issue after a container (c1) was allocated on a newly created pipeline (p1). The chain of events is as follows:
> # SCM processes the pipeline creation transaction for *p1* => *p1* is {*}created{*}.
> # SCM receives a request from a datanode to close *p1* (see the previous comment)
> => *p1* is {*}closed{*}.
> => SCM also tries to find and close the relevant containers. At this point, container *c1* doesn't *exist* yet, so it {*}can't be closed{*}.
> # SCM processes the container *c1* allocation transaction => it fails because *p1* is already *closed*.
> => SCM terminates, and neither transaction #1 nor #3 is committed (as Ratis commits transactions in chunks).
> Because the transactions are not committed, whenever SCM restarts, it goes through the same steps #1 and #3 and terminates again.
> Solution: SCM should not terminate when adding a container to a closed pipeline. The fix is similar to HDDS-5843.
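
The restart loop described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Ozone code: the class and method names (ClosedPipelineSketch, Scm, ScmTerminated, replay) are hypothetical, the pipeline close is assumed to land between the two transactions on every replay, and "tolerating" the closed pipeline is modeled as simply recording the container, which may differ from what the actual patch does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the event chain; none of these names are real Ozone classes.
class ClosedPipelineSketch {
  enum PipelineState { OPEN, CLOSED }

  /** Stands in for SCM terminating itself on a failed transaction. */
  static class ScmTerminated extends RuntimeException {
    ScmTerminated(String message) {
      super(message);
    }
  }

  static class Scm {
    private final boolean tolerateClosedPipeline;
    final Map<String, PipelineState> pipelines = new HashMap<>();
    final Map<String, String> containers = new HashMap<>();

    Scm(boolean tolerateClosedPipeline) {
      this.tolerateClosedPipeline = tolerateClosedPipeline;
    }

    // Step 1: the pipeline creation transaction.
    void createPipeline(String pipeline) {
      pipelines.put(pipeline, PipelineState.OPEN);
    }

    // Step 2: close request; c1 does not exist yet, so no container is closed here.
    void closePipeline(String pipeline) {
      pipelines.put(pipeline, PipelineState.CLOSED);
    }

    // Step 3: the container allocation transaction.
    void addContainer(String container, String pipeline) {
      boolean closed = pipelines.get(pipeline) == PipelineState.CLOSED;
      if (closed && !tolerateClosedPipeline) {
        throw new ScmTerminated(
            "cannot add " + container + " to closed pipeline " + pipeline);
      }
      // Assumed handling for the proposed behavior: keep the container record.
      containers.put(container, pipeline);
    }
  }

  /** Replays steps 1 to 3 once; returns true if SCM survives the replay. */
  static boolean replay(boolean tolerateClosedPipeline) {
    Scm scm = new Scm(tolerateClosedPipeline);
    try {
      scm.createPipeline("p1");
      scm.closePipeline("p1");
      scm.addContainer("c1", "p1");
      return true;
    } catch (ScmTerminated e) {
      // Nothing was committed, so the next restart replays the same steps.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("current behavior, replay completes: " + replay(false));
    System.out.println("proposed behavior, replay completes: " + replay(true));
  }
}
```

In this model, replay(false) fails on every attempt because the same inputs are replayed after each restart, while replay(true) completes and leaves c1 recorded against the closed p1.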
--
This message was sent by Atlassian Jira
(v8.20.10#820010)