Posted to issues@ozone.apache.org by "Pratyush Bhatt (Jira)" <ji...@apache.org> on 2023/10/13 05:47:00 UTC

[jira] [Comment Edited] (HDDS-9437) Intermittent failure in test-scm-decommission

    [ https://issues.apache.org/jira/browse/HDDS-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774797#comment-17774797 ] 

Pratyush Bhatt edited comment on HDDS-9437 at 10/13/23 5:46 AM:
----------------------------------------------------------------

I discussed this with Attila offline.
Both logs you attached, [1|https://github.com/adoroszlai/ozone-build-results/blob/master/2023/10/11/25931/acceptance-HA-secure/output.log] and [2|https://github.com/adoroszlai/ozone-build-results/blob/master/2023/10/11/25918/acceptance-HA-secure/output.log], show the following output for that test:
{code:java}
Decommissioned Scm 02553093-757f-4f09-88c2-97cfa9cf0c62
ozone admin scm decommission --nodeid=02553093-757f-4f09-88c2-97cfa9cf0c62 | grep Decommissioned succeed
==============================================================================
Scm-Decommission :: Test Ozone SCM Decommissioning 
==============================================================================
Decommission SCM Primordial Node jstack 7 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_datanode1_1_HddsDatanodeService.stack
jstack 8 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_datanode2_1_HddsDatanodeService.stack
jstack 7 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_datanode3_1_HddsDatanodeService.stack
jstack 7 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_datanode4_1_HddsDatanodeService.stack
jstack 7 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_om1_1_OzoneManagerStarter.stack
jstack 7 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_om2_1_OzoneManagerStarter.stack
jstack 7 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_om3_1_OzoneManagerStarter.stack
jstack 6 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_recon_1_ReconServer.stack
jstack 7 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_s3g_1_Gateway.stack
jstack 7 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_scm1.org_1_StorageContainerManagerStarter.stack
jstack 6 > /home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozonesecure-ha/result/ozonesecure-ha_scm2.org_1_StorageContainerManagerStarter.stack
Error response from daemon: Container bb20faaac3a3f5513efe916ea8f3cc2775091c2cbd06a1f1c83f56e899c6faeb is not running{code}
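
A log like the one above can be checked mechanically. The following is only a rough sketch: {{scan_log}} and its result keys are hypothetical names, and the status pattern assumes the usual Robot Framework console layout, where a finished test line ends in a {{| PASS |}} or {{| FAIL |}} column. It reports whether a test line appeared, whether a status was printed for it, and which containers Docker reported as not running.

```python
import re

# Message Docker printed in the log above for a stopped container.
NOT_RUNNING = re.compile(
    r"Error response from daemon: Container ([0-9a-f]+) is not running"
)
# Assumption: a finished test line ends with a status column such as "| PASS |".
STATUS = re.compile(r"\|\s*(PASS|FAIL|SKIP)\s*\|\s*$")


def scan_log(text, test_name):
    """Summarize what the log says about one test (hypothetical helper)."""
    result = {"test_started": False, "status": None, "stopped_containers": []}
    for line in text.splitlines():
        # A test line begins with the test name; the status, if any, is at its end.
        if line.startswith(test_name):
            result["test_started"] = True
            match = STATUS.search(line)
            if match:
                result["status"] = match.group(1)
        # Collect the IDs of containers that Docker says are not running.
        stopped = NOT_RUNNING.search(line)
        if stopped:
            result["stopped_containers"].append(stopped.group(1))
    return result
```

For the output above, calling {{scan_log}} with the test name "Decommission SCM Primordial Node" would find the test line, no status for it, and one stopped container.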

The output does not say that the test failed.
The problem is that the test decommissions the very node we exec’ed from, i.e. [scm3.org|http://scm3.org/].

I will now run the robot test in s3g instead.
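
The idea behind the change can be sketched as follows. This is an illustration only, not the actual acceptance-test code: {{pick_exec_container}} is a hypothetical helper, and the container list is just the names that appear in this ticket. The point is that the container the test is exec'ed from must never be the one being decommissioned.

```python
def pick_exec_container(containers, decommission_target, preferred="s3g"):
    """Choose a container to run the test from (hypothetical helper).

    The decommission target is excluded, because it may stop while the
    test is still running inside it.
    """
    candidates = [name for name in containers if name != decommission_target]
    if not candidates:
        raise ValueError("no container left to run the test from")
    # Prefer a non-SCM container such as s3g when one is available.
    return preferred if preferred in candidates else candidates[0]


# Container names taken from this ticket; the list itself is illustrative.
containers = ["scm1.org", "scm2.org", "scm3.org", "s3g"]
print(pick_exec_container(containers, decommission_target="scm3.org"))
```

With s3g in the list, the sketch picks s3g; without it, it falls back to the first container that is not the decommission target.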



> Intermittent failure in test-scm-decommission
> ---------------------------------------------
>
>                 Key: HDDS-9437
>                 URL: https://issues.apache.org/jira/browse/HDDS-9437
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: 1.4.0
>            Reporter: Attila Doroszlai
>            Assignee: Pratyush Bhatt
>            Priority: Major
>
> {code}
> Decommissioned Scm 02553093-757f-4f09-88c2-97cfa9cf0c62
> ozone admin scm decommission --nodeid=02553093-757f-4f09-88c2-97cfa9cf0c62 | grep Decommissioned succeed
> ==============================================================================
> Scm-Decommission :: Test Ozone SCM Decommissioning                            
> ==============================================================================
> Decommission SCM Primordial Node                                      jstack ...
> {code}
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2023/10/11/25918/acceptance-HA-secure/output.log
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2023/10/11/25931/acceptance-HA-secure/output.log
> CC [~Sammi] [~sgal] [~nanda]



