Posted to issues@ozone.apache.org by "Ethan Rose (Jira)" <ji...@apache.org> on 2021/10/20 20:38:10 UTC

[jira] [Updated] (HDDS-3358) Intermittent test failure related to a race condition during PipelineManager close

     [ https://issues.apache.org/jira/browse/HDDS-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Rose updated HDDS-3358:
-----------------------------
    Target Version/s: 1.3.0  (was: 1.2.0)

I am managing the 1.2.0 release and we currently have more than 600 issues targeted for 1.2.0. I am moving the target field to 1.3.0.

If you are actively working on this Jira and believe it should be targeted for the 1.2.0 release, please reach out to me via Apache email or Slack.

> Intermittent test failure related to a race condition during PipelineManager close
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-3358
>                 URL: https://issues.apache.org/jira/browse/HDDS-3358
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: test
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Major
>              Labels: TriagePending, flaky-test, ozone-flaky-test
>         Attachments: org.apache.hadoop.hdds.scm.node.TestSCMNodeManager-output.txt
>
>
> The test that failed:
> TestSCMNodeManager
> The end of the log is:
> {code}
> 2020-04-08 10:49:44,544 ERROR events.SingleThreadExecutor (SingleThreadExecutor.java:lambda$onMessage$1(84)) - Error on execution message 19844615-0d70-4172-8c34-96e5b7295ef2{ip: 196.189.243.187, host: localhost-196.189.243.187, networkLocation: /default-rack, certSerialId: null}
> java.lang.NullPointerException
>         at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.finalizeAndDestroyPipeline(SCMPipelineManager.java:380)
>         at org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:63)
>         at org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:38)
>         at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2020-04-08 10:49:44,544 INFO  node.StaleNodeHandler (StaleNodeHandler.java:onMessage(58)) - Datanode 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null} moved to stale state. Finalizing its pipelines [PipelineID=fd1f9e92-2f90-43e7-8406-94ba6ac356b0, PipelineID=8d380e3c-b632-4bda-aa7a-554774fba09d]
> 2020-04-08 10:49:44,544 INFO  pipeline.SCMPipelineManager (SCMPipelineManager.java:finalizeAndDestroyPipeline(373)) - Destroying pipeline:Pipeline[ Id: fd1f9e92-2f90-43e7-8406-94ba6ac356b0, Nodes: 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:ALLOCATED, leaderId:null, CreationTimestamp2020-04-08T10:49:37.441Z]
> 2020-04-08 10:49:44,544 INFO  pipeline.PipelineStateManager (PipelineStateManager.java:finalizePipeline(120)) - Pipeline Pipeline[ Id: fd1f9e92-2f90-43e7-8406-94ba6ac356b0, Nodes: 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:CLOSED, leaderId:null, CreationTimestamp2020-04-08T10:49:37.441Z] moved to CLOSED state
> 2020-04-08 10:49:44,544 ERROR events.SingleThreadExecutor (SingleThreadExecutor.java:lambda$onMessage$1(84)) - Error on execution message 0914e56d-c7f8-4e0a-8fd1-845a9806172b{ip: 57.46.156.17, host: localhost-57.46.156.17, networkLocation: /default-rack, certSerialId: null}
> java.lang.NullPointerException
>         at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.finalizeAndDestroyPipeline(SCMPipelineManager.java:380)
>         at org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:63)
>         at org.apache.hadoop.hdds.scm.node.StaleNodeHandler.onMessage(StaleNodeHandler.java:38)
>         at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2020-04-08 10:49:44,544 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$close$4(208)) - Send pipeline:PipelineID=e0e155c6-9fbe-46a7-b742-e805ea9baacf close command to datanode 30a24b04-1289-4c30-a28a-034edfe29e3d
> 2020-04-08 10:49:44,545 WARN  events.EventQueue (EventQueue.java:fireEvent(151)) - Processing of TypedEvent{payloadType=CommandForDatanode, name='Datanode_Command'} is skipped, EventQueue is not running
> 2020-04-08 10:49:44,544 INFO  node.StaleNodeHandler (StaleNodeHandler.java:onMessage(58)) - Datanode 59bdd26b-05da-47d1-8c3f-8350d55d7299{ip: 248.147.58.17, host: localhost-248.147.58.17, networkLocation: /default-rack, certSerialId: null} moved to stale state. Finalizing its pipelines [PipelineID=17b032b7-b9c4-41eb-bba6-50106881886d, PipelineID=60de1ca6-4115-415b-bbf1-06b86113df94]
> 2020-04-08 10:49:44,576 WARN  server.ServerUtils (ServerUtils.java:getScmDbDir(148)) - ozone.scm.db.dirs is not configured. We recommend adding this setting. Falling back to ozone.metadata.dirs instead.
> 2020-04-08 10:49:44,579 WARN  server.ServerUtils (ServerUtils.java:getScmDbDir(148)) - ozone.scm.db.dirs is not configured. We recommend adding this setting. Falling back to ozone.metadata.dirs instead.
> 2020-04-08 10:49:44,579 WARN  db.DBDefinition (DBDefinition.java:createDBStoreBuilder(63)) - ozone.scm.db.dirs is not configured. We recommend adding this setting. Falling back to ozone.metadata.dirs instead.
> {code}
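
The trace above does not show which reference was null at SCMPipelineManager.java:380. One plausible reading, given the "EventQueue is not running" warning in the same millisecond range, is that a stale-node event was still being handled while the pipeline manager was shutting down. The following is a minimal sketch of guarding against that kind of race, not the actual Ozone code or fix: GuardedPipelineManager, its fields, and its method signatures are hypothetical stand-ins, and releasing state in close() is an assumption made only to reproduce the hazard.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a pipeline manager; NOT the real SCMPipelineManager.
final class GuardedPipelineManager implements AutoCloseable {
    // Assumption: close() releases this state, so later access would hit null.
    private Map<String, String> pipelineStates = new HashMap<>();
    private boolean closed = false;

    // All methods share the object monitor, so close() cannot interleave
    // with an in-flight finalizeAndDestroyPipeline() call.
    synchronized void addPipeline(String pipelineId) {
        if (closed) {
            throw new IllegalStateException("manager is closed");
        }
        pipelineStates.put(pipelineId, "ALLOCATED");
    }

    // Returns true if the pipeline was removed, false if it was unknown
    // or if the manager was already closed when the event arrived.
    synchronized boolean finalizeAndDestroyPipeline(String pipelineId) {
        if (closed) {
            // A late event (e.g. from a stale-node handler) is skipped
            // instead of dereferencing the released state.
            return false;
        }
        return pipelineStates.remove(pipelineId) != null;
    }

    @Override
    public synchronized void close() {
        closed = true;
        pipelineStates = null; // simulate resources released on shutdown
    }
}
```

With this guard, a handler that calls finalizeAndDestroyPipeline after close() gets false back rather than a NullPointerException. Whether silently skipping late events is acceptable, or whether the event queue should instead be drained before the manager is closed, is a design decision this sketch does not settle.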


