You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by "Anu Engineer (Jira)" <ji...@apache.org> on 2019/12/04 20:25:00 UTC

[jira] [Resolved] (HDDS-2646) Start acceptance tests only if at least one THREE pipeline is available

     [ https://issues.apache.org/jira/browse/HDDS-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer resolved HDDS-2646.
--------------------------------
    Fix Version/s: 0.5.0
       Resolution: Fixed

Committed to master. Thanks for the contribution.

> Start acceptance tests only if at least one THREE pipeline is available
> -----------------------------------------------------------------------
>
>                 Key: HDDS-2646
>                 URL: https://issues.apache.org/jira/browse/HDDS-2646
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.5.0
>
>         Attachments: docker-ozoneperf-ozoneperf-basic-scm.log
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> After HDDS-2034 (or even before?) pipeline creation (or the status transition from ALLOCATE to OPEN) requires at least one pipeline report from all of the datanodes. Which means that the cluster might not be usable even if it's out from the safe mode AND there are at least three datanodes.
> It makes all the acceptance tests unstable.
> For example in [this|https://github.com/apache/hadoop-ozone/pull/263/checks?check_run_id=324489319] run.
> {code:java}
> scm_1         | 2019-11-28 11:22:54,401 INFO pipeline.RatisPipelineProvider: Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to datanode 548f146f-2166-440a-b9f1-83086591ae26
> scm_1         | 2019-11-28 11:22:54,402 INFO pipeline.RatisPipelineProvider: Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to datanode dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c
> scm_1         | 2019-11-28 11:22:54,404 INFO pipeline.RatisPipelineProvider: Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb create command to datanode 47dbb8e4-bbde-4164-a798-e47e8c696fb5
> scm_1         | 2019-11-28 11:22:54,405 INFO pipeline.PipelineStateManager: Created pipeline Pipeline[ Id: 8dc4aeb6-5ae2-46a0-948d-287c97dd81fb, Nodes: 548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, certSerialId: null}dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, certSerialId: null}47dbb8e4-bbde-4164-a798-e47e8c696fb5{ip: 172.24.0.2, host: ozoneperf_datanode_2.ozoneperf_default, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:THREE, State:ALLOCATED]
> scm_1         | 2019-11-28 11:22:56,975 INFO pipeline.PipelineReportHandler: Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, certSerialId: null}
> scm_1         | 2019-11-28 11:22:58,018 INFO pipeline.PipelineReportHandler: Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, certSerialId: null}
> scm_1         | 2019-11-28 11:23:01,871 INFO pipeline.PipelineReportHandler: Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, certSerialId: null}
> scm_1         | 2019-11-28 11:23:02,817 INFO pipeline.PipelineReportHandler: Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 548f146f-2166-440a-b9f1-83086591ae26{ip: 172.24.0.10, host: ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack, certSerialId: null}
> scm_1         | 2019-11-28 11:23:02,847 INFO pipeline.PipelineReportHandler: Pipeline THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack, certSerialId: null} {code}
> As you can see the pipeline is created but the the cluster is not usable as it's not yet reporter back by datanode_2:
> {code:java}
> scm_1         | 2019-11-28 11:23:13,879 WARN block.BlockManagerImpl: Pipeline creation failed for type:RATIS factor:THREE. Retrying get pipelines c
> all once.
> scm_1         | org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot create pipeline of factor 3 using 0 nodes.{code}
>  The quick fix is to configure all the compose clusters to wait until one pipeline is available. This can be done by adjusting the number of the required datanodes:
> {code:java}
> // We only care about THREE replica pipeline
> int minHealthyPipelines = minDatanodes /
>     HddsProtos.ReplicationFactor.THREE_VALUE; {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org