Posted to issues@ozone.apache.org by "Attila Doroszlai (Jira)" <ji...@apache.org> on 2021/08/19 13:37:00 UTC

[jira] [Comment Edited] (HDDS-3907) Intermittent failure in writing data in acceptance test

    [ https://issues.apache.org/jira/browse/HDDS-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401690#comment-17401690 ] 

Attila Doroszlai edited comment on HDDS-3907 at 8/19/21, 1:36 PM:
------------------------------------------------------------------

This is still happening; see https://github.com/elek/ozone-build-results/tree/master/2021/08/19/9810/acceptance-secure for logs.

{code:title=https://github.com/apache/ozone/runs/3368353893#step:5:126}
Start freon testing                                                   | FAIL |
{code}

{code:title=robot log.html}
07:19:23.258	INFO	Running command 'ozone freon randomkeys --num-of-volumes 5 --num-of-buckets 5 --num-of-keys 5 --num-of-threads 1 --replication-type RATIS --factor THREE --validate-writes 2>&1'.	
07:24:23.225	FAIL	Test timeout 5 minutes exceeded.
{code}

{code}
datanode_3  | 2021-08-19 05:20:09,598 [java.util.concurrent.ThreadPoolExecutor$Worker@5f5ccab7[State = -1, empty queue]] WARN server.GrpcLogAppender: 1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899->25dd9de7-1caa-448d-a35a-2b29afced1cc-GrpcLogAppender:  appendEntries Timeout, request=AppendEntriesRequest:cid=8,entriesCount=1,lastEntry=(t:3, i:0)
...
datanode_3  | 2021-08-19 05:23:56,577 [Thread-181] INFO client.GrpcClientProtocolService: Failed RaftClientRequest:client-14C4D4C86555->1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899, cid=102, seq=0, Watch-ALL_COMMITTED(131), Message:<EMPTY>, reply=RaftClientReply:client-14C4D4C86555->1c7f86b2-ded3-441b-9f20-84ba3ff60d2d@group-74FBCD15D899, cid=102, FAILED org.apache.ratis.protocol.exceptions.NotReplicatedException: Request with call Id 102 and log index 131 is not yet replicated to ALL_COMMITTED, logIndex=131, commits[1c7f86b2-ded3-441b-9f20-84ba3ff60d2d:c132, 64230e6f-d613-4ced-8084-22c404c29d15:c132, 25dd9de7-1caa-448d-a35a-2b29afced1cc:c127]
{code}
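
A minimal sketch for reading the NotReplicatedException line above: it extracts `logIndex` and the `commits[...]` list and reports any peer whose commit index is below the watched log index. The helper name `lagging_peers` and the regular expressions are illustrative assumptions, not part of Ozone or Ratis, and the interpretation (a peer is behind when its commit index is lower than `logIndex`) is my reading of the message rather than a documented rule.

```python
import re

# Excerpt of the NotReplicatedException message from the datanode_3 log above.
LOG_LINE = (
    "Request with call Id 102 and log index 131 is not yet replicated to "
    "ALL_COMMITTED, logIndex=131, "
    "commits[1c7f86b2-ded3-441b-9f20-84ba3ff60d2d:c132, "
    "64230e6f-d613-4ced-8084-22c404c29d15:c132, "
    "25dd9de7-1caa-448d-a35a-2b29afced1cc:c127]"
)


def lagging_peers(line):
    """Return {peer_id: commit_index} for peers whose commit index is below logIndex.

    Hypothetical helper: assumes the message contains 'logIndex=<n>' and a
    'commits[<peer>:c<n>, ...]' list in the format shown in the log excerpt.
    """
    log_index = int(re.search(r"logIndex=(\d+)", line).group(1))
    commits_body = re.search(r"commits\[([^\]]*)\]", line).group(1)
    commits = {
        peer: int(index)
        for peer, index in re.findall(r"([0-9a-f-]+):c(\d+)", commits_body)
    }
    return {peer: idx for peer, idx in commits.items() if idx < log_index}


if __name__ == "__main__":
    for peer, idx in lagging_peers(LOG_LINE).items():
        print(f"{peer} is behind: commit index {idx}")
```

For this log line the only peer reported is 25dd9de7-1caa-448d-a35a-2b29afced1cc, the same follower named in the earlier appendEntries Timeout warning.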

{code}
datanode_2  | 2021-08-19 05:18:42,242 [Command processor thread] WARN commandhandler.CreatePipelineCommandHandler: Add group failed for 1c7f86b2-ded3-441b-9f20-84ba3ff60d2d{ip: 172.18.0.9, host: ozonesecure_datanode_3.ozonesecure_default, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default-rack, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}
datanode_2  | java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason
{code}



> Intermittent failure in writing data in acceptance test
> -------------------------------------------------------
>
>                 Key: HDDS-3907
>                 URL: https://issues.apache.org/jira/browse/HDDS-3907
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Marton Elek
>            Priority: Blocker
>
> Examples:
> https://github.com/elek/ozone-build-results/tree/master/2020/06/30/1318/acceptance
> https://github.com/elek/ozone-build-results/tree/master/2020/06/30/1321/acceptance
> https://github.com/elek/ozone-build-results/tree/master/2020/06/30/1334/acceptance
> Some strange errors:
> {code}
> scm_1         | 2020-06-30 19:17:50,787 [RatisPipelineUtilsThread] ERROR pipeline.SCMPipelineManager: Failed to create pipeline of type RATIS and factor ONE. Exception: Cannot create pipeline of factor 1 using 0 nodes. Used 6 nodes. Healthy nodes 6
> scm_1         | 2020-06-30 19:17:50,788 [RatisPipelineUtilsThread] ERROR pipeline.SCMPipelineManager: Failed to create pipeline of type RATIS and factor THREE. Exception: Pipeline creation failed because nodes are engaged in other pipelines and every node can only be engaged in max 2 pipelines. Required 3. Found 0
> {code}


