Posted to issues@ignite.apache.org by "Aleksey Plekhanov (JIRA)" <ji...@apache.org> on 2018/05/03 11:33:00 UTC

[jira] [Comment Edited] (IGNITE-8400) Flaky failure of IgniteTopologyValidatorGridSplitCacheTest.testTopologyValidatorWithCacheGroup (Grid is in invalid state)

    [ https://issues.apache.org/jira/browse/IGNITE-8400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456600#comment-16456600 ] 

Aleksey Plekhanov edited comment on IGNITE-8400 at 5/3/18 11:32 AM:
--------------------------------------------------------------------

The node is dropped out of the topology because, in some cases, another node (the previous one in the ring) sends a message to this node and cannot get a reply within the given failure detection timeout. To solve this, I set the reconnect count to 2 (this change also disables the failure detection timeout and sets separate timeouts for each IO method invocation). I also removed {{sleep()}} in {{checkSegmented}}, since it doesn't affect the test logic but adds extra delay to the test (with the failure detection timeout disabled, the test runs longer).
Looped test runs on TC [1] after this fix no longer contain the {{Grid is in invalid state}} error. However, the {{Test has been timed out}} error still occurs sometimes (it also occurs with the current implementation). I think a separate ticket should be filed for the {{Test has been timed out}} error after this ticket is merged and new test failure statistics are collected.

[1] https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Cache3&branch_IgniteTests24Java8=pull%2F3930%2Fhead&tab=buildTypeStatusDiv
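
To illustrate why replacing a single failure detection timeout with separate per-operation timeouts and a reconnect count can keep a slow node in the topology, here is a minimal sketch in plain Java. It is a simplified model and not Ignite code: the class {{TimeoutBudgetSketch}}, its methods, and all millisecond values are hypothetical and chosen only for illustration.

```java
// Simplified model (NOT Ignite's implementation): compares one shared time
// budget for a whole message exchange with independent per-operation timeouts
// that may be retried. All names and numbers here are hypothetical.
final class TimeoutBudgetSketch {

    /** One shared budget: the exchange fails once the summed operation time exceeds it. */
    static boolean sharedBudgetSucceeds(long[] opDurationsMs, long budgetMs) {
        long spentMs = 0;
        for (long d : opDurationsMs) {
            spentMs += d;
            if (spentMs > budgetMs)
                return false;
        }
        return true;
    }

    /**
     * Separate timeouts: every operation is checked against its own limit,
     * and the exchange may be attempted up to reconnectCount times.
     */
    static boolean perOperationSucceeds(long[][] attemptsMs, long opTimeoutMs, int reconnectCount) {
        int tries = Math.min(reconnectCount, attemptsMs.length);
        for (int i = 0; i < tries; i++) {
            if (allWithin(attemptsMs[i], opTimeoutMs))
                return true;
        }
        return false;
    }

    /** True if every duration is within the given limit. */
    private static boolean allWithin(long[] durationsMs, long limitMs) {
        for (long d : durationsMs) {
            if (d > limitMs)
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Three IO steps of 400 ms each: 1200 ms in total.
        long[] slowExchange = {400, 400, 400};
        System.out.println(sharedBudgetSucceeds(slowExchange, 1000)); // false: 1200 > 1000

        // First attempt has one step over the 500 ms limit, the second attempt is fine.
        long[][] attempts = {{400, 700, 400}, {400, 400, 400}};
        System.out.println(perOperationSucceeds(attempts, 500, 2)); // true: second attempt passes
    }
}
```

In this model, an exchange that is slow overall but whose individual steps are each acceptable fails under the shared budget and passes under per-operation timeouts, and a second attempt absorbs a single slow step. The real change is presumably made through the discovery SPI's reconnect count setting (for example {{TcpDiscoverySpi.setReconnectCount(2)}}); the exact test code is not shown in this message.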



was (Author: alex_pl):
The node is dropped out of the topology because, in some cases, another node (the previous one in the ring) can't send a message to this node and get a reply within the given failure detection timeout. To solve this, I set the reconnect count to 2 (this change also disables the failure detection timeout and sets separate timeouts for each IO method invocation). I also removed {{sleep()}} in {{checkSegmented}}, since it doesn't affect the test logic but adds extra delay to the test (with the failure detection timeout disabled, the test runs longer).
Looped test runs on TC [1] after this fix no longer contain the {{Grid is in invalid state}} error. However, the {{Test has been timed out}} error still occurs sometimes (it also occurs with the current implementation). I think a separate ticket should be filed for the {{Test has been timed out}} error after this ticket is merged and new test failure statistics are collected.

[1] https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Cache3&branch_IgniteTests24Java8=pull%2F3930%2Fhead&tab=buildTypeStatusDiv


> Flaky failure of IgniteTopologyValidatorGridSplitCacheTest.testTopologyValidatorWithCacheGroup (Grid is in invalid state)
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8400
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8400
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Aleksey Plekhanov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.6
>
>
> The test sometimes fails on TeamCity with the following exception:
> {noformat}
> java.lang.IllegalStateException: Grid is in invalid state to perform this operation. It either not started yet or has already being or have stopped [igniteInstanceName=cache.IgniteTopologyValidatorGridSplitCacheTest6, state=STOPPED]
> {noformat}
> Before this exception, the node is dropped out of the topology by the coordinator:
> {noformat}
> [tcp-disco-msg-worker-#7831%cache.IgniteTopologyValidatorGridSplitCacheTest6%][IgniteCacheTopologySplitAbstractTest$SplitTcpDiscoverySpi] Node is out of topology (probably, due to short-time network problems).
> {noformat}


