Posted to issues@ignite.apache.org by "Mirza Aliev (Jira)" <ji...@apache.org> on 2023/05/03 12:50:00 UTC

[jira] [Updated] (IGNITE-19255) Fix broken unit tests in distribution-zones module

     [ https://issues.apache.org/jira/browse/IGNITE-19255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mirza Aliev updated IGNITE-19255:
---------------------------------
    Description: 
In IGNITE-19105 I've changed some internal shenanigans of the MetaStorageManager (without affecting its API in any way). After that, nearly all unit tests in the {{distribution-zones}} module started to fail. It turned out that this happened because the tests rely heavily on mocks that emulate the behavior of the Meta Storage. So I decided to replace the mocks with the {{StandaloneMetaStorageManager}} implementation, and all hell broke loose: many tests emulate the Meta Storage incorrectly, and a lot of races appeared because many methods became truly asynchronous.

This situation is very frustrating: the internals of a different component were changed with no API changes, and a completely unrelated module is no longer able to pass its tests. Though I fixed most of the failures, some tests are still failing, and I'm going to try to describe what's wrong with them:

*{{DistributionZoneManagerScaleUpTest#testDataNodesPropagationAfterScaleUpTriggeredOnNewCluster}}* - this test covers a scenario where we start a node after the logical topology was updated. I don't know how realistic this scenario is, but the problem is that "data nodes" don't get populated with the logical topology nodes on {{distributionZoneManager}} start, because the {{scheduleTimers}} method, which gets invoked from the Meta Storage Watch, doesn't enter the {{if (!addedNodes.isEmpty() && autoAdjustScaleUp != INFINITE_TIMER_VALUE)}} branch.

*{{DistributionZoneManagerScaleUpTest#testDataNodesPropagationForDefaultZoneAfterScaleUpTriggered}}* - same issue as above.

*{{DistributionZoneManagerScaleUpTest#testDataNodesPropagationForDefaultZoneAfterScaleDownTriggered}}* - same issue as above.
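
To make the guard from the first item easier to reason about, here is a minimal standalone sketch of the quoted condition. It is not the real {{scheduleTimers}} code: the class name, the method name, the node names and the value chosen for {{INFINITE_TIMER_VALUE}} are assumptions made for illustration only. It just shows that the branch is skipped as soon as either the set of added nodes is empty or the scale-up timer is set to the "infinite" sentinel.

```java
import java.util.Set;

/** Hypothetical stand-in for the guard inside scheduleTimers; not the real Ignite code. */
final class ScaleUpGuardSketch {
    /** Assumed sentinel meaning "timer disabled"; the real constant's value may differ. */
    static final int INFINITE_TIMER_VALUE = Integer.MAX_VALUE;

    /** Mirrors the condition quoted in the ticket. */
    static boolean entersScaleUpBranch(Set<String> addedNodes, int autoAdjustScaleUp) {
        return !addedNodes.isEmpty() && autoAdjustScaleUp != INFINITE_TIMER_VALUE;
    }

    public static void main(String[] args) {
        // Nodes were added and the timer is finite: the branch is taken.
        System.out.println(entersScaleUpBranch(Set.of("node-1"), 0)); // true

        // No added nodes are reported: the branch is skipped.
        System.out.println(entersScaleUpBranch(Set.of(), 0)); // false

        // Nodes were added, but the timer is "infinite": the branch is skipped.
        System.out.println(entersScaleUpBranch(Set.of("node-1"), INFINITE_TIMER_VALUE)); // false
    }
}
```

Which of the two operands is false in the failing tests is not stated here, so the sketch covers both cases.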

*{{DistributionZoneManagerScaleUpTest#testUpdateZoneScaleUpTriggersDataNodePropagation}}* - this test fails with the following assertion error: {_}Expected revision that is greater or equal to already seen meta storage events.{_} This is because TestConfigurationStorage does not use the same revision as the Meta Storage, so their revisions can't be compared directly. The test should either be converted to an integration test or use {{DistributedConfigurationStorage}} instead.
(ticket is created https://issues.apache.org/jira/browse/IGNITE-19342)

*{{DistributionZoneManagerScaleUpTest#testUpdateZoneScaleDownTriggersDataNodePropagation}}* - same issue as above. (ticket is created https://issues.apache.org/jira/browse/IGNITE-19342)
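
The revision mismatch behind the two items above can be modelled with two independent counters. The sketch below is an illustration under assumptions, not the actual Ignite classes: {{SeenRevisionTracker}}, the counter names and the number of updates are all hypothetical. It only shows why a revision taken from a storage with its own counter can look "older" than Meta Storage events that were already seen.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of the failing check; none of these types exist in Ignite. */
final class RevisionMismatchSketch {
    /** Remembers the highest revision seen and rejects anything lower. */
    static final class SeenRevisionTracker {
        private long lastSeen;

        void onEvent(long revision) {
            if (revision < lastSeen) {
                throw new AssertionError(
                        "Expected revision that is greater or equal to already seen meta storage events.");
            }
            lastSeen = revision;
        }
    }

    /** Returns true if feeding revisions from two unrelated counters trips the check. */
    static boolean reproducesFailure() {
        AtomicLong metaStorageRevision = new AtomicLong(); // assumed Meta Storage counter
        AtomicLong testConfigRevision = new AtomicLong();  // assumed test config storage counter
        SeenRevisionTracker tracker = new SeenRevisionTracker();

        // Three Meta Storage updates arrive first: revisions 1, 2, 3.
        for (int i = 0; i < 3; i++) {
            tracker.onEvent(metaStorageRevision.incrementAndGet());
        }

        try {
            // The first configuration update carries revision 1 from its own counter.
            tracker.onEvent(testConfigRevision.incrementAndGet());
            return false;
        } catch (AssertionError expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(reproducesFailure()); // true
    }
}
```

If both sources drew their revisions from the same counter, the check in this model would not fail.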

*{{DistributionZoneManagerScaleUpTest#testDropZoneDoNotPropagateDataNodesAfterScaleUp}}* - this test is flaky, because notifications from the test configuration storage and from Meta Storage Watches are not related to each other (unlike the real-life Distributed Configuration Storage, which is built on top of Watches), so notifications from the configuration storage and the Meta Storage can arrive in an undetermined order. (ticket is created https://issues.apache.org/jira/browse/IGNITE-19342)

*{{DistributionZoneManagerScaleUpTest#testDropZoneDoNotPropagateDataNodesAfterScaleDown}}* - same issue as above.
(ticket is created https://issues.apache.org/jira/browse/IGNITE-19342)

*{{DistributionZoneManagerWatchListenerTest#testDataNodesOfDefaultZoneUpdatedOnWatchListenerEvent}}* - this test is flaky, probably due to some races between Watch and Configuration Listener execution (sometimes a retry on {{invoke}} happens and {{Mockito#verify}} fails). (ticket is created https://issues.apache.org/jira/browse/IGNITE-19342)

 

*New tests* from [https://github.com/gridgain/apache-ignite-3/tree/ignite-18756]

*DistributionZoneAwaitDataNodesTest#testRemoveZoneWhileAwaitingDataNodes* - this test must remove the zone after MetastorageTopologyListener updates the topVerTracker and before MetastorageDataNodesListener updates scaleUpRevisionTracker/scaleDownRevisionTracker. This is currently impossible with StandaloneMetaStorageManager. (https://issues.apache.org/jira/browse/IGNITE-19343)
*DistributionZoneAwaitDataNodesTest#testScaleUpScaleDownAreChangedWhileAwaitingDataNodes* - same issue as above, but here we need to update scaleUp and scaleDown instead of removing the zone. (https://issues.apache.org/jira/browse/IGNITE-19343)
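
The ordering these two tests need (first listener, then the test's own action, then the second listener) can be illustrated with a latch-gated sketch. This is only an assumption about what a controllable hook might look like: the class, the thread and the event strings are hypothetical, and nothing here is provided by StandaloneMetaStorageManager.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

/** Hypothetical sketch of forcing a test action between two listener invocations. */
final class ListenerInterleavingSketch {
    static List<String> runScenario() {
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch topologyListenerDone = new CountDownLatch(1);
        CountDownLatch zoneRemoved = new CountDownLatch(1);

        // Stands in for the thread that notifies both listeners.
        Thread watchThread = new Thread(() -> {
            events.add("topology listener: topVerTracker updated");
            topologyListenerDone.countDown();
            try {
                // Gate: the second listener may not run until the test has acted.
                zoneRemoved.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            events.add("data nodes listener: revision trackers updated");
        });
        watchThread.start();

        try {
            // The test waits for the first listener, acts, then releases the second one.
            topologyListenerDone.await();
            events.add("test: zone removed");
            zoneRemoved.countDown();
            watchThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running the scenario", e);
        }
        return events;
    }

    public static void main(String[] args) {
        runScenario().forEach(System.out::println);
    }
}
```

For the second test, the middle step would change the scaleUp and scaleDown values instead of removing the zone; the gating stays the same.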



> Fix broken unit tests in distribution-zones module
> --------------------------------------------------
>
>                 Key: IGNITE-19255
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19255
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Aleksandr Polovtcev
>            Assignee: Mirza Aliev
>            Priority: Blocker
>              Labels: ignite-3
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)