Posted to commits@cassandra.apache.org by "Andres de la Peña (Jira)" <ji...@apache.org> on 2022/06/13 14:28:00 UTC

[jira] [Comment Edited] (CASSANDRA-17307) Test Failure: org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityV30Test.testAvailability

    [ https://issues.apache.org/jira/browse/CASSANDRA-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553609#comment-17553609 ] 

Andres de la Peña edited comment on CASSANDRA-17307 at 6/13/22 2:27 PM:
------------------------------------------------------------------------

[Repeated runs|https://app.circleci.com/pipelines/github/adelapena/cassandra/1680/workflows/f8c846a3-7768-4326-b7cb-e25da4dab2d7/jobs/17618] of this test hit both the timeout mentioned in the description and the failure reported in CASSANDRA-17641, CASSANDRA-17642, CASSANDRA-17651 and CASSANDRA-17652.

The test combines queries expected to succeed with writes expected to time out. On one hand, the queries that should succeed require a timeout config large enough to avoid accidentally timing out queries due to a slow env. On the other hand, the queries that should time out benefit from a not-so-long timeout config so the entire test doesn't hit a JUnit timeout.

I guess that what we can do here is increase the query timeout config, so the queries expected to succeed don't time out on the coordinator, and also split the test into multiple classes so there are fewer queries per test. That way the queries that should time out on the coordinator wouldn't produce a JUnit timeout.
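
As a rough illustration of that trade-off, here is a back-of-the-envelope sketch of the worst-case time spent waiting on queries that are expected to time out, and of how many test classes would keep each class under the JUnit timeout. The class name {{TimeoutBudget}}, its methods and every number in it are hypothetical and not taken from the test; the sketch also ignores the time spent on successful queries and on cluster startup and upgrade.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: estimates how long the queries expected to time out can take. */
public final class TimeoutBudget
{
    private TimeoutBudget() {}

    /** Worst case: every query expected to time out waits for the full query timeout. */
    public static long worstCaseMillis(int timingOutQueries, long queryTimeoutMillis)
    {
        return Math.multiplyExact((long) timingOutQueries, queryTimeoutMillis);
    }

    /** Smallest number of test classes such that each class fits in the JUnit timeout. */
    public static int classesNeeded(int timingOutQueries, long queryTimeoutMillis, long junitTimeoutMillis)
    {
        if (queryTimeoutMillis <= 0)
            throw new IllegalArgumentException("query timeout must be positive");
        long queriesPerClass = junitTimeoutMillis / queryTimeoutMillis;
        if (queriesPerClass == 0)
            throw new IllegalArgumentException("query timeout exceeds the JUnit timeout");
        // Ceiling division: spread the timing-out queries over as few classes as possible.
        return (int) ((timingOutQueries + queriesPerClass - 1) / queriesPerClass);
    }

    public static void main(String[] args)
    {
        // Assumed values, chosen only to illustrate the arithmetic.
        int timingOutQueries = 60;
        long junitTimeout = TimeUnit.MINUTES.toMillis(5);
        long shortQueryTimeout = TimeUnit.SECONDS.toMillis(2);
        long longQueryTimeout = TimeUnit.SECONDS.toMillis(10);

        // 60 queries * 2s = 120s, which fits in a 5 minute budget.
        System.out.println(worstCaseMillis(timingOutQueries, shortQueryTimeout) <= junitTimeout); // prints true
        // 60 queries * 10s = 600s, which does not fit.
        System.out.println(worstCaseMillis(timingOutQueries, longQueryTimeout) <= junitTimeout);  // prints false
        // With the longer query timeout, 30 queries fit per class, so two classes are needed.
        System.out.println(classesNeeded(timingOutQueries, longQueryTimeout, junitTimeout));      // prints 2
    }
}
```

With these assumed numbers, raising the query timeout from 2s to 10s only stays within the JUnit timeout if the queries are spread over at least two test classes.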

The problem is that the {{ShutdownException}} reported on the other tickets is far more common than the timeout, so it's difficult to solve the timeout without first fixing the {{ShutdownException}} bug.


was (Author: adelapena):
[Repeated runs|https://app.circleci.com/pipelines/github/adelapena/cassandra/1680/workflows/f8c846a3-7768-4326-b7cb-e25da4dab2d7/jobs/17618] of this test hit both the timeout mentioned on the description and the failure reported on CASSANDRA-17641, CASSANDRA-17642, CASSANDRA-17651 and CASSANDRA-17652.

The test combines queries expected to succeed with writes expected to time out. On one hand, the queries that should succeed require a timeout config large enough to avoid accidentally timing out queries due to a slow env. On the other hand, the queries that should time out benefit from a not-so-long timeout config so the entire test doesn't hit a JUnit timeout.

I guess that what we can do here is increase the query timeout config, so the queries expected to succeed don't time out on the coordinator, and also split the test into multiple classes so there are fewer queries per test. That way the queries that should time out on the coordinator wouldn't produce a JUnit timeout.

The problem is that the {{ShutdownException}} reported on the other tickets is far more common than the timeout, so it's difficult to solve the timeout problem without first fixing the {{ShutdownException}} problem.

> Test Failure: org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityV30Test.testAvailability
> ----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17307
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17307
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/java
>            Reporter: Josh McKenzie
>            Assignee: Andres de la Peña
>            Priority: Normal
>             Fix For: 4.1-beta, 4.x
>
>
> No known failures. Flakiness 0%, Stability 100%
> Error Message
> Unexpected error while reading in case write-read consistency QUORUM-QUORUM with not upgraded coordinator and 1 nodes down
> {code}
> Stacktrace
> junit.framework.AssertionFailedError: Unexpected error while reading in case write-read consistency QUORUM-QUORUM with not upgraded coordinator and 1 nodes down
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityTestBase$Tester.test(MixedModeAvailabilityTestBase.java:142)
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityTestBase.lambda$testAvailability$2(MixedModeAvailabilityTestBase.java:91)
> 	at org.apache.cassandra.distributed.upgrade.UpgradeTestBase$TestCase.run(UpgradeTestBase.java:231)
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityTestBase.testAvailability(MixedModeAvailabilityTestBase.java:93)
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityTestBase.testAvailability(MixedModeAvailabilityTestBase.java:62)
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityTestBase.testAvailability(MixedModeAvailabilityTestBase.java:56)
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityV30Test.testAvailability(MixedModeAvailabilityV30Test.java:33)
> Caused by: java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 1 responses.
> 	at org.apache.cassandra.distributed.impl.IsolatedExecutor.waitOn(IsolatedExecutor.java:218)
> 	at org.apache.cassandra.distributed.impl.IsolatedExecutor.lambda$sync$5(IsolatedExecutor.java:109)
> 	at org.apache.cassandra.distributed.impl.Coordinator.executeWithResult(Coordinator.java:69)
> 	at org.apache.cassandra.distributed.api.ICoordinator.execute(ICoordinator.java:32)
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityTestBase$Tester.lambda$test$1(MixedModeAvailabilityTestBase.java:135)
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityTestBase$Tester.maybeFail(MixedModeAvailabilityTestBase.java:155)
> 	at org.apache.cassandra.distributed.upgrade.MixedModeAvailabilityTestBase$Tester.test(MixedModeAvailabilityTestBase.java:134)
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 1 responses.
> 	at org.apache.cassandra.service.ReadCallback.awaitResults(ReadCallback.java:136)
> 	at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:142)
> 	at org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145)
> 	at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1833)
> 	at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1782)
> 	at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1720)
> 	at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1629)
> 	at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:1166)
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:302)
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:263)
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:115)
> 	at org.apache.cassandra.distributed.impl.Coordinator.executeInternal(Coordinator.java:107)
> 	at org.apache.cassandra.distributed.impl.Coordinator.lambda$executeWithResult$0(Coordinator.java:69)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:83)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}


