You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Adam Holmberg (Jira)" <ji...@apache.org> on 2020/08/25 14:47:00 UTC

[jira] [Comment Edited] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

    [ https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184097#comment-17184097 ] 

Adam Holmberg edited comment on CASSANDRA-15958 at 8/25/20, 2:46 PM:
---------------------------------------------------------------------

Thanks, Yifan, for taking a look. I'll revisit that before submitting for review.

In the mean time, I looked some at the other flakiness.

{noformat}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
	at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$36(ConnectionTest.java:700)
	at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$40(ConnectionTest.java:730)
	at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:263)
	at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
	at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Caused by: java.util.concurrent.TimeoutException
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
	at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$36(ConnectionTest.java:693)
{noformat}

Timeout waiting for message to be purged. It will never purge.

Identified another race condition where the next expiration can be set to Long.MAX_VALUE, and pruning will never happen.

[Periodic pruning|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L203-L209] runs in the background. It might run against an empty queue, producing a [pruner|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L288] with {{earliestExpiresAt = Long.MAX_VALUE}}. In the mean time a message is added and the current deadline matches that message. If our nowNanos was established more than the clock resolution after the message expire time was set, the earliestExpiresAt time is [set|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L228-L229] to {{Long.MAX_VALUE}}.

I have another tweak for this issue. Still looking at some other related unit tests.


was (Author: aholmber):
Thanks, Yifan, for taking a look. I'll revisit that before submitting for review.

In the mean time, I looked some at the other flakiness.

{noformat}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
	at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$36(ConnectionTest.java:700)
	at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$40(ConnectionTest.java:730)
	at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:263)
	at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
	at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Caused by: java.util.concurrent.TimeoutException
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
	at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$36(ConnectionTest.java:693)
{noformat}

Timeout waiting for message to be purged. It will never purge.

Identified another race condition where the next expiration can be set to Long.MAX_VALUE, and pruning will never happen.

[Periodic pruning|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L203-L209] runs in the background. It might run against an empty queue, producing a [pruner|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L288] with {{earliestExpiresAt = Long.MAX_VALUE}}. In the mean time a message is added and the current deadline matches that message. If our nowNanos is was established more than the clock resolution after the message expire time was set, the earliestExpiresAt time is [set|https://github.com/aholmberg/cassandra/blob/eb1c9bf59c7faba3e65be7fc45359611c7242672/src/java/org/apache/cassandra/net/OutboundMessageQueue.java#L228-L229] to {{Long.MAX_VALUE}}.

I have another tweak for this issue. Still looking at some other related unit tests.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-15958
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: David Capwell
>            Assignee: Adam Holmberg
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> Build: https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
> 	at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
> 	at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
> 	at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
> 	at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
> 	at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org