Posted to commits@cassandra.apache.org by "Caleb Rackliffe (Jira)" <ji...@apache.org> on 2021/11/10 02:25:00 UTC

[jira] [Comment Edited] (CASSANDRA-17039) Flaky RepairJobTest.testNoTreesRetainedAfterDifference

    [ https://issues.apache.org/jira/browse/CASSANDRA-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441464#comment-17441464 ] 

Caleb Rackliffe edited comment on CASSANDRA-17039 at 11/10/21, 2:24 AM:
------------------------------------------------------------------------

The memory meter reports very different retained sizes for {{MeasureableRepairSession}} on J8 and J11, even though both runs use exactly the same C* codebase:

J8
{noformat}
INFO  [main] 2021-11-09 20:13:02,767 RepairJob.java:268 - Created 2 sync tasks based on 3 merkle tree responses for 6e524549-dbfd-4143-83b6-8a9920594f9e (took: 13ms)
INFO  [RepairJobTask:1] 2021-11-09 20:13:02,876 SyncTask.java:89 - [repair #bba3e650-41cb-11ec-86cb-e76f756cd20a] Endpoints /127.0.0.1:7012 and /127.0.0.2:7012 have 1 range(s) out of sync for Standard1
INFO  [RepairJobTask:2] 2021-11-09 20:13:02,876 SyncTask.java:89 - [repair #bba3e650-41cb-11ec-86cb-e76f756cd20a] Endpoints /127.0.0.1:7012 and /127.0.0.3:7012 have 1 range(s) out of sync for Standard1
INFO  [RepairJobTask:2] 2021-11-09 20:13:02,878 SymmetricRemoteSyncTask.java:68 - [repair #bba3e650-41cb-11ec-86cb-e76f756cd20a] Forwarding streaming repair of 1 ranges to /127.0.0.1:7012 (to be streamed with /127.0.0.3:7012)
INFO  [RepairJobTask:1] 2021-11-09 20:13:02,878 SymmetricRemoteSyncTask.java:68 - [repair #bba3e650-41cb-11ec-86cb-e76f756cd20a] Forwarding streaming repair of 1 ranges to /127.0.0.1:7012 (to be streamed with /127.0.0.2:7012)
ERROR [main] 2021-11-09 20:13:03,123 SubstituteLogger.java:265 - Size with trees: 9574096
DEBUG [RepairJobTask:2] 2021-11-09 20:13:03,125 RepairSession.java:241 - [repair #bba3e650-41cb-11ec-86cb-e76f756cd20a] Repair completed between /127.0.0.1:7012 and /127.0.0.3:7012 on Standard1
DEBUG [RepairJobTask:1] 2021-11-09 20:13:03,125 RepairSession.java:241 - [repair #bba3e650-41cb-11ec-86cb-e76f756cd20a] Repair completed between /127.0.0.1:7012 and /127.0.0.2:7012 on Standard1
ERROR [main] 2021-11-09 20:13:03,275 SubstituteLogger.java:265 - Size without trees: 1863000
{noformat}

J11
{noformat}
INFO  [main] 2021-11-09 20:13:26,155 RepairJob.java:268 - Created 2 sync tasks based on 3 merkle tree responses for f9e90e8c-6ae6-4ca9-ba24-f40750d8b0f8 (took: 11ms)
INFO  [RepairJobTask:1] 2021-11-09 20:13:26,204 SyncTask.java:89 - [repair #c9a25bb0-41cb-11ec-bba2-f9bd30f7271a] Endpoints /127.0.0.1:7012 and /127.0.0.2:7012 have 1 range(s) out of sync for Standard1
INFO  [RepairJobTask:2] 2021-11-09 20:13:26,204 SyncTask.java:89 - [repair #c9a25bb0-41cb-11ec-bba2-f9bd30f7271a] Endpoints /127.0.0.1:7012 and /127.0.0.3:7012 have 1 range(s) out of sync for Standard1
INFO  [RepairJobTask:2] 2021-11-09 20:13:26,205 SymmetricRemoteSyncTask.java:68 - [repair #c9a25bb0-41cb-11ec-bba2-f9bd30f7271a] Forwarding streaming repair of 1 ranges to /127.0.0.1:7012 (to be streamed with /127.0.0.3:7012)
INFO  [RepairJobTask:1] 2021-11-09 20:13:26,205 SymmetricRemoteSyncTask.java:68 - [repair #c9a25bb0-41cb-11ec-bba2-f9bd30f7271a] Forwarding streaming repair of 1 ranges to /127.0.0.1:7012 (to be streamed with /127.0.0.2:7012)
ERROR [main] 2021-11-09 20:13:26,641 SubstituteLogger.java:265 - Size with trees: 16202960
DEBUG [RepairJobTask:1] 2021-11-09 20:13:26,643 RepairSession.java:241 - [repair #c9a25bb0-41cb-11ec-bba2-f9bd30f7271a] Repair completed between /127.0.0.1:7012 and /127.0.0.2:7012 on Standard1
DEBUG [RepairJobTask:2] 2021-11-09 20:13:26,643 RepairSession.java:241 - [repair #c9a25bb0-41cb-11ec-bba2-f9bd30f7271a] Repair completed between /127.0.0.1:7012 and /127.0.0.3:7012 on Standard1
ERROR [main] 2021-11-09 20:13:48,363 SubstituteLogger.java:265 - Size without trees: 8447392
{noformat}

It's almost as if only one of the two threads released the relevant memory...
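
To make the comparison concrete, here is a small sketch that does the arithmetic on the four "Size with trees" / "Size without trees" values logged above. The names ({{SIZES}}, {{released}}, {{summarize}}) are hypothetical and exist only for this illustration; nothing here comes from the test itself, and the numbers alone do not establish the cause.

```python
# Sizes in bytes, copied from the "Size with trees" / "Size without trees"
# log lines above. The dict layout and function names are illustrative only.
SIZES = {
    "J8": {"with_trees": 9574096, "without_trees": 1863000},
    "J11": {"with_trees": 16202960, "without_trees": 8447392},
}


def released(run):
    """Bytes no longer retained between the two measurements."""
    return run["with_trees"] - run["without_trees"]


def summarize(sizes):
    """Return {jvm: (released_bytes, fraction_of_initial_size_still_retained)}."""
    return {
        jvm: (released(run), run["without_trees"] / run["with_trees"])
        for jvm, run in sizes.items()
    }


if __name__ == "__main__":
    for jvm, (freed, retained) in summarize(SIZES).items():
        print(f"{jvm}: released {freed} bytes, {retained:.0%} still retained")
```

Both runs release nearly the same amount (7711096 bytes on J8, 7755568 bytes on J11), but J11 starts from a much larger measurement and still retains roughly 52% of it afterwards, versus roughly 19% on J8. That pattern is at least consistent with only part of the tree memory being released on J11.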


> Flaky RepairJobTest.testNoTreesRetainedAfterDifference
> ------------------------------------------------------
>
>                 Key: CASSANDRA-17039
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17039
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Brandon Williams
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>
> Sometimes fails an assertion:
> {noformat}
> Expecting:
>  <10000L>
> to be less than:
>  <10000L> 
> {noformat}
> https://app.circleci.com/pipelines/github/driftx/cassandra/269/workflows/f2b0a738-0785-4011-9ac1-071837dc9170/jobs/2049/tests#failed-test-1



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
