Posted to commits@cassandra.apache.org by "Berenguer Blasi (Jira)" <ji...@apache.org> on 2020/06/11 10:23:00 UTC

[jira] [Comment Edited] (CASSANDRA-15863) Bootstrap resume and TestReplaceAddress fixes

    [ https://issues.apache.org/jira/browse/CASSANDRA-15863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133115#comment-17133115 ] 

Berenguer Blasi edited comment on CASSANDRA-15863 at 6/11/20, 10:22 AM:
------------------------------------------------------------------------

This ticket fixes a number of failures, so here is some guidance for reviewers:

*test_restart_failed_replace, test_restart_failed_replace_with_reset_resume_state & test_resume_failed_replace*

These tests fail while waiting for [this|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/Server.java#L164] log trace. It is never reached because the test deliberately fails bootstrap, so bootstrap ends up marked IN_PROGRESS. The daemon therefore never gets that far: we [exit|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L568] before reaching that point.

The solution is to replace the nodes [without|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR482] waiting for that log trace and to check the bootstrap status in an [alternative|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR485] way.
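
To illustrate the idea of checking the bootstrap status directly rather than waiting on a log line, here is a minimal polling sketch. It is not the code from the linked PR: {{wait_for_state}} and {{read_state}} are hypothetical names, and {{read_state}} stands in for whatever query or tool a real dtest would use to report the node's bootstrap status.

```python
import time


def wait_for_state(read_state, expected, timeout=30.0, interval=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll read_state() until it returns `expected` or the timeout elapses.

    read_state is a hypothetical zero-argument callable (an assumption of
    this sketch); it should return the node's current bootstrap status.
    Returns the list of states observed; raises TimeoutError otherwise.
    """
    deadline = clock() + timeout
    seen = []
    while True:
        state = read_state()
        seen.append(state)
        if state == expected:
            return seen
        if clock() >= deadline:
            raise TimeoutError(
                "state never became %r; last seen %r" % (expected, state))
        sleep(interval)
```

In the failed-replace case described above, the test would presumably poll until the status reads IN_PROGRESS, e.g. {{wait_for_state(read_state, "IN_PROGRESS")}}. The {{clock}} and {{sleep}} parameters exist only so the helper can be exercised without real delays.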

 

*test_resume_failed_replace*

Once the above was fixed, we would still never hit the resume-complete [log|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR507]. This is because {{StorageService#resumeBootstrap}} [here|https://github.com/apache/cassandra/pull/622/files#diff-b76a607445d53f18a98c9df14323c7ddR1625] would throw an exception when starting the daemon. That exception was being swallowed; it is now logged. I also had to add a native transport [init|https://github.com/apache/cassandra/pull/622/files#diff-b76a607445d53f18a98c9df14323c7ddR1623] to avoid that exception and let the daemon start correctly. I am worried about side effects of this extra native transport init, so somebody with broader knowledge of the codebase should chime in.
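
The swallowed-versus-logged distinction can be sketched generically. The snippet below is Python for consistency with the dtests, not the Java change in the linked PR; {{run_logged}}, the logger name and the message text are all hypothetical.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("resume_sketch")


def run_logged(step):
    """Run step(); on failure, log the exception with its traceback.

    A bare `except: pass` here would swallow the failure silently, which
    is the behaviour described above as hiding the real cause.
    Returns True on success, False if step() raised.
    """
    try:
        step()
        return True
    except Exception:
        # logger.exception logs at ERROR level and attaches the traceback.
        logger.exception("resume step failed")
        return False
```

With the failure logged, a test that waits for a completion message at least leaves a trace of why that message never appeared.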

 

*test_replace_nonexistent_node, test_replace_first_boot, test_replace_shutdown_node & test_replace_stopped_node*

These turned out to be failures caused by log messages having changed across versions.
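
One way a test can cope with log wording that differs between releases is to pick the expected pattern by version. This is only a sketch under assumptions: the helper names are hypothetical, and the version tuples and message strings in the usage note are placeholders, not actual Cassandra log messages.

```python
import re


def pattern_for_version(version, patterns):
    """Return the regex registered for the newest minimum version <= version.

    `patterns` maps a minimum version tuple, e.g. (4, 0), to a regex
    string. Both are supplied by the caller.
    """
    eligible = [v for v in patterns if v <= version]
    if not eligible:
        raise LookupError("no log pattern registered for %r" % (version,))
    return patterns[max(eligible)]


def log_contains(lines, pattern):
    """Return True if any log line matches the regex."""
    rx = re.compile(pattern)
    return any(rx.search(line) for line in lines)
```

For example, with {{patterns = {(3, 0): r"old wording", (4, 0): r"new wording"}}}, a test running against a 3.11 node would look for the first pattern and a 4.0 node for the second.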



> Bootstrap resume and TestReplaceAddress fixes
> --------------------------------------------
>
>                 Key: CASSANDRA-15863
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15863
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission, Test/dtest
>            Reporter: Berenguer Blasi
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0-alpha
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This has been [broken|https://ci-cassandra.apache.org/job/Cassandra-trunk/159/testReport/dtest-large.replace_address_test/TestReplaceAddress/test_restart_failed_replace/history/] for ages



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
