Posted to commits@cassandra.apache.org by "Blake Eggleston (Jira)" <ji...@apache.org> on 2020/10/09 21:12:00 UTC

[jira] [Commented] (CASSANDRA-15158) Wait for schema agreement rather than in flight schema requests when bootstrapping

    [ https://issues.apache.org/jira/browse/CASSANDRA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211409#comment-17211409 ] 

Blake Eggleston commented on CASSANDRA-15158:
---------------------------------------------

Ok, I've fixed all the dtest issues and ported the 3.11 fix to 3.0 and trunk.

The 3.11 branch with misc fixes is here: https://github.com/bdeggleston/cassandra/tree/15158-3.11

Most of the fixes are self-explanatory; the less obvious ones are:

* wait for gossip to settle before waiting on schemas. The original patch accidentally removed the wait for the schema version to be non-empty, so this is both a fix and a change. Waiting for gossip makes it more likely that we've seen all current schema versions before we begin waiting, instead of just waiting for the first schema to be received, which was the effect of waiting on Schema.instance.isEmpty.
* don't check liveness in shouldPullFromEndpoint. There were cases where the node wasn't considered alive until after `reportEndpointVersion` had been called (because it happens as part of the current gossip update).
* move migration start inside StorageService.joinRing because in-jvm dtests don't use CassandraDaemon
* don't fail maybePullSchema if version info has no endpoints. We automatically call that after completing a pull, so a version will eventually have no endpoints left on it
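
The second and fourth bullets can be illustrated with a small, self-contained sketch. It is not the code in the branch: the class `SchemaVersionTracker`, the `onPullCompleted` method, the method signatures, and the use of plain strings for endpoints are all assumptions made for illustration. It only models the bookkeeping described above, where endpoints are recorded per schema version without a liveness check, and a version with no endpoints left after a completed pull is a no-op rather than a failure.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;

// Hypothetical model for illustration only; not Cassandra's actual classes.
public class SchemaVersionTracker {
    // Schema version -> endpoints currently advertising that version.
    private final Map<UUID, Set<String>> endpointsByVersion = new HashMap<>();

    // Record the version an endpoint advertises. Deliberately no liveness
    // check: the endpoint may not be marked alive yet when this is called.
    public void reportEndpointVersion(String endpoint, UUID version) {
        for (Set<String> endpoints : endpointsByVersion.values())
            endpoints.remove(endpoint);
        endpointsByVersion.computeIfAbsent(version, v -> new LinkedHashSet<>()).add(endpoint);
    }

    // Pick an endpoint to pull the given version from. An unknown version or
    // one with no endpoints left yields an empty result instead of an error.
    public Optional<String> maybePullSchema(UUID version) {
        Set<String> endpoints = endpointsByVersion.get(version);
        if (endpoints == null || endpoints.isEmpty())
            return Optional.empty();
        return Optional.of(endpoints.iterator().next());
    }

    // Called after a pull finishes: drop the endpoint that was used, then
    // re-run maybePullSchema, which may find nothing left to do.
    public Optional<String> onPullCompleted(String endpoint, UUID version) {
        Set<String> endpoints = endpointsByVersion.get(version);
        if (endpoints != null)
            endpoints.remove(endpoint);
        return maybePullSchema(version);
    }
}
```

In this model, the automatic re-check after the last pull simply returns an empty result, which is the situation the fourth bullet describes.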

The squashed ports are here:
| [3.0|https://github.com/bdeggleston/cassandra/tree/15158-3.0] | [circle|https://app.circleci.com/pipelines/github/bdeggleston/cassandra?branch=15158-3.0] |
| [3.11|https://github.com/bdeggleston/cassandra/tree/15158-3.11-squashed] | [circle|https://app.circleci.com/pipelines/github/bdeggleston/cassandra?branch=15158-3.11-squashed] |
| [trunk|https://github.com/bdeggleston/cassandra/tree/15158-trunk] | [circle|https://app.circleci.com/pipelines/github/bdeggleston/cassandra?branch=15158-trunk] |


> Wait for schema agreement rather than in flight schema requests when bootstrapping
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15158
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15158
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Cluster/Schema
>            Reporter: Vincent White
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0-beta
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when a node is bootstrapping, we use a set of latches (org.apache.cassandra.service.MigrationTask#inflightTasks) to keep track of in-flight schema pull requests, and we don't proceed with bootstrapping/streaming until all the latches are released (or we time out waiting for each one). One issue with this is that if we have a large schema, or the retrieval of the schema from the other nodes is unexpectedly slow, we have no explicit check in place to ensure we have actually received a schema before we proceed.
> While it's possible to increase "migration_task_wait_in_seconds" to force the node to wait on each latch longer, there are cases where this doesn't help because the callbacks for the schema pull requests have expired off the messaging service's callback map (org.apache.cassandra.net.MessagingService#callbacks) after request_timeout_in_ms (default 10 seconds) before the other nodes were able to respond to the new node.
> This patch checks for schema agreement between the bootstrapping node and the rest of the live nodes before proceeding with bootstrapping. It also adds a check to prevent the new node from flooding existing nodes with simultaneous schema pull requests as can happen in large clusters.
> Removing the latch system should also prevent new nodes in large clusters getting stuck for extended amounts of time as they wait `migration_task_wait_in_seconds` on each of the latches left orphaned by the timed out callbacks.
>  
> ||3.11||
> |[PoC|https://github.com/apache/cassandra/compare/cassandra-3.11...vincewhite:check_for_schema]|
> |[dtest|https://github.com/apache/cassandra-dtest/compare/master...vincewhite:wait_for_schema_agreement]|
>  
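
The agreement check proposed in the description can be sketched as follows. This is a minimal illustration under assumptions, not the patch itself: `SchemaAgreement`, `inAgreement`, and `waitForAgreement` are hypothetical names, and the map from live endpoints to schema versions stands in for whatever gossip state the real implementation consults.

```java
import java.util.Map;
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical sketch for illustration only; not Cassandra's actual API.
public class SchemaAgreement {
    // True when the local node has a schema version and every live endpoint
    // reports that same version. With no live peers this is trivially true.
    public static boolean inAgreement(UUID localVersion, Map<String, UUID> liveEndpointVersions) {
        if (localVersion == null)
            return false;
        for (UUID remote : liveEndpointVersions.values())
            if (!localVersion.equals(remote))
                return false;
        return true;
    }

    // Poll until agreement is reached or the timeout elapses.
    // Returns true on agreement, false on timeout or interruption.
    public static boolean waitForAgreement(Supplier<UUID> localVersion,
                                           Supplier<Map<String, UUID>> liveEndpointVersions,
                                           long timeoutMillis,
                                           long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (inAgreement(localVersion.get(), liveEndpointVersions.get()))
                return true;
            if (System.nanoTime() - deadline >= 0)
                return false;
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

Unlike per-request latches, a check of this shape does not depend on any individual pull request's callback surviving, so an expired callback cannot leave the node waiting on something that will never complete.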


