Posted to commits@cassandra.apache.org by "Blake Eggleston (Jira)" <ji...@apache.org> on 2020/08/04 21:30:00 UTC

[jira] [Commented] (CASSANDRA-15158) Wait for schema agreement rather than in flight schema requests when bootstrapping

    [ https://issues.apache.org/jira/browse/CASSANDRA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171139#comment-17171139 ] 

Blake Eggleston commented on CASSANDRA-15158:
---------------------------------------------

{quote}
I am not completely sure why are we pulling again here. I would rewrite the whole solution in a such way that this Callable just does one thing on a successful response (merging of a schema) and the actual "retry" would be handled from outside. The reader has to make quite a mental exercise to visualise that this callback might actually call another callback in it until some "version" is completed etc ... At least for me, it was quite tedious to track.


{quote}
In the case of a successful pull, we won't pull again. The response and fail callbacks both call pullComplete, but an additional pull is only issued when pullComplete is called from fail.

I get that this can be a bit difficult to follow, but I'm not sure there's a better approach, given the schema pulls are completely event driven during normal runtime. If we miss a schema change during normal runtime (not bootstrap), there's nothing waiting on schema convergence that would enable us to retry from the outside.
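
To make the control flow above easier to follow, here is a minimal sketch of it in Java. It is not the code from the patch: the class name, the synchronous `transport` stand-in, and every method name other than pullComplete are hypothetical and chosen only for illustration. It shows that both callbacks funnel into pullComplete, and that only the failure path triggers another pull.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

/** Hypothetical sketch; these are illustrative names, not Cassandra's actual classes. */
final class SchemaPullSketch {
    /** Endpoints that reported the schema version we still need. */
    private final Deque<String> candidates = new ArrayDeque<>();
    /** Stand-in for the messaging layer: returns true if the endpoint answers the pull. */
    private final Predicate<String> transport;
    private boolean schemaMerged = false;
    private int pullsSent = 0;

    SchemaPullSketch(Iterable<String> endpoints, Predicate<String> transport) {
        for (String endpoint : endpoints) {
            candidates.add(endpoint);
        }
        this.transport = transport;
    }

    /** Sends a pull to the next candidate endpoint, if any remain. */
    void pull() {
        String endpoint = candidates.poll();
        if (endpoint == null) {
            return; // nobody left to ask; a periodic task would have to pick this up later
        }
        pullsSent++;
        if (transport.test(endpoint)) {
            onResponse(endpoint);
        } else {
            onFailure(endpoint);
        }
    }

    /** Success path: merge the schema, then finish. No further pull is made. */
    private void onResponse(String endpoint) {
        schemaMerged = true;
        pullComplete(endpoint, true);
    }

    /** Failure path: finish, which in turn triggers a pull from another endpoint. */
    private void onFailure(String endpoint) {
        pullComplete(endpoint, false);
    }

    /** Shared completion hook; only a failed pull leads to another pull. */
    private void pullComplete(String endpoint, boolean wasSuccessful) {
        if (!wasSuccessful) {
            pull();
        }
    }

    int pullsSent() {
        return pullsSent;
    }

    boolean schemaMerged() {
        return schemaMerged;
    }
}
```

With candidates a, b and c and a transport that only answers for b, calling pull() once sends two requests and then stops, because the successful response from b ends the chain.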

There is a periodic task that pulls schema for outstanding versions that don't have any in-flight requests^[1]^, but it only runs once a minute, and we need to be more proactive about learning of schema updates, since we'll be unable to serve some reads and writes until we're up to date.
{quote}TBH that is quite counterintuitive too
{quote}
Could you expand on what's counterintuitive about it? If the endpoint's schema version has changed, we need to disassociate it from its previously reported version. I have added a comment saying as much.
{quote}The test has failed for me (repeatedly):
{quote}
Thanks, it should be passing now.

[1] This handles the case where all nodes reporting a given version are on a different version, so we can't pull schema from them, and acts as a hedge against any bugs in this implementation that might cause us to not schedule schema pulls as intended.

> Wait for schema agreement rather than in flight schema requests when bootstrapping
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15158
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15158
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Cluster/Schema
>            Reporter: Vincent White
>            Assignee: Blake Eggleston
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently when a node is bootstrapping we use a set of latches (org.apache.cassandra.service.MigrationTask#inflightTasks) to keep track of in-flight schema pull requests, and we don't proceed with bootstrapping/stream until all the latches are released (or we time out waiting for each one). One issue with this is that if we have a large schema, or the retrieval of the schema from the other nodes is unexpectedly slow, then we have no explicit check in place to ensure we have actually received a schema before we proceed.
> While it's possible to increase "migration_task_wait_in_seconds" to force the node to wait on each latch longer, there are cases where this doesn't help because the callbacks for the schema pull requests have expired off the messaging service's callback map (org.apache.cassandra.net.MessagingService#callbacks) after request_timeout_in_ms (default 10 seconds) before the other nodes were able to respond to the new node.
> This patch checks for schema agreement between the bootstrapping node and the rest of the live nodes before proceeding with bootstrapping. It also adds a check to prevent the new node from flooding existing nodes with simultaneous schema pull requests as can happen in large clusters.
> Removing the latch system should also prevent new nodes in large clusters getting stuck for extended amounts of time as they wait `migration_task_wait_in_seconds` on each of the latches left orphaned by the timed out callbacks.
>  
> ||3.11||
> |[PoC|https://github.com/apache/cassandra/compare/cassandra-3.11...vincewhite:check_for_schema]|
> |[dtest|https://github.com/apache/cassandra-dtest/compare/master...vincewhite:wait_for_schema_agreement]|
>  
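
The schema agreement check described in the issue summary above could look roughly like the following Java sketch. It is not the code from the PoC branch: the class, the method names, the supplier-based inputs, and the timeout and poll parameters are all assumptions made for illustration. The point is that the wait ends when the local schema version matches every live node's version, or when an overall deadline passes, rather than when individual request latches are released.

```java
import java.util.Collection;
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch; these are illustrative names, not Cassandra's actual classes. */
final class SchemaAgreementSketch {
    private SchemaAgreementSketch() {
    }

    /** True when every live node reports the same schema version as the local node. */
    static boolean inAgreement(UUID localVersion, Collection<UUID> liveVersions) {
        for (UUID version : liveVersions) {
            if (!version.equals(localVersion)) {
                return false;
            }
        }
        return true; // assumption: having no live peers counts as agreement in this sketch
    }

    /**
     * Polls until the local version matches all live versions or the timeout elapses.
     * Returns true on agreement, false on timeout or interruption.
     */
    static boolean waitForAgreement(Supplier<UUID> localVersion,
                                    Supplier<Collection<UUID>> liveVersions,
                                    long timeoutMillis,
                                    long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (inAgreement(localVersion.get(), liveVersions.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // a caller might abort the bootstrap at this point
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A caller would run waitForAgreement before starting to stream and treat a false result as "schema not yet received", which is the explicit check the latch-based approach lacks.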


