Posted to commits@cassandra.apache.org by "Caleb Rackliffe (Jira)" <ji...@apache.org> on 2021/09/24 05:52:00 UTC

[jira] [Commented] (CASSANDRA-16856) Prevent broken concurrent schema pulls

    [ https://issues.apache.org/jira/browse/CASSANDRA-16856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419581#comment-17419581 ] 

Caleb Rackliffe commented on CASSANDRA-16856:
---------------------------------------------

I've been looking at the 4.0 & trunk versions of this patch, and I'm having a hard time putting things together in my head. Reading the description above, it seems like the approach was going to be a.) synchronize {{SchemaKeyspace.convertSchemaToMutations()}}, effectively serializing requests handled by {{SchemaPullVerbHandler}} and b.) synchronize {{SchemaKeyspace.applyChanges()}} (I'm guessing?), which is where mutations to the schema keyspace are actually applied. In other words, the idea was to not allow concurrent reads and writes on the state protected by {{SchemaKeyspace}}. (Would we also need to synchronize {{truncate()}}?)

It seems like only "a" was done here, and not "b".
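
To make the concern concrete, here is a minimal sketch of what guarding both "a" and "b" with the same lock could look like. It is not the patch and not Cassandra's actual code: {{SchemaStoreSketch}}, {{Snapshot}}, {{snapshot()}} and the in-memory collections are hypothetical stand-ins for {{SchemaKeyspace}}, the result of {{convertSchemaToMutations()}} and the {{system_schema}} tables. Only the shape matters: the read path and the write path share one monitor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the schema state; NOT Cassandra's SchemaKeyspace.
final class SchemaStoreSketch {

    /** Immutable copy of both "tables" as handed to a pulling node. */
    static final class Snapshot {
        final List<String> tables;
        final Map<String, List<String>> columns;

        Snapshot(List<String> tables, Map<String, List<String>> columns) {
            this.tables = tables;
            this.columns = columns;
        }

        /** True if every table in the snapshot also has its column list. */
        boolean isConsistent() {
            return columns.keySet().containsAll(tables);
        }
    }

    // Stand-ins for system_schema.tables and system_schema.columns.
    private final List<String> tables = new ArrayList<>();
    private final Map<String, List<String>> columns = new HashMap<>();

    // "b": the write path. It performs two separate updates, so a reader that
    // ran between them would see the table without its columns.
    synchronized void applyChanges(String table, List<String> tableColumns) {
        tables.add(table);
        columns.put(table, List.copyOf(tableColumns));
    }

    // "a": the read path used to answer a pull. It locks the same monitor as
    // applyChanges(), so it can never observe a half-applied change.
    synchronized Snapshot snapshot() {
        return new Snapshot(List.copyOf(tables), Map.copyOf(columns));
    }

    public static void main(String[] args) throws InterruptedException {
        SchemaStoreSketch store = new SchemaStoreSketch();

        // One thread keeps applying schema changes...
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1_000; i++) {
                store.applyChanges("ks.t" + i, List.of("id", "value"));
            }
        });
        writer.start();

        // ...while this thread keeps "pulling" and checking each snapshot.
        boolean allConsistent = true;
        while (writer.isAlive()) {
            allConsistent &= store.snapshot().isConsistent();
        }
        writer.join();
        allConsistent &= store.snapshot().isConsistent();

        System.out.println("all snapshots consistent: " + allConsistent);
    }
}
```

If only {{snapshot()}} were synchronized in this sketch, pulls would be serialized against each other but could still interleave with the two updates in {{applyChanges()}}, which is the gap described above.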

CC [~bereng] [~brandon.williams]

> Prevent broken concurrent schema pulls
> --------------------------------------
>
>                 Key: CASSANDRA-16856
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16856
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Berenguer Blasi
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 4.1, 3.11.x, 4.0.x
>
>
> There's a race condition around pulling schema changes that can occur when the schema change push/propagation mechanism is not immediately effective (e.g. because of network delay, or because the pulling node is down).
> If schema changes happen on node 1, do not reach node 2 immediately through the SCHEMA.PUSH mechanism, and are first noticed during gossip, the corresponding SCHEMA.PULL request from node 2 can catch node 1's schema in the middle of being modified by another schema change request. This can easily lead to problems (e.g. if a new table is being added, and the node 2 request reads the changes that need to be applied to system_schema.tables, but not the ones that need to be applied to system_schema.columns).
> This PR addresses that by synchronizing the SCHEMA.PULL "RPC call", executed on node 1 in response to a request from node 2, with the method that applies schema changes on node 1.


