Posted to commits@cassandra.apache.org by "Pavel Yaskevich (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2012/01/06 14:39:39 UTC

[jira] [Issue Comment Edited] (CASSANDRA-1391) Allow Concurrent Schema Migrations

    [ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181316#comment-13181316 ] 

Pavel Yaskevich edited comment on CASSANDRA-1391 at 1/6/12 1:39 PM:
--------------------------------------------------------------------

Let me make this clear: migrations use apply/diff internally for their actions on KEYSPACES_CF. The 0003 patch introduces a content-based schema version, which is calculated from KEYSPACES_CF.

KEYSPACES_CF layout

{noformat}
name: { // key
  'keyspace': str,
  'comparator': str,
  ... 
  'columns': { // composite!
    column name: {
      'validation_class': str,
      'index_type': str,
      'index_name': str,
      'index_options': { }
    }
  }
}
{noformat}
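
To illustrate what "content-based" means for a row shaped like the layout above, here is a minimal sketch. The sample row values, the `schema_version` helper, and the choice of a truncated SHA-256 over canonical JSON are all assumptions for illustration; this is not necessarily how the patch computes the version.

```python
import hashlib
import json
import uuid

# Hypothetical row following the KEYSPACES_CF layout; every value is made up.
row = {
    "Standard1": {
        "keyspace": "Keyspace1",
        "comparator": "BytesType",
        "columns": {
            "birthdate": {
                "validation_class": "LongType",
                "index_type": "KEYS",
                "index_name": "birthdate_idx",
                "index_options": {},
            }
        },
    }
}


def schema_version(keyspaces_cf: dict) -> uuid.UUID:
    """Derive a version from the schema content alone.

    Assumption: canonical JSON (sorted keys) hashed with SHA-256 and
    truncated to 16 bytes so the result fits in a UUID.
    """
    canonical = json.dumps(keyspaces_cf, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()[:16]
    return uuid.UUID(bytes=digest)


version = schema_version(row)
```

Two nodes holding identical schema content get the same version regardless of the order in which changes were applied, and any change to the content yields a different version.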

Schema distribution is now pull-oriented: node A (let's call it the coordinator) applies a migration locally and gossips its new (content-based) version to the ring. Node B checks whether its current version differs from node A's new version and, if so, makes a migration request to the coordinator by sending a MIGRATION_REQUEST message with the list of its local migrations attached. Upon receiving that message, the coordinator diffs B's migrations against its own and replies to B with the missing ones. The last thing for B to do is deserialize the received migrations and apply them one by one. On startup, a node uses the onAlive gossip handler to check versions on other nodes and request missing migrations if needed.
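
The exchange described above can be sketched roughly as follows. This is an in-process model only: the `Node` class, its method names, and the `(id, description)` migration tuples are hypothetical, the version is a simple digest of applied migration ids standing in for the content-based version, and direct method calls stand in for gossip and the MIGRATION_REQUEST message.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Applied migrations in order; each is a hypothetical (id, description) pair.
    migrations: list = field(default_factory=list)

    def version(self) -> str:
        # Stand-in for the content-based schema version.
        ids = ",".join(mid for mid, _ in self.migrations)
        return hashlib.sha256(ids.encode("utf-8")).hexdigest()

    def apply_local(self, migration) -> None:
        self.migrations.append(migration)

    def handle_migration_request(self, known_ids):
        """Coordinator side: return the migrations the requester lacks."""
        known = set(known_ids)
        return [m for m in self.migrations if m[0] not in known]

    def on_gossip(self, coordinator: "Node") -> int:
        """Receiver side: pull only if the gossiped version differs."""
        if coordinator.version() == self.version():
            return 0
        # Models MIGRATION_REQUEST with our local migrations attached.
        local_ids = [mid for mid, _ in self.migrations]
        missing = coordinator.handle_migration_request(local_ids)
        for migration in missing:  # apply one by one, in coordinator order
            self.apply_local(migration)
        return len(missing)


a, b = Node("A"), Node("B")
a.apply_local(("m1", "add keyspace ks1"))
b.on_gossip(a)                      # B pulls m1
a.apply_local(("m2", "add column family cf1"))
pulled = b.on_gossip(a)             # B only lacks m2
```

The sketch does not cover the case where B holds migrations the coordinator has not seen; it only shows the one-way pull from the paragraph above.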

This seems better than sending the whole KEYSPACES_CF on each schema change and letting the receiver decide what actions to take. Migrations are a good fit for that part because they know exactly what to do for a specific change, e.g. when an Add Keyspace is issued: load the new CF definitions into the schema, open the table, create the corresponding directories, etc.
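
As a rough illustration of why a migration is a convenient unit to ship, the sketch below has an Add Keyspace migration carry its own apply steps, so the receiver does not have to infer them from raw schema data. The `AddKeyspace` and `Node` classes and their attributes are hypothetical, and "opening the table" is modelled as adding a name to a set.

```python
import os
import tempfile


class Node:
    """Hypothetical receiver state: schema, open tables, data directory."""

    def __init__(self, data_dir: str):
        self.schema = {}
        self.open_tables = set()
        self.data_dir = data_dir


class AddKeyspace:
    """Hypothetical migration that knows how to apply itself."""

    def __init__(self, name: str, cf_defs: dict):
        self.name = name
        self.cf_defs = cf_defs

    def apply(self, node: Node) -> None:
        # 1. Load the new CF definitions into the schema.
        node.schema[self.name] = dict(self.cf_defs)
        # 2. Open the table (modelled as marking the keyspace open).
        node.open_tables.add(self.name)
        # 3. Create the corresponding directories.
        os.makedirs(os.path.join(node.data_dir, self.name), exist_ok=True)


node = Node(tempfile.mkdtemp())
AddKeyspace("ks1", {"cf1": {"comparator": "BytesType"}}).apply(node)
```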
                
> Allow Concurrent Schema Migrations
> ----------------------------------
>
>                 Key: CASSANDRA-1391
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Stu Hood
>            Assignee: Pavel Yaskevich
>             Fix For: 1.1
>
>         Attachments: 0001-new-migration-schema-and-avro-methods-cleanup.patch, 0002-avro-removal.patch, 0003-oldVersion-removed-new-migration-distribution-schema.patch, CASSANDRA-1391.patch
>
>
> CASSANDRA-1292 fixed multiple migrations started from the same node to properly queue themselves, but it is still possible for migrations initiated on different nodes to conflict and leave the cluster in a bad state. Since the system_add/drop/rename methods are accessible directly from the client API, they should be completely safe for concurrent use.
> It should be possible to allow for most types of concurrent migrations by converting the UUID schema ID into a VersionVectorClock (as provided by CASSANDRA-580).
