Posted to commits@cassandra.apache.org by "Yifan Cai (Jira)" <ji...@apache.org> on 2020/05/15 21:23:00 UTC

[jira] [Commented] (CASSANDRA-15807) Support multiple keyspaces in Cassandra-Diff

    [ https://issues.apache.org/jira/browse/CASSANDRA-15807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108672#comment-17108672 ] 

Yifan Cai commented on CASSANDRA-15807:
---------------------------------------

PR: https://github.com/apache/cassandra-diff/pull/8
Code: https://github.com/yifan-c/cassandra-diff/tree/support-multiple-keyspaces

At a high level, to support comparing tables across multiple keyspaces, the tool now uses a <keyspace, table> pair to locate each table. The comparison logic remains the same.

The change is relatively invasive because it alters both the input format and the schema for job metadata. Given that the tool is at an early stage, I think this is OK. Feel free to raise any objections.

End-to-end testing was performed locally to verify that the Spark job runs and the API service works as expected.

Changed items:
* Replaced the separate inputs, keyspace and tables, with a single list of keyspace-table pairs. A table is now identified by its keyspace and table name.
* Updated the schema of JobMetadataDb correspondingly, replacing every 'table_name' with 'keyspace_table_name' and 'table_names' with 'keyspace_table_names'.
* Changed DBService in the API service so that it can query with the new schema.
* Updated the README with an example of running with multiple keyspaces.
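
As a rough illustration of the first item (this is not the code in the PR), a table identity keyed on both keyspace and table name could be sketched in Python as follows. The class name `KeyspaceTablePair`, the `keyspace.table` string form, and the sample keyspace and table names are all assumptions made for this sketch; the actual input format is described in the README of the PR.

```python
from typing import List, NamedTuple


class KeyspaceTablePair(NamedTuple):
    """Hypothetical identity for a table: keyspace plus table name."""

    keyspace: str
    table: str

    @classmethod
    def parse(cls, qualified: str) -> "KeyspaceTablePair":
        # Assumed input form for this sketch: "keyspace.table".
        keyspace, sep, table = qualified.partition(".")
        if not sep or not keyspace or not table or "." in table:
            raise ValueError(f"expected 'keyspace.table', got {qualified!r}")
        return cls(keyspace, table)

    def qualified_name(self) -> str:
        # Single string that could serve as a combined key.
        return f"{self.keyspace}.{self.table}"


# Illustrative inputs only: two keyspaces that both contain a "users" table.
raw_inputs: List[str] = ["ks1.users", "ks2.users", "ks2.orders"]
pairs = [KeyspaceTablePair.parse(s) for s in raw_inputs]

# The same table name in different keyspaces yields distinct identities.
distinct_tables = set(pairs)
```

With the table name alone, `ks1.users` and `ks2.users` would collide; keying on the pair keeps them separate, which is why the job metadata has to be keyed the same way.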

> Support multiple keyspaces in Cassandra-Diff
> --------------------------------------------
>
>                 Key: CASSANDRA-15807
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15807
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tool/diff
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>
> Add support for running diff comparisons of tables across multiple keyspaces, to avoid bringing up multiple Spark jobs and to give a better view of data diffs across the whole cluster.
> Cassandra-diff currently supports comparing multiple tables only within a single keyspace.


