Posted to user@cassandra.apache.org by Jonathan Baynes <Jo...@tradeweb.com> on 2017/07/07 14:48:12 UTC

Move Production data to Development Cluster

Hi,

Can anyone help me? I'm trying (and failing) to move the data from my 3-node C* Production cluster to my 3-node Development cluster.

Here is the fine print...

Oracle Linux 7.3
C* 3.0.11

3 Nodes (256 virtual nodes each)
1 Keyspace (replication factor 3) Quorum Consistency
1 table

Snapshot taken on each node.

Attempt 1

I've tried the following (http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html)



  1.  From the old cluster, retrieve the list of tokens associated with each node's IP:

$ nodetool ring | grep ip_address_of_node | awk '{print $NF ","}' | xargs



I've done this for all 3 nodes and placed the tokens together in one string.

  2.  In the cassandra.yaml <http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html#opsSnapshotRestoreNewCluster__cassandrayaml> file for each node in the new cluster, add the list of tokens you obtained in the previous step to the initial_token <http://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__initial_token> parameter using the same num_tokens setting as in the old cluster.
Added all the tokens from step one.

  3.  Make any other necessary changes in the new cluster's cassandra.yaml <http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html#opsSnapshotRestoreNewCluster__cassandrayaml> and property files so that the new nodes match the old cluster settings. Make sure the seed nodes are set for the new cluster.
  4.  Clear the system table data from each new node:

$ sudo rm -rf /var/lib/cassandra/data/system/*

This allows the new nodes to use the initial tokens defined in the cassandra.yaml <http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html#opsSnapshotRestoreNewCluster__cassandrayaml> when they restart.

  5.  Start each node using the specified list of token ranges in new cluster's cassandra.yaml <http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html#opsSnapshotRestoreNewCluster__cassandrayaml>:

initial_token: -9211270970129494930, -9138351317258731895, -8980763462514965928, ...

  6.  Create schema in the new cluster. All the schemas from the old cluster must be reproduced in the new cluster.
  7.  Stop the node <http://docs.datastax.com/en/cassandra/3.0/cassandra/initialize/referenceStartStopTOC.html>. Using nodetool refresh is unsafe because files within the data directory of a running node can be silently overwritten by identically named just-flushed SSTables from memtable flushes or compaction. Copying files into the data directory and restarting the node will not work for the same reason.
  8.  Restore the SSTable files snapshotted <http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsBackupSnapshotRestore.html> from the old cluster onto the new cluster using the same directories, while noting that the UUID component of target directory names has changed. Without restoration, the new cluster will not have data to read upon restart.
  9.  Restart the node.
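
The token-collection step in the procedure above can also be done one node at a time, keeping a separate list per old node rather than one combined string. The following is only a sketch, assuming `nodetool ring` prints the node address in the first column and the token in the last column of each row; the function name `tokens_for_node` and the 10.0.0.x addresses are hypothetical. It matches the address exactly (a plain grep for 10.0.0.1 would also match 10.0.0.10) and leaves out the trailing comma.

```shell
# tokens_for_node IP
# Read `nodetool ring` output on stdin and print the tokens owned by IP
# as a single comma-separated line with no trailing comma.
# Assumption: address is the first field, token is the last field.
tokens_for_node() {
    awk -v ip="$1" '
        $1 == ip { printf "%s%s", sep, $NF; sep = ", " }
        END      { if (sep) print "" }
    '
}

# Hypothetical usage on an old-cluster node, one output file per node:
#   for ip in 10.0.0.1 10.0.0.2 10.0.0.3; do
#       nodetool ring | tokens_for_node "$ip" > "tokens_$ip.txt"
#   done
```

Each resulting file would then hold the value for `initial_token` on exactly one node of the new cluster.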

When I restart, I get errors about the YAML file that point to the token ranges.

If I take out the tokens for 2 of the nodes and use just one node's tokens, I can restore the data, but I get only a third of the row count I'd expect.

I then noticed I was only restoring the snapshot from that one node, so that made sense.

So then I took all of the snapshots from all of the nodes, placed them into one folder, re-added all the tokens and re-ran the process, but I got the token range error in the YAML again.
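
One plausible cause of the token range error (an assumption, since the error text is not quoted here) is a mismatch between the number of tokens in `initial_token` and the `num_tokens` setting: three nodes' tokens combined give 3 x 256 = 768 entries, while each node is configured for 256. A rough pre-flight check could look like this; the function names `count_tokens` and `check_initial_token` are made up for illustration.

```shell
# count_tokens "LIST"
# Print how many non-empty comma-separated entries LIST contains.
count_tokens() {
    printf '%s\n' "$1" | tr ',' '\n' | awk 'NF { n++ } END { print n + 0 }'
}

# check_initial_token "LIST" NUM_TOKENS
# Succeed only if LIST holds exactly NUM_TOKENS tokens; otherwise
# report the mismatch on stderr and return non-zero.
check_initial_token() {
    found=$(count_tokens "$1")
    if [ "$found" -ne "$2" ]; then
        echo "initial_token has $found tokens, num_tokens is $2" >&2
        return 1
    fi
}

# Hypothetical usage, with tokens_node1.txt holding one node's list:
#   check_initial_token "$(cat tokens_node1.txt)" 256
```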


Attempt 2

So then I tried sstableloader from the same folder, containing the snapshots from all 3 nodes, and then I get corruption on the SSTable.
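
A possible explanation, offered as an assumption rather than a diagnosis: SSTable file names carry a per-node generation number, so snapshots from different nodes can contain identically named files, and copying them all into one folder can mix or overwrite components. A sketch that stages each node's snapshot in its own directory ending in keyspace/table (the layout sstableloader expects) follows; `stage_snapshot`, the staging root and the node labels are hypothetical, and the snapshot directory is assumed to be flat.

```shell
# stage_snapshot SRC_DIR STAGING_ROOT NODE KEYSPACE TABLE
# Copy one node's snapshot files into STAGING_ROOT/NODE/KEYSPACE/TABLE
# and print that directory, so same-named SSTable files from different
# nodes never land in the same folder.
stage_snapshot() {
    dest="$2/$3/$4/$5"
    mkdir -p "$dest" && cp -p "$1"/* "$dest"/ && printf '%s\n' "$dest"
}

# Hypothetical usage, run once per old node; placeholders in <> are
# not filled in here:
#   dir=$(stage_snapshot <snapshot_dir_from_node1> /tmp/staging node1 <keyspace> <table>)
#   sstableloader -d <host_in_new_cluster> "$dir"
```

Loading one staged directory at a time keeps every run limited to files that came from a single node.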


Advice

I've tried this so many ways it's getting confusing. What's right?

Can anyone give me some pointers as to the best route to migrate data from cluster to cluster? The documentation is vague and not detailed enough.

Do I copy the snapshots from all the nodes?
Do I just work on one node at a time?

Any suggestions, please?

Thanks
J



Jonathan Baynes
DBA
Tradeweb Europe Limited
Moor Place  *  1 Fore Street Avenue  *  London EC2Y 9DT
P +44 (0)20 77760988  *  F +44 (0)20 7776 3201  *  M +44 (0)xxxx xxxxxx
Jonathan.Baynes@tradeweb.com

-----------------------------------------
A leading marketplace<http://www.tradeweb.com/About-Us/Awards/> for electronic fixed income, derivatives and ETF trading


________________________________________________________________________

This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy it. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. Tradeweb reserves the right to monitor all e-mail communications through its networks. If you do not wish to receive marketing emails about our products / services, please let us know by contacting us, either by email at contactus@tradeweb.com or by writing to us at the registered office of Tradeweb in the UK, which is: Tradeweb Europe Limited (company number 3912826), 1 Fore Street Avenue London EC2Y 9DT. To see our privacy policy, visit our website @ www.tradeweb.com.

Re: Move Production data to Development Cluster

Posted by Pranay akula <pr...@gmail.com>.
Hello Jonathan,

Since both clusters are the same size:

 Do I copy the snapshots from all the nodes?  Yes, this will work. Just make
sure that you are copying each node's data to the node with the associated tokens.


Thanks
Pranay.
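
The pairing described in this reply can be written down as a simple plan before anything is copied. This is a sketch only; the function name `print_restore_plan` and the host names prod1..prod3 and dev1..dev3 are hypothetical. It pairs the Nth old node with the Nth new node and refuses to continue if the two lists differ in length.

```shell
# print_restore_plan "OLD_NODES" "NEW_NODES"
# Both arguments are space-separated host lists. Prints one line per
# pair, or fails if the lists have different lengths.
print_restore_plan() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { n = split($0, src, " ") }
        NR == 2 { m = split($0, dst, " ") }
        END {
            if (n != m) { print "node counts differ" > "/dev/stderr"; exit 1 }
            for (i = 1; i <= n; i++)
                printf "%s -> %s: use tokens and snapshot of %s\n", src[i], dst[i], src[i]
        }'
}

# Hypothetical usage:
#   print_restore_plan "prod1 prod2 prod3" "dev1 dev2 dev3"
```

Each line of the plan names the one old node whose tokens go into `initial_token` on the paired new node and whose snapshot is restored there.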
