Posted to user@cassandra.apache.org by onmstester onmstester <on...@zoho.com> on 2018/04/08 09:15:40 UTC

copy from one table to another

Is there any way to copy part of a table to another table in Cassandra? A large amount of data has to be copied, so I don't want to fetch the data to a client and stream it back to Cassandra using CQL.



Re: copy from one table to another

Posted by Christophe Schmitz <ch...@instaclustr.com>.
If you need this kind of logic, you might want to consider using Spark.
It's often used for data migration.
You could load your list of partition keys into a Spark RDD, then
use joinWithCassandraTable and write the result back to your destination
table.
Just before the join, you could call repartitionByCassandraReplica on your
RDD to get better data locality.
This documentation can be helpful:
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/2_loading.md#performing-efficient-joins-with-cassandra-tables-since-12

Hope it helps

Cheers,
Christophe

On 9 April 2018 at 13:09, onmstester onmstester <on...@zoho.com> wrote:

> Thank you all
> I need something like this:
> insert into table test2 select * from test1 where
> partition_key='SOME_KEYS';
> The problem with copying SSTables is that the original table contains
> billions of records and I only want a few hundred million of them, so
> after copying big SSTables to so many nodes I would have to wait for a
> deletion that would take a very long time to complete:
> delete from test2 where partition_key != 'SOME_KEYS'
>
>
> ---- On Mon, 09 Apr 2018 06:14:02 +0430 Dmitry Saprykin
> <saprykin.dmitry@gmail.com> wrote ----
>
> IMHO The best step by step description of what you need to do is here
>
> https://issues.apache.org/jira/browse/CASSANDRA-1585?focusedCommentId=13488959&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13488959
>
> The only difference is that you need to copy data from one table only. I
> did it for a whole keyspace.
>
>
>
>
> On Sun, Apr 8, 2018 at 3:06 PM Jean Carlo <je...@gmail.com>
> wrote:
>
> You can use the same procedure as for restoring a table from a snapshot,
> described on the DataStax website:
>
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html
> Just two modifications:
>
> After step 5, rename the SSTables so that they carry the name of the table
> you want to copy to.
>
> In step 6, copy the SSTables to the directory corresponding to the table
> you want to copy to.
>
> Be sure you have a snapshot of the source table, and skip step 4, of
> course.
>
>
> Saludos
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
> On Sun, Apr 8, 2018 at 6:33 PM, Dmitry Saprykin <saprykin.dmitry@gmail.com
> > wrote:
>
> You can hard-link ALL SSTables from the old table into the new one and then
> delete the part of the data you do not need in the new one.
>
> On Sun, Apr 8, 2018 at 10:20 AM, Nitan Kainth <ni...@gmail.com>
> wrote:
>
> If it is for testing and you don’t need any specific data, just copy a set
> of SSTables, with all files of that sequence, move them to the target
> table's directory, and rename them.
>
> Restart the target node or run nodetool refresh.
>



Re: copy from one table to another

Posted by onmstester onmstester <on...@zoho.com>.
Thank you all

I need something like this:

insert into table test2 select * from test1 where partition_key='SOME_KEYS';

The problem with copying SSTables is that the original table contains billions of records and I only want a few hundred million of them, so after copying big SSTables to so many nodes I would have to wait for a deletion that would take a very long time to complete:

delete from test2 where partition_key != 'SOME_KEYS'


Re: copy from one table to another

Posted by Dmitry Saprykin <sa...@gmail.com>.
IMHO The best step by step description of what you need to do is here

https://issues.apache.org/jira/browse/CASSANDRA-1585?focusedCommentId=13488959&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13488959

The only difference is that you need to copy data from one table only. I
did it for a whole keyspace.





Re: copy from one table to another

Posted by Jean Carlo <je...@gmail.com>.
You can use the same procedure as for restoring a table from a snapshot,
described on the DataStax website:

https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html

Just two modifications:

After step 5, rename the SSTables so that they carry the name of the table
you want to copy to.

In step 6, copy the SSTables to the directory corresponding to the table
you want to copy to.


Be sure you have a snapshot of the source table, and skip step 4, of
course.


Saludos

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay


Re: copy from one table to another

Posted by Kyrylo Lebediev <Ky...@epam.com>.
Thank you, Rahul!

On Apr 19, 2018, 10:41 AM -0500, Kyrylo Lebediev <Ky...@epam.com>, wrote:

The table is too large to be copied quickly or efficiently, so I'd like to leverage the immutability of SSTables.

My idea is to:

1) create a new empty table (NewTable) with the same structure as the existing one (OldTable)
2) at some point run 'nodetool snapshot -t ttt <keyspace> OldTable' simultaneously on all nodes -- this will capture a point-in-time state of OldTable
3) on each node run:
       for each file in OldTable ttt snapshot directory:
             ln ..../<keyspace>/OldTable-<uuid>/snapshots/ttt/<keyspace>_OldTable_xxxxxx ...../<keyspace>/NewTable/<keyspace>_NewTable_xxxxx
     then:
     nodetool refresh <keyspace> NewTable
4) nodetool repair NewTable
5) use OldTable and NewTable independently (read/write)


Are there any issues with using hardlinks (ln) instead of copying (cp) in this case?


Thanks,

Kyrill
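
The per-node loop in step 3 above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a tested procedure: the function name link_snapshot, the rename rule (replace the old table name with the new one in each file name) and the list of non-SSTable files to skip are all hypothetical and have to be adapted to the SSTable naming scheme of the Cassandra version in use.

```python
import os
import tempfile
from pathlib import Path


def link_snapshot(snapshot_dir, target_dir, old_name, new_name,
                  skip=("manifest.json", "schema.cql")):
    """Hard-link every file of a snapshot directory into target_dir.

    old_name is replaced by new_name in each file name (hypothetical
    rename rule). Names listed in skip are assumed to be non-SSTable
    files and are ignored. Returns the created paths.
    """
    target = Path(target_dir)
    created = []
    for src in sorted(Path(snapshot_dir).iterdir()):
        if not src.is_file() or src.name in skip:
            continue
        dst = target / src.name.replace(old_name, new_name)
        # Same inode, no data copied; both paths must be on one filesystem.
        os.link(src, dst)
        created.append(dst)
    return created


if __name__ == "__main__":
    # Dry run on throwaway directories instead of real Cassandra data.
    with tempfile.TemporaryDirectory() as tmp:
        snap, new = Path(tmp, "snap"), Path(tmp, "new")
        snap.mkdir()
        new.mkdir()
        Path(snap, "ks_OldTable_1-Data.db").write_bytes(b"data")
        for path in link_snapshot(snap, new, "OldTable", "NewTable"):
            print(path.name)
```

After the links exist, 'nodetool refresh <keyspace> NewTable' would still have to be run on the node, as in step 3.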


________________________________
From: Rahul Singh <ra...@gmail.com>
Sent: Wednesday, April 18, 2018 2:07:17 AM
To: user@cassandra.apache.org
Subject: Re: copy from one table to another

1. Make a new table with the same schema.
Then, for each node:
2. Shut down the node.
3. Copy the data from the source SSTable directory to the new table's SSTable directory.

This will do what you want.

--
Rahul Singh
rahul.singh@anant.us

Anant Corporation

On Apr 16, 2018, 4:21 PM -0500, Kyrylo Lebediev <Ky...@epam.com>, wrote:
Thanks,  Ali.
I just need to copy a large table in production without actual copying by using hardlinks. After this both tables should be used independently (RW). Is this a supported way or not?

Regards,
Kyrill
________________________________
From: Ali Hubail <Al...@petrolink.com>
Sent: Monday, April 16, 2018 6:51:51 PM
To: user@cassandra.apache.org
Subject: Re: copy from one table to another

If you want to copy a portion of the data to another table, you can also use the SSTable CQL writer (CQLSSTableWriter). It is more of an advanced feature and can be tricky, but it is doable.
Once you have written the new SSTables, you can use sstableloader to stream the new data into the new table.
Check this out:
https://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated

I recently used this to clean up 500 GB worth of SSTable data in order to purge tombstones that were mistakenly generated by the client.
Obviously this is not as fast as hardlinks + refresh, but it's much faster and more efficient than using CQL to copy data across the tables.
Take advantage of CQLSSTableWriter.builder().sorted() if you can, and utilize writetime if you have to.

Ali Hubail

________________________________
From: Kyrylo Lebediev <Ky...@epam.com>
Sent: 04/16/2018 10:37 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: copy from one table to another

Any issues if we:

1) create a new empty table with the same structure as the old one
2) create hardlinks ("ln without -s"): .../<newtable>-<newuuid>/<newkeyspacename>-<newtable>-* ---> .../<oldtable>-<olduuid>/<oldkeyspacename>-<oldtable>-*
3) run nodetool refresh -- newkeyspacename newtable

and then query/modify both tables independently/simultaneously?

In theory, as SSTables are immutable, this should work, but could there be some hidden issues?

Regards,
Kyrill
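
The filesystem side of this question can be checked with a small Python sketch. It only demonstrates hard-link semantics on a throwaway file; it says nothing about how Cassandra itself tracks SSTables, and the file names are made up for the example. The point is that both names refer to the same inode, so removing one name (for instance when the old table compacts a file away) leaves the data reachable through the other. A write through either name would be visible through both, which is why the approach relies on SSTables never being modified in place.

```python
import os
import tempfile


def hardlink_demo():
    """Return (same_inode, link_count, bytes read after removing one name)."""
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "old-Data.db")  # stand-in for an SSTable file
        new = os.path.join(tmp, "new-Data.db")  # hypothetical name in the new table
        with open(old, "wb") as f:
            f.write(b"immutable bytes")

        os.link(old, new)  # same effect as "ln old new" (no -s)
        same_inode = os.path.samefile(old, new)
        link_count = os.stat(old).st_nlink

        os.remove(old)  # drop the original name only
        with open(new, "rb") as f:
            survivor = f.read()
    return same_inode, link_count, survivor


if __name__ == "__main__":
    print(hardlink_demo())
```

Note that hard links cannot cross filesystems, so the old and new table directories need to live on the same volume.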


Re: copy from one table to another

Posted by Rahul Singh <ra...@gmail.com>.
That’s correct.


Re: copy from one table to another

Posted by Kyrylo Lebediev <Ky...@epam.com>.
You mean that the correct table UUID should be specified as a suffix in the directory name?
For example:


Table:


cqlsh> select id from system_schema.tables where keyspace_name='test' and table_name='usr';

 id
--------------------------------------
 ea2f6da0-f931-11e7-8224-43ca70555242


Directory name:
./data/test/usr-ea2f6da0f93111e7822443ca70555242


Correct?


Regards,

Kyrill
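
The mapping asked about above can be written down in a few lines of Python. This is a minimal sketch that assumes exactly the layout shown in the example (table name, a dash, then the table id with its dashes removed); the function name table_dir_name is made up for illustration.

```python
import uuid


def table_dir_name(table_name: str, table_id: str) -> str:
    """Build '<table>-<id without dashes>' from a system_schema.tables id."""
    # uuid.UUID validates the string; .hex is the 32-digit form without dashes.
    return f"{table_name}-{uuid.UUID(table_id).hex}"


if __name__ == "__main__":
    print(table_dir_name("usr", "ea2f6da0-f931-11e7-8224-43ca70555242"))
    # usr-ea2f6da0f93111e7822443ca70555242
```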


Re: copy from one table to another

Posted by Rahul Singh <ra...@gmail.com>.
Each table has a different GUID; a hard link may work as long as the GUID in the SSTable directory name is the same as that of the newly created table in the system schema.

--
Rahul Singh
rahul.singh@anant.us

Anant Corporation

On Apr 19, 2018, 10:41 AM -0500, Kyrylo Lebediev <Ky...@epam.com>, wrote:
> The table is too large to be copied fast/effectively , so I'd like to leverage immutableness  property of SSTables.
>
> My idea is to:
> 1) create new empty table (NewTable) with the same structure as existing one (OldTable)
> 2) at some time run simultaneous 'nodetool snapshot -t ttt <keyspace> OldTable' on all nodes -- this will create point in time state of OldTable
> 3) on each node run:
>        for each file in OldTable ttt snapshot directory:
>              ln ..../<keyspace>/OldTable-<uuid>/snapshots/ttt/<keyspace>_OldTable_xxxxxx ...../<keyspace>/Newtable/<keyspace>_NewTable_xxxxx
>      then:
>      nodetool refresh <keyspace> NewTable
> 4) nodetool repair NewTable
> 5) Use OldTable and NewTable independently (Read/Write)
>
> Are there any issues with using hardlinks (ln) instead of copying (cp) in this case?
>
> Thanks,
> Kyrill
>

Re: copy from one table to another

Posted by Kyrylo Lebediev <Ky...@epam.com>.
The table is too large to be copied quickly or efficiently, so I'd like to leverage the immutability of SSTables.

My idea is to:

1) create a new empty table (NewTable) with the same structure as the existing one (OldTable)
2) at some point, run 'nodetool snapshot -t ttt -cf OldTable <keyspace>' simultaneously on all nodes -- this will create a point-in-time state of OldTable

3) on each node run:
       for each file in OldTable ttt snapshot directory:

             ln ..../<keyspace>/OldTable-<uuid>/snapshots/ttt/<keyspace>_OldTable_xxxxxx ...../<keyspace>/NewTable/<keyspace>_NewTable_xxxxx

     then:
     nodetool refresh <keyspace> NewTable

4) nodetool repair <keyspace> NewTable
5) use OldTable and NewTable independently (read/write)


Are there any issues with using hardlinks (ln) instead of copying (cp) in this case?
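
For illustration only, the per-node hardlinking step described above could be scripted roughly as follows. This is a minimal sketch, not a tested production procedure: the function name link_snapshot and its parameters are hypothetical, the directory paths must be supplied by the caller, and the rename rule simply replaces the old table name in each file name when it is present (if the name does not contain the table name, it is kept unchanged). Snapshot directories may also contain non-SSTable files such as a manifest; the sketch does not filter them out.

```python
import os


def link_snapshot(snapshot_dir, target_dir, old_table, new_table):
    """Hardlink every regular file in snapshot_dir into target_dir.

    Hypothetical helper: the old table name is replaced by the new one in
    each file name when present; otherwise the name is kept unchanged.
    Returns the list of paths that were created.
    """
    created = []
    for name in sorted(os.listdir(snapshot_dir)):
        src = os.path.join(snapshot_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories, if any
        dst = os.path.join(target_dir, name.replace(old_table, new_table))
        # os.link creates a hard link, like `ln` without -s. It raises
        # OSError if dst already exists or if the two paths are on
        # different filesystems.
        os.link(src, dst)
        created.append(dst)
    return created
```

After such a script has run on a node, 'nodetool refresh <keyspace> NewTable' would still be needed, as in the steps above.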


Thanks,

Kyrill



Re: copy from one table to another

Posted by Rahul Singh <ra...@gmail.com>.
1. Make a new table with the same schema.
For each node:
2. Shut down the node.
3. Copy the data from the source SSTable directory to the new table's SSTable directory.

This will do what you want.

--
Rahul Singh
rahul.singh@anant.us

Anant Corporation


Re: copy from one table to another

Posted by Kyrylo Lebediev <Ky...@epam.com>.
Thanks, Ali.
I just need to copy a large table in production without actually copying the data, by using hardlinks. Afterwards both tables should be usable independently (read/write). Is this a supported approach or not?

Regards,
Kyrill

Re: copy from one table to another

Posted by Ali Hubail <Al...@petrolink.com>.
If you want to copy a portion of the data to another table, you can also
use CQLSSTableWriter. It is more of an advanced feature and can be
tricky, but it is doable.
Once you have written the new SSTables, you can use sstableloader to
stream the new data into the new table.
Check this out:
https://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated

I recently used this to clean up 500 GB worth of SSTable data in
order to purge tombstones that were mistakenly generated by the client.
Obviously this is not as fast as hardlinks + refresh, but it's much faster
and more efficient than using CQL to copy data across the tables.
Take advantage of CQLSSTableWriter.builder().sorted() if you can, and
use writetime if you have to.

Ali Hubail

Confidentiality warning: This message and any attachments are intended 
only for the persons to whom this message is addressed, are confidential, 
and may be privileged. If you are not the intended recipient, you are 
hereby notified that any review, retransmission, conversion to hard copy, 
copying, modification, circulation or other use of this message and any 
attachments is strictly prohibited. If you receive this message in error, 
please notify the sender immediately by return email, and delete this 
message and any attachments from your system. Petrolink International 
Limited its subsidiaries, holding companies and affiliates disclaims all 
responsibility from and accepts no liability whatsoever for the 
consequences of any unauthorized person acting, or refraining from acting, 
on any information contained in this message. For security purposes, staff 
training, to assist in resolving complaints and to improve our customer 
service, email communications may be monitored and telephone calls may be 
recorded.




Re: copy from one table to another

Posted by Kyrylo Lebediev <Ky...@epam.com>.
Any issues if we:


1) create a new empty table with the same structure as the old one

2) create hardlinks ("ln without -s"): .../<newtable>-<newuuid>/<newkeyspacename>-<newtable>-* ---> .../<oldtable>-<olduuid>/<oldkeyspacename>-<oldtable>-*

3) run nodetool refresh -- newkeyspacename newtable


and then query/modify both tables independently/simultaneously?


In theory, as SSTables are immutable, this should work, but could there be some hidden issues?
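
The file-system side of this reasoning can be checked in isolation: two hardlinked names point at the same data on disk, and removing one name (as would happen when one table compacts an SSTable away) leaves the data readable through the other. The sketch below demonstrates only that behaviour, using a temporary directory and made-up file names rather than real SSTables; the function name is hypothetical, and it says nothing about Cassandra-specific issues.

```python
import os
import tempfile


def hardlink_survives_delete():
    """Show that data stays readable through a hard link after the
    original name is removed. Works in a temporary directory only."""
    with tempfile.TemporaryDirectory() as d:
        old_path = os.path.join(d, "old-Data.db")
        new_path = os.path.join(d, "new-Data.db")
        with open(old_path, "wb") as f:
            f.write(b"immutable sstable bytes")
        os.link(old_path, new_path)  # same as `ln old new`
        links_before = os.stat(new_path).st_nlink
        os.remove(old_path)  # e.g. the old table drops its copy
        links_after = os.stat(new_path).st_nlink
        with open(new_path, "rb") as f:
            data = f.read()
        return links_before, links_after, data
```

Two practical consequences follow from how hardlinks work: they can only be created within a single filesystem, and the disk space is released only once both names have been removed.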


Regards,

Kyrill


Re: copy from one table to another

Posted by Dmitry Saprykin <sa...@gmail.com>.
You can create hardlinks to ALL SSTables of the old table in the new
table's directory, and then delete the data you do not need from the new table.


Re: copy from one table to another

Posted by Nitan Kainth <ni...@gmail.com>.
If it is for testing and you don't need any specific data, just copy a set of SSTables, with all the files of that sequence, into the target table's directory and rename them.

Then restart the target node or run nodetool refresh.
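
For illustration, picking "all the files of that sequence" could look like the sketch below. It assumes the 3.x-style naming in which every component of one SSTable shares a prefix such as mc-5-big- (for example mc-5-big-Data.db); the function name, the prefix, and the directories are hypothetical, and older versions name their files differently, so the prefix would need adjusting and the files renaming.

```python
import os
import shutil


def copy_sstable(source_dir, target_dir, prefix):
    """Copy every component file whose name starts with prefix.

    Returns the sorted list of file names that were copied.
    """
    copied = []
    for name in sorted(os.listdir(source_dir)):
        if name.startswith(prefix):
            # copy2 also preserves timestamps and permission bits
            shutil.copy2(os.path.join(source_dir, name),
                         os.path.join(target_dir, name))
            copied.append(name)
    return copied
```

Including the trailing dash in the prefix (mc-5-big- rather than mc-5) keeps the files of generation 50 from being picked up along with generation 5.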

Sent from my iPhone
