Posted to dev@cassandra.apache.org by Stone Fang <cn...@gmail.com> on 2016/06/30 09:52:43 UTC

cassandra archive data

Hi All,

I want to archive Cassandra data, and I have seen a ticket on this:
https://issues.apache.org/jira/browse/CASSANDRA-8460

However, I think this solution may still cause some issues.

Q1. From the application's perspective, we rarely use the archived data.
But when we scale the cluster by adding or decommissioning a node, data is
streamed between nodes.
Since the archived sstables are still in the token ring, bootstrap may take
a long time to finish when the archived data is large.

S1:
We can add a flag to each sstable (boolean archived)
and, when streaming sstables, filter out the archived ones.
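
To make S1 concrete, here is a minimal sketch of the filtering step. It does not use Cassandra's real streaming classes: SSTableInfo, its archived field, and selectForStreaming are hypothetical names made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class ArchivedStreamFilter {
    // Hypothetical stand-in for sstable metadata; NOT Cassandra's real class.
    static final class SSTableInfo {
        final String name;
        final boolean archived; // the proposed flag from S1

        SSTableInfo(String name, boolean archived) {
            this.name = name;
            this.archived = archived;
        }
    }

    // Returns only the sstables that are not flagged as archived,
    // i.e. the ones that would still be streamed to another node.
    static List<SSTableInfo> selectForStreaming(List<SSTableInfo> candidates) {
        List<SSTableInfo> result = new ArrayList<>();
        for (SSTableInfo sstable : candidates) {
            if (!sstable.archived) {
                result.add(sstable);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<SSTableInfo> candidates = new ArrayList<>();
        candidates.add(new SSTableInfo("live-1", false));
        candidates.add(new SSTableInfo("archived-1", true));
        for (SSTableInfo sstable : selectForStreaming(candidates)) {
            System.out.println(sstable.name);
        }
    }
}
```

With this approach the archived sstables simply never enter the set of files handed to streaming, so bootstrap cost would depend only on the live data.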

Q2.
Why not separate "archive sstable" from the compaction strategy?
Archiving sstables is not a continuous, real-time task; we just need to run
it periodically.
I mean we should not mix the "archive sstable" code with the compaction code.

S2:
We can provide an sstable tool to archive data. Splitting sstables by date
is the job of the compaction strategy.
We don't care whether it is DTCS or TWCS.
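
As a rough sketch of what the selection step of such a tool might look like, assuming each sstable exposes the newest write timestamp it contains: SSTableInfo, maxTimestampMillis, and selectArchivable are hypothetical names for illustration, not an existing Cassandra API, and the age threshold is an assumed parameter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class ArchiveSelector {
    // Hypothetical stand-in for sstable metadata; NOT Cassandra's real class.
    static final class SSTableInfo {
        final String name;
        final long maxTimestampMillis; // newest write in this sstable (assumed available)

        SSTableInfo(String name, long maxTimestampMillis) {
            this.name = name;
            this.maxTimestampMillis = maxTimestampMillis;
        }
    }

    // Picks sstables whose newest data is older than minAgeDays.
    // This relies on the compaction strategy (DTCS or TWCS) having already
    // split the data by date; the tool itself does no splitting.
    static List<SSTableInfo> selectArchivable(List<SSTableInfo> sstables,
                                              long nowMillis, long minAgeDays) {
        long cutoff = nowMillis - TimeUnit.DAYS.toMillis(minAgeDays);
        List<SSTableInfo> result = new ArrayList<>();
        for (SSTableInfo sstable : sstables) {
            if (sstable.maxTimestampMillis < cutoff) {
                result.add(sstable);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<SSTableInfo> sstables = new ArrayList<>();
        sstables.add(new SSTableInfo("old", now - TimeUnit.DAYS.toMillis(400)));
        sstables.add(new SSTableInfo("recent", now - TimeUnit.DAYS.toMillis(5)));
        for (SSTableInfo sstable : selectArchivable(sstables, now, 365)) {
            System.out.println(sstable.name);
        }
    }
}
```

A tool like this could be run from a scheduler at whatever interval suits the cluster, which keeps the archive logic out of the compaction code path entirely.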

stone

Thanks in advance!