Posted to user@cassandra.apache.org by Batranut Bogdan <ba...@yahoo.com> on 2014/11/27 13:54:11 UTC

Cassandra COPY to CSV and DateTieredCompactionStrategy

Hello all,
I have a few things that I need to understand.
1. Here is the scenario: we have a HUGE cf that receives daily writes; it is like a time series. Now we want to change the type of a column in the primary key. What I think we can do is export to CSV, create the new table, and write back the transformed data. But here is the catch: the cf receives constant writes. I assume that by the time the export finishes, new data will have been inserted into the source cf. So is there a tool that will export data without having to stop the writes?
2. I have seen that there is a new compaction strategy, DTCS, that is a better fit for historical data. Will this compaction strategy take into account the writeTime() of an entry, or will it be smart enough to detect that the column family is a time series and take those timestamps into account when creating the time windows? I am asking because when we write to the cf, the time for a particular record is 00:00h of a given day, so basically all entries have the same timestamp value in the cf but, of course, different writeTime() values.

Re: Cassandra COPY to CSV and DateTieredCompactionStrategy

Posted by Paulo Ricardo Motta Gomes <pa...@chaordicsystems.com>.
Regarding the first question, you need to configure your application to
write to both CFs (old and new) during the migration phase.
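
A minimal sketch of the dual-write idea, assuming the application funnels
all writes through one place. Everything here is hypothetical: the
DualWriter class, the to_new_schema function, the "day" column and its
text-to-int type change are made up for illustration, and plain Python
lists stand in for the two CFs. In a real application the two write
callables would wrap whatever driver calls you already use.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class DualWriter:
    """Sends every write to the old table and a transformed copy to the new one."""

    def __init__(
        self,
        write_old: Callable[[Row], None],
        write_new: Callable[[Row], None],
        transform: Callable[[Row], Row],
    ) -> None:
        self.write_old = write_old
        self.write_new = write_new
        self.transform = transform

    def write(self, row: Row) -> None:
        # The old table keeps receiving rows in the old format.
        self.write_old(row)
        # The new table receives the same row with the key column converted.
        self.write_new(self.transform(row))


def to_new_schema(row: Row) -> Row:
    # Hypothetical type change: the key column "day" goes from text to int.
    converted = dict(row)
    converted["day"] = int(row["day"])
    return converted


# Stand-ins for the two CFs; real code would issue INSERTs instead.
old_table: List[Row] = []
new_table: List[Row] = []

writer = DualWriter(old_table.append, new_table.append, to_new_schema)
writer.write({"day": "20141127", "value": 1.5})
```

Once both CFs receive live writes, the historical rows can be exported
and loaded into the new CF without stopping the application.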

I'm not sure about the second question, but my guess is that only the
writeTime will be taken into account.
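
If that guess holds, the sketch below shows why the distinction matters
for your data. It is not DTCS itself: it uses a fixed one-hour window
(WINDOW_SECONDS is an assumption for illustration; the real strategy
sizes its windows differently), and the record fields are made up. It
only contrasts grouping by write time with grouping by the value stored
in the timestamp column.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical fixed window size, used only to illustrate the grouping.
WINDOW_SECONDS = 3600


def window_of(write_time: datetime) -> int:
    """Index of the fixed-size window that a write time falls into."""
    return int(write_time.timestamp()) // WINDOW_SECONDS


# Every record carries the same column value (00:00h of the day),
# but each one was written at a different moment.
day = datetime(2014, 11, 27, 0, 0, tzinfo=timezone.utc)
records = [
    {"day": day, "write_time": datetime(2014, 11, 27, 9, 15, tzinfo=timezone.utc)},
    {"day": day, "write_time": datetime(2014, 11, 27, 9, 45, tzinfo=timezone.utc)},
    {"day": day, "write_time": datetime(2014, 11, 27, 13, 5, tzinfo=timezone.utc)},
]

# Grouping by write time spreads the records over several windows.
by_write_time = defaultdict(list)
for record in records:
    by_write_time[window_of(record["write_time"])].append(record)

# Grouping by the column value would put them all in one group.
by_column = {record["day"] for record in records}
```

Here the three records fall into two write-time windows even though they
share a single column value, so identical timestamp values in the cf
would not by themselves put everything into one window.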




-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br <http://www.chaordic.com.br/>*
+55 48 3232.3200