Posted to user@cassandra.apache.org by Kanwar Sangha <ka...@mavenir.com> on 2013/02/19 03:06:24 UTC

Cassandra backup

Hi - We have a requirement to store around 90 days of data per user. The last 7 days of data will be accessed frequently. Is there a way to keep the recent data (7 days) on SSD and the rest of the data on HDD? Or do we take a snapshot every 7 days and use a separate 'archive' cluster to serve the old data and an 'active' cluster to serve recent data?

Any links/thoughts would be helpful.

Thanks,
Kanwar

Re: Cassandra backup

Posted by aaron morton <aa...@thelastpickle.com>.
You'll need to use two CFs to achieve that. Denormalising to support a workload like that is not a terrible idea.

Depending on how big the 7-day hot set is, you may benefit from using a large row cache with one CF. It may be worth doing some testing.
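
One possible reading of the two-CF suggestion, sketched here as an illustration and not taken from the thread: write every event to both a "hot" CF and an "archive" CF, let a TTL expire the hot copy after 7 days and the archive copy after 90, and route reads by the age of the data. The CF names and column names below are hypothetical, and the statements are only built as strings, so nothing here talks to a cluster.

```python
from datetime import datetime, timedelta, timezone

HOT_DAYS = 7       # frequently read window (from the original question)
ARCHIVE_DAYS = 90  # full retention period (from the original question)

# Hypothetical column family names, for illustration only.
HOT_CF = "user_events_hot"
ARCHIVE_CF = "user_events_archive"


def ttl_seconds(days):
    """Convert a retention period in days to a TTL in seconds."""
    return int(timedelta(days=days).total_seconds())


def write_statements(columns=("user_id", "event_time", "payload")):
    """Build one CQL INSERT per CF; the application would run both per write."""
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)  # bind markers for a prepared statement
    return [
        f"INSERT INTO {cf} ({cols}) VALUES ({marks}) USING TTL {ttl_seconds(days)}"
        for cf, days in ((HOT_CF, HOT_DAYS), (ARCHIVE_CF, ARCHIVE_DAYS))
    ]


def cf_for_read(event_time, now=None):
    """Pick the CF to query: hot for data newer than 7 days, archive otherwise."""
    now = now or datetime.now(timezone.utc)
    return HOT_CF if now - event_time < timedelta(days=HOT_DAYS) else ARCHIVE_CF


if __name__ == "__main__":
    for statement in write_statements():
        print(statement)
```

The cost of this layout is a second write per event and roughly 7 days of duplicated data, which is the denormalisation trade-off mentioned above.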

Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com


RE: Cassandra backup

Posted by Kanwar Sangha <ka...@mavenir.com>.
Thanks. I will look into the details.

One issue I see: if I have only one column family, and only its last 7 days of data should be on SSD with the rest on HDD, how will that work?

Re: Cassandra backup

Posted by Michael Kjellman <mk...@barracuda.com>.
There is this:

http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-flexible-data-file-placement

But you'll need to design your data model around the fact that this is only as granular as one column family.
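
As I understand the linked post, from Cassandra 1.1 each column family's data files live in their own directory under the keyspace directory, so a whole CF can be placed on a different device by mounting or symlinking that directory. The sketch below only computes which directories would go where; the data root is the commonly used default and should be checked against data_file_directories in cassandra.yaml, and the keyspace and CF names are hypothetical.

```python
from pathlib import PurePosixPath

# Assumed data root; verify against data_file_directories in cassandra.yaml.
DATA_ROOT = PurePosixPath("/var/lib/cassandra/data")


def cf_data_dir(keyspace, column_family, data_root=DATA_ROOT):
    """Directory holding one CF's data files in the per-CF layout."""
    return data_root / keyspace / column_family


def placement_plan(keyspace, ssd_cfs, hdd_cfs):
    """Map each CF directory to the device class it should be mounted on."""
    plan = {}
    for cf in ssd_cfs:
        plan[str(cf_data_dir(keyspace, cf))] = "ssd"
    for cf in hdd_cfs:
        plan[str(cf_data_dir(keyspace, cf))] = "hdd"
    return plan


if __name__ == "__main__":
    # Hypothetical keyspace and CF names.
    plan = placement_plan("app", ["user_events_hot"], ["user_events_archive"])
    for directory, device in plan.items():
        print(f"{directory} -> {device}")
```

Because placement is per directory, a single CF cannot be split by row age, which is why the hot and archive data need separate CFs.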

Best,
michael
