Posted to user@cassandra.apache.org by Edmond Lau <ed...@ooyala.com> on 2009/10/05 07:52:15 UTC

backing up data from cassandra

For folks who are using or considering using Cassandra in their
production systems, what do you use for backups?

With HBase, one could potentially write a MapReduce job to perform a
row scan of the entire table (restricted to some historical timestamp
to get a consistent view) and export the data to HDFS.  With
Cassandra, if you're using an ordered partitioner, a similar mechanism
could be built over a key range scan.

With a random partitioner, though, there's no API to iterate through
all existing keys.  Why not?

Edmond
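
[Editor's sketch of the key-range mechanism described above, for the
ordered-partitioner case.  `get_key_range` and `write_row` are
hypothetical stand-ins for a real client call and an export step, not
Cassandra's actual API; the toy in-memory store just demonstrates the
paging behaviour.]

```python
def backup_key_range(get_key_range, write_row, batch_size=1000):
    """Page through all keys in sort order by restarting each scan
    at the last key seen (illustrative only)."""
    start = ""
    first_page = True
    while True:
        keys = get_key_range(start, batch_size)
        for key in keys:
            if first_page or key != start:  # the resume key comes back; skip it
                write_row(key)
        if len(keys) < batch_size:
            break  # a short page means we've reached the end of the range
        start = keys[-1]
        first_page = False

# toy stand-in for a real cluster, just to show the paging behaviour
store = ["key%03d" % i for i in range(25)]
exported = []
backup_key_range(lambda s, n: [k for k in store if k >= s][:n],
                 exported.append, batch_size=10)
print(len(exported))  # -> 25
```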

Re: backing up data from cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
On Mon, Oct 5, 2009 at 11:10 AM, Anthony Molinaro
<an...@alumni.caltech.edu> wrote:
> I assume the server also needs to be stopped while you are swapping
> files, but what about if you have a cluster of several servers and
> need to restore.  Is the process to shut down all the servers, move
> the files and restart? Or can you do it one at a time.  (I assume
> one at a time might mean a lot of read-repair work happening, so not
> a good idea).

Right, you could end up with newer data mixed in with your backed up
version which is probably not what you want.

> Also, is it best to flush_binary (which I think flushes in memory
> tables to disk), and compact prior to snapshotting?

flush_binary only acts on the binary memtables.  I wrote earlier:

>> note that the 0.4 branch, which will become 0.4.1, automatically
>> flushes each columnfamily when you ask for a snapshot of the table, so
>> you don't have to do that manually anymore.

Compaction isn't necessary and would slow down snapshotting considerably.

-Jonathan

Re: backing up data from cassandra

Posted by Anthony Molinaro <an...@alumni.caltech.edu>.
I assume the server also needs to be stopped while you are swapping
files, but what about if you have a cluster of several servers and
need to restore.  Is the process to shut down all the servers, move
the files and restart?  Or can you do it one at a time.  (I assume
one at a time might mean a lot of read-repair work happening, so not
a good idea).

Also, is it best to flush_binary (which I think flushes in memory
tables to disk), and compact prior to snapshotting?

-Anthony

On Mon, Oct 05, 2009 at 08:09:48AM -0500, Jonathan Ellis wrote:
> bin/nodeprobe snapshot
> 
> to restore, move the snapshot sstables from the snapshot location to
> the live data location (e.g. with dsh).
> 
> note that the 0.4 branch, which will become 0.4.1, automatically
> flushes each columnfamily when you ask for a snapshot of the table, so
> you don't have to do that manually anymore.
> 
> On Mon, Oct 5, 2009 at 8:05 AM, Joe Van Dyk <jo...@gmail.com> wrote:
> > How do you take the snapshot?  What's the restore process?
> >
> > On Mon, Oct 5, 2009 at 5:22 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> >> You can take a snapshot and either leave it in place indefinitely or
> >> throw it into your existing backup ecosystem.  That's your best option
> >> for backup no matter which kind of partitioner you're using.
> >>
> >> -Jonathan
> >>
> >> On Mon, Oct 5, 2009 at 12:52 AM, Edmond Lau <ed...@ooyala.com> wrote:
> >>> For folks who are using or considering using cassandra in their
> >>> production systems, what do you use for backups?
> >>>
> >>> With HBase, one could potentially write a mapreduce to perform a row
> >>> scan of the entire table (restricted to some historical timestamp to
> >>> get a consistent view) and export the data to hdfs.  With Cassandra,
> >>> if you're using an ordered partitioner, a similar mechanism could be
> >>> built over a key range scan.
> >>>
> >>> With a random partitioner, though, there's no api to iterate through
> >>> all existing keys.  Why not?
> >>>
> >>> Edmond
> >>>
> >>
> >
> >
> >
> > --
> > Joe Van Dyk
> > http://fixieconsulting.com
> >

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <an...@alumni.caltech.edu>

Re: backing up data from cassandra

Posted by Chris Were <ch...@gmail.com>.
Better late than never... ticket created.

On Thu, Oct 29, 2009 at 4:33 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> programmatically, yes, but nodeprobe doesn't expose that yet.  Feel
> free to create a ticket.
>
> On Thu, Oct 29, 2009 at 4:20 AM, Chris Were <ch...@gmail.com> wrote:
> > Is it possible to only backup selected column families?
> > On Wed, Oct 7, 2009 at 8:15 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> >>
> >> I don't really see "nodeprobe snapshot" and "mv snapshotdir/* livedir"
> >> as all that much harder, but maybe that's just me.
> >>
> >> for a cluster, just add dsh.
> >>
> >> -Jonathan
> >>
> >> On Tue, Oct 6, 2009 at 3:42 PM, Joe Van Dyk <jo...@gmail.com> wrote:
> >> > Sure not as easy as a "pg_dump db > dump.sql" and "psql db < dump.sql"
> >> > though.  Oh well.
> >> >
> >> >
> >> >
> >> >> On Tue, Oct 6, 2009 at 11:28 AM, Edmond Lau <ed...@ooyala.com> wrote:
> >> >> Thanks for the replies guys.  It sounds like restoration via snapshots
> >> >> + some application-side logic to sanity check/repair any data around
> >> >> the snapshot time is the way to go.
> >> >>
> >> >> Edmond
> >> >>
> >> >> On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis <jb...@gmail.com>
> >> >> wrote:
> >> >>> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken
> >> >>> <tv...@rightscale.com> wrote:
> >> >>>> Isn't the question about how you back up a cassandra cluster, not a
> >> >>>> single node?
> >> >>>
> >> >>> Sure, but the generalization is straightforward. :)
> >> >>>
> >> >>>> Can you snapshot the various nodes at different times or do
> >> >>>> they need to be synchronized?
> >> >>>
> >> >>> The closer the synchronization, the more consistent they will be.
> >> >>> (Since Cassandra is designed around eventual consistency, there's some
> >> >>> flexibility here.  Conversely, there's no way to tell the system
> >> >>> "don't accept any more writes until the snapshot is done.")
> >> >>>
> >> >>>> Is there a minimal set of nodes that are
> >> >>>> sufficient to back up?
> >> >>>
> >> >>> Assuming your replication is 100% up to date, backing up every N nodes
> >> >>> where N is the replication factor could be adequate in theory, but I
> >> >>> wouldn't recommend trying to be clever like that, since if you
> >> >>> "restored" from backup like that your system would be in a degraded
> >> >>> state and vulnerable to any of the restored nodes failing.
> >> >>>
> >> >>> -Jonathan
> >> >>>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Joe Van Dyk
> >> > http://fixieconsulting.com
> >> >
> >
> >
>

Re: backing up data from cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
programmatically, yes, but nodeprobe doesn't expose that yet.  Feel
free to create a ticket.

On Thu, Oct 29, 2009 at 4:20 AM, Chris Were <ch...@gmail.com> wrote:
> Is it possible to only backup selected column families?
> On Wed, Oct 7, 2009 at 8:15 AM, Jonathan Ellis <jb...@gmail.com> wrote:
>>
>> I don't really see "nodeprobe snapshot" and "mv snapshotdir/* livedir"
>> as all that much harder, but maybe that's just me.
>>
>> for a cluster, just add dsh.
>>
>> -Jonathan
>>
>> On Tue, Oct 6, 2009 at 3:42 PM, Joe Van Dyk <jo...@gmail.com> wrote:
>> > Sure not as easy as a "pg_dump db > dump.sql" and "psql db < dump.sql"
>> > though.  Oh well.
>> >
>> >
>> >
>> > On Tue, Oct 6, 2009 at 11:28 AM, Edmond Lau <ed...@ooyala.com> wrote:
>> >> Thanks for the replies guys.  It sounds like restoration via snapshots
>> >> + some application-side logic to sanity check/repair any data around
>> >> the snapshot time is the way to go.
>> >>
>> >> Edmond
>> >>
>> >> On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis <jb...@gmail.com>
>> >> wrote:
>> >>> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken
>> >>> <tv...@rightscale.com> wrote:
>> >>>> Isn't the question about how you back up a cassandra cluster, not a
>> >>>> single node?
>> >>>
>> >>> Sure, but the generalization is straightforward. :)
>> >>>
>> >>>> Can you snapshot the various nodes at different times or do
>> >>>> they need to be synchronized?
>> >>>
>> >>> The closer the synchronization, the more consistent they will be.
>> >>> (Since Cassandra is designed around eventual consistency, there's some
>> >>> flexibility here.  Conversely, there's no way to tell the system
>> >>> "don't accept any more writes until the snapshot is done.")
>> >>>
>> >>>> Is there a minimal set of nodes that are
>> >>>> sufficient to back up?
>> >>>
>> >>> Assuming your replication is 100% up to date, backing up every N nodes
>> >>> where N is the replication factor could be adequate in theory, but I
>> >>> wouldn't recommend trying to be clever like that, since if you
>> >>> "restored" from backup like that your system would be in a degraded
>> >>> state and vulnerable to any of the restored nodes failing.
>> >>>
>> >>> -Jonathan
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > Joe Van Dyk
>> > http://fixieconsulting.com
>> >
>
>

Re: backing up data from cassandra

Posted by Chris Were <ch...@gmail.com>.
Is it possible to only backup selected column families?

On Wed, Oct 7, 2009 at 8:15 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> I don't really see "nodeprobe snapshot" and "mv snapshotdir/* livedir"
> as all that much harder, but maybe that's just me.
>
> for a cluster, just add dsh.
>
> -Jonathan
>
> On Tue, Oct 6, 2009 at 3:42 PM, Joe Van Dyk <jo...@gmail.com> wrote:
> > Sure not as easy as a "pg_dump db > dump.sql" and "psql db < dump.sql"
> > though.  Oh well.
> >
> >
> >
> > On Tue, Oct 6, 2009 at 11:28 AM, Edmond Lau <ed...@ooyala.com> wrote:
> >> Thanks for the replies guys.  It sounds like restoration via snapshots
> >> + some application-side logic to sanity check/repair any data around
> >> the snapshot time is the way to go.
> >>
> >> Edmond
> >>
> >> On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> >>> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken <tv...@rightscale.com> wrote:
> >>>> Isn't the question about how you back up a cassandra cluster, not a
> >>>> single node?
> >>>
> >>> Sure, but the generalization is straightforward. :)
> >>>
> >>>> Can you snapshot the various nodes at different times or do
> >>>> they need to be synchronized?
> >>>
> >>> The closer the synchronization, the more consistent they will be.
> >>> (Since Cassandra is designed around eventual consistency, there's some
> >>> flexibility here.  Conversely, there's no way to tell the system
> >>> "don't accept any more writes until the snapshot is done.")
> >>>
> >>>> Is there a minimal set of nodes that are
> >>>> sufficient to back up?
> >>>
> >>> Assuming your replication is 100% up to date, backing up every N nodes
> >>> where N is the replication factor could be adequate in theory, but I
> >>> wouldn't recommend trying to be clever like that, since if you
> >>> "restored" from backup like that your system would be in a degraded
> >>> state and vulnerable to any of the restored nodes failing.
> >>>
> >>> -Jonathan
> >>>
> >>
> >
> >
> >
> > --
> > Joe Van Dyk
> > http://fixieconsulting.com
> >
>

Re: backing up data from cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
I don't really see "nodeprobe snapshot" and "mv snapshotdir/* livedir"
as all that much harder, but maybe that's just me.

for a cluster, just add dsh.

-Jonathan

On Tue, Oct 6, 2009 at 3:42 PM, Joe Van Dyk <jo...@gmail.com> wrote:
> Sure not as easy as a "pg_dump db > dump.sql" and "psql db < dump.sql"
> though.  Oh well.
>
>
>
> On Tue, Oct 6, 2009 at 11:28 AM, Edmond Lau <ed...@ooyala.com> wrote:
>> Thanks for the replies guys.  It sounds like restoration via snapshots
>> + some application-side logic to sanity check/repair any data around
>> the snapshot time is the way to go.
>>
>> Edmond
>>
>> On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken <tv...@rightscale.com> wrote:
>>>> Isn't the question about how you back up a cassandra cluster, not a
>>>> single node?
>>>
>>> Sure, but the generalization is straightforward. :)
>>>
>>>> Can you snapshot the various nodes at different times or do
>>>> they need to be synchronized?
>>>
>>> The closer the synchronization, the more consistent they will be.
>>> (Since Cassandra is designed around eventual consistency, there's some
>>> flexibility here.  Conversely, there's no way to tell the system
>>> "don't accept any more writes until the snapshot is done.")
>>>
>>>> Is there a minimal set of nodes that are
>>>> sufficient to back up?
>>>
>>> Assuming your replication is 100% up to date, backing up every N nodes
>>> where N is the replication factor could be adequate in theory, but I
>>> wouldn't recommend trying to be clever like that, since if you
>>> "restored" from backup like that your system would be in a degraded
>>> state and vulnerable to any of the restored nodes failing.
>>>
>>> -Jonathan
>>>
>>
>
>
>
> --
> Joe Van Dyk
> http://fixieconsulting.com
>

Re: backing up data from cassandra

Posted by Joe Van Dyk <jo...@gmail.com>.
Sure, not as easy as a "pg_dump db > dump.sql" and "psql db < dump.sql",
though.  Oh well.



On Tue, Oct 6, 2009 at 11:28 AM, Edmond Lau <ed...@ooyala.com> wrote:
> Thanks for the replies guys.  It sounds like restoration via snapshots
> + some application-side logic to sanity check/repair any data around
> the snapshot time is the way to go.
>
> Edmond
>
> On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis <jb...@gmail.com> wrote:
>> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken <tv...@rightscale.com> wrote:
>>> Isn't the question about how you back up a cassandra cluster, not a
>>> single node?
>>
>> Sure, but the generalization is straightforward. :)
>>
>>> Can you snapshot the various nodes at different times or do
>>> they need to be synchronized?
>>
>> The closer the synchronization, the more consistent they will be.
>> (Since Cassandra is designed around eventual consistency, there's some
>> flexibility here.  Conversely, there's no way to tell the system
>> "don't accept any more writes until the snapshot is done.")
>>
>>> Is there a minimal set of nodes that are
>>> sufficient to back up?
>>
>> Assuming your replication is 100% up to date, backing up every N nodes
>> where N is the replication factor could be adequate in theory, but I
>> wouldn't recommend trying to be clever like that, since if you
>> "restored" from backup like that your system would be in a degraded
>> state and vulnerable to any of the restored nodes failing.
>>
>> -Jonathan
>>
>



-- 
Joe Van Dyk
http://fixieconsulting.com

Re: backing up data from cassandra

Posted by Edmond Lau <ed...@ooyala.com>.
Thanks for the replies guys.  It sounds like restoration via snapshots
+ some application-side logic to sanity check/repair any data around
the snapshot time is the way to go.

Edmond

On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken <tv...@rightscale.com> wrote:
>> Isn't the question about how you back up a cassandra cluster, not a
>> single node?
>
> Sure, but the generalization is straightforward. :)
>
>> Can you snapshot the various nodes at different times or do
>> they need to be synchronized?
>
> The closer the synchronization, the more consistent they will be.
> (Since Cassandra is designed around eventual consistency, there's some
> flexibility here.  Conversely, there's no way to tell the system
> "don't accept any more writes until the snapshot is done.")
>
>> Is there a minimal set of nodes that are
>> sufficient to back up?
>
> Assuming your replication is 100% up to date, backing up every N nodes
> where N is the replication factor could be adequate in theory, but I
> wouldn't recommend trying to be clever like that, since if you
> "restored" from backup like that your system would be in a degraded
> state and vulnerable to any of the restored nodes failing.
>
> -Jonathan
>

Re: backing up data from cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken <tv...@rightscale.com> wrote:
> Isn't the question about how you back up a cassandra cluster, not a
> single node?

Sure, but the generalization is straightforward. :)

> Can you snapshot the various nodes at different times or do
> they need to be synchronized?

The closer the synchronization, the more consistent they will be.
(Since Cassandra is designed around eventual consistency, there's some
flexibility here.  Conversely, there's no way to tell the system
"don't accept any more writes until the snapshot is done.")

> Is there a minimal set of nodes that are
> sufficient to back up?

Assuming your replication is 100% up to date, backing up every N nodes
where N is the replication factor could be adequate in theory, but I
wouldn't recommend trying to be clever like that, since if you
"restored" from backup like that your system would be in a degraded
state and vulnerable to any of the restored nodes failing.

-Jonathan
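
[Editor's note: the every-Nth-node idea above, purely as a toy
illustration.  It assumes the simplest successor-based replica
placement, a ring whose size divides evenly by the replication factor,
and fully caught-up replicas — hence the caution against actually
relying on it.]

```python
def minimal_backup_set(ring_nodes, replication_factor):
    """Every RF-th node in ring order.  Covers every range only under
    the simplest placement (each row on a node plus its RF-1 successors),
    only when the ring size divides evenly by RF, and only if
    replication is fully up to date.  Illustration, not advice."""
    return ring_nodes[::replication_factor]

print(minimal_backup_set(["n1", "n2", "n3", "n4", "n5", "n6"], 3))  # -> ['n1', 'n4']
```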

Re: backing up data from cassandra

Posted by Thorsten von Eicken <tv...@rightscale.com>.
Isn't the question about how you back up a cassandra cluster, not a
single node? Can you snapshot the various nodes at different times or do
they need to be synchronized? Is there a minimal set of nodes that are
sufficient to back up?
    Thorsten

Jonathan Ellis wrote:
> bin/nodeprobe snapshot
>
> to restore, move the snapshot sstables from the snapshot location to
> the live data location (e.g. with dsh).
>
> note that the 0.4 branch, which will become 0.4.1, automatically
> flushes each columnfamily when you ask for a snapshot of the table, so
> you don't have to do that manually anymore.
>
> On Mon, Oct 5, 2009 at 8:05 AM, Joe Van Dyk <jo...@gmail.com> wrote:
>   
>> How do you take the snapshot?  What's the restore process?
>>
>> On Mon, Oct 5, 2009 at 5:22 AM, Jonathan Ellis <jb...@gmail.com> wrote:
>>     
>>> You can take a snapshot and either leave it in place indefinitely or
>>> throw it into your existing backup ecosystem.  That's your best option
>>> for backup no matter which kind of partitioner you're using.
>>>
>>> -Jonathan
>>>
>>> On Mon, Oct 5, 2009 at 12:52 AM, Edmond Lau <ed...@ooyala.com> wrote:
>>>       
>>>> For folks who are using or considering using cassandra in their
>>>> production systems, what do you use for backups?
>>>>
>>>> With HBase, one could potentially write a mapreduce to perform a row
>>>> scan of the entire table (restricted to some historical timestamp to
>>>> get a consistent view) and export the data to hdfs.  With Cassandra,
>>>> if you're using an ordered partitioner, a similar mechanism could be
>>>> built over a key range scan.
>>>>
>>>> With a random partitioner, though, there's no api to iterate through
>>>> all existing keys.  Why not?
>>>>
>>>> Edmond
>>>>
>>>>         
>> --
>> Joe Van Dyk
>> http://fixieconsulting.com
>>
>>     
>
>   


Re: backing up data from cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
bin/nodeprobe snapshot

to restore, move the snapshot sstables from the snapshot location to
the live data location (e.g. with dsh).

note that the 0.4 branch, which will become 0.4.1, automatically
flushes each columnfamily when you ask for a snapshot of the table, so
you don't have to do that manually anymore.
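
[Editor's sketch of that restore step as a small script.  The
directory layout is illustrative, not Cassandra's actual on-disk
layout — adapt the paths to your own data directory.]

```python
import shutil
from pathlib import Path

def restore_snapshot(snapshot_dir, live_dir):
    """Move snapshot sstables back into the live data directory.

    Run with the node stopped; clear stale live sstables first if the
    restored state should be exactly the snapshot.
    """
    live = Path(live_dir)
    live.mkdir(parents=True, exist_ok=True)
    moved = []
    for part in sorted(Path(snapshot_dir).iterdir()):
        if part.is_file():  # sstable components, e.g. *-Data.db, *-Index.db
            shutil.move(str(part), str(live / part.name))
            moved.append(part.name)
    return moved
```

Run once per node (e.g. fanned out with dsh), then restart the node.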

On Mon, Oct 5, 2009 at 8:05 AM, Joe Van Dyk <jo...@gmail.com> wrote:
> How do you take the snapshot?  What's the restore process?
>
> On Mon, Oct 5, 2009 at 5:22 AM, Jonathan Ellis <jb...@gmail.com> wrote:
>> You can take a snapshot and either leave it in place indefinitely or
>> throw it into your existing backup ecosystem.  That's your best option
>> for backup no matter which kind of partitioner you're using.
>>
>> -Jonathan
>>
>> On Mon, Oct 5, 2009 at 12:52 AM, Edmond Lau <ed...@ooyala.com> wrote:
>>> For folks who are using or considering using cassandra in their
>>> production systems, what do you use for backups?
>>>
>>> With HBase, one could potentially write a mapreduce to perform a row
>>> scan of the entire table (restricted to some historical timestamp to
>>> get a consistent view) and export the data to hdfs.  With Cassandra,
>>> if you're using an ordered partitioner, a similar mechanism could be
>>> built over a key range scan.
>>>
>>> With a random partitioner, though, there's no api to iterate through
>>> all existing keys.  Why not?
>>>
>>> Edmond
>>>
>>
>
>
>
> --
> Joe Van Dyk
> http://fixieconsulting.com
>

Re: backing up data from cassandra

Posted by Joe Van Dyk <jo...@gmail.com>.
How do you take the snapshot?  What's the restore process?

On Mon, Oct 5, 2009 at 5:22 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> You can take a snapshot and either leave it in place indefinitely or
> throw it into your existing backup ecosystem.  That's your best option
> for backup no matter which kind of partitioner you're using.
>
> -Jonathan
>
> On Mon, Oct 5, 2009 at 12:52 AM, Edmond Lau <ed...@ooyala.com> wrote:
>> For folks who are using or considering using cassandra in their
>> production systems, what do you use for backups?
>>
>> With HBase, one could potentially write a mapreduce to perform a row
>> scan of the entire table (restricted to some historical timestamp to
>> get a consistent view) and export the data to hdfs.  With Cassandra,
>> if you're using an ordered partitioner, a similar mechanism could be
>> built over a key range scan.
>>
>> With a random partitioner, though, there's no api to iterate through
>> all existing keys.  Why not?
>>
>> Edmond
>>
>



-- 
Joe Van Dyk
http://fixieconsulting.com

Re: backing up data from cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
You can take a snapshot and either leave it in place indefinitely or
throw it into your existing backup ecosystem.  That's your best option
for backup no matter which kind of partitioner you're using.

-Jonathan

On Mon, Oct 5, 2009 at 12:52 AM, Edmond Lau <ed...@ooyala.com> wrote:
> For folks who are using or considering using cassandra in their
> production systems, what do you use for backups?
>
> With HBase, one could potentially write a mapreduce to perform a row
> scan of the entire table (restricted to some historical timestamp to
> get a consistent view) and export the data to hdfs.  With Cassandra,
> if you're using an ordered partitioner, a similar mechanism could be
> built over a key range scan.
>
> With a random partitioner, though, there's no api to iterate through
> all existing keys.  Why not?
>
> Edmond
>