Posted to user@cassandra.apache.org by Ben Standefer <be...@simplegeo.com> on 2010/04/14 01:14:44 UTC

EC2, XFS, and ec2-consistent-snapshot with Cassandra

On EC2, it is common and recommended (
http://developer.amazonwebservices.com/connect/entry.jspa?categoryID=100&externalID=1663)
to use XFS's freeze/thaw functionality to create near-online snapshots of an
EBS volume holding MySQL data.

"Besides being a stable, modern, high performance, journaling file system,
XFS supports file system freeze/thaw which is extremely useful for ensuring
a consistent state during EBS snapshots."
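
The freeze/snapshot/thaw cycle described there is short enough to script
directly; here is a minimal sketch (the volume ID and mount point are
placeholders, and the ec2-consistent-snapshot tool automates roughly these
same steps):

```shell
# Sketch only: freeze XFS, kick off the EBS snapshot, thaw.
# vol-xxxxxxxx and /var/lib/cassandra are placeholder values; needs root.
xfs_freeze -f /var/lib/cassandra                  # writes block; dirty data is flushed
ec2-create-snapshot vol-xxxxxxxx -d "cassandra $(date +%F)"
                                                  # returns immediately; the copy to S3
                                                  # continues in the background
xfs_freeze -u /var/lib/cassandra                  # thaw; blocked writers resume
```

The key property is that the snapshot call only has to be *initiated* while
the filesystem is frozen, which is why the unavailability window stays short.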

I've implemented this with MySQL before, and it worked extremely well (miles
beyond mysqldump or mysqlhotcopy).  On a given node, you sacrifice a short
period of availability (less than 0.5 seconds) to get a full, consistent
snapshot of your EBS volume that can be sent off to S3 in the background,
after the filesystem has unlocked and disk activity has resumed.  Has
anybody tried implementing this with a Cassandra cluster?  What are the
issues you ran into?  How did it compare with using Cassandra's "nodetool
snapshot"?
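
For reference, the nodetool route I'd be comparing against is just two calls
per node (flag syntax varies by Cassandra version; `-h` is assumed here):

```shell
# Built-in Cassandra snapshot: flush memtables, then hard-link the SSTables.
nodetool -h localhost flush      # force memtables to SSTables on disk
nodetool -h localhost snapshot   # hard-links SSTables under a snapshots/ directory
```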

I think I could do this on a running node with a 0.5 second timeout.  The
XFS docs state "Any process attempting to write to the frozen filesystem
will block waiting for the filesystem to be unfrozen."  Having writes
block on a node for <0.5s sounds like something Cassandra should handle
fine.

The Cassandra docs state "You can get an eventually consistent backup by
flushing all nodes and snapshotting; no individual node's backup is
guaranteed to be consistent but if you restore from that snapshot then
clients will get eventually consistent behavior as usual."  This led me to
believe that as long as I snapshot each node in the cluster within a
reasonable window (say 2 hours), I'd be able to bring the entire cluster
back with a guarantee that it is consistent up to the point where the
snapshot window began.
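
A sketch of what that cluster-wide window might look like (hostnames are made
up, and the per-node volume-ID file is an assumption; each iteration freezes,
snapshots, and thaws one node over SSH):

```shell
# Hypothetical rolling snapshot across the cluster; node names are placeholders.
for host in cass1 cass2 cass3; do
    ssh "$host" 'sudo xfs_freeze -f /var/lib/cassandra'
    # Assumed convention: each node records its own EBS volume ID in a file.
    ssh "$host" 'ec2-create-snapshot "$(cat /etc/ebs-volume-id)"'
    ssh "$host" 'sudo xfs_freeze -u /var/lib/cassandra'
done
```

Since each node is only frozen for the snapshot *initiation*, the whole loop
should finish well inside a 2-hour window even for a large cluster.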

I realize one of Cassandra's design goals is redundancy and high
availability.  I'm not worried about our entire infrastructure collapsing
and having to restore backups because of massive node failure.  I want
backups in case a bad logic bug in our app (i.e. messing up the timestamps)
or a bug in Cassandra itself deletes or corrupts data in our Cassandra
cluster.

-Ben

Re: EC2, XFS, and ec2-consistent-snapshot with Cassandra

Posted by Scott White <sc...@gmail.com>.
> I've implemented this with MySQL before, and it worked extremely well
> (miles beyond mysqldump or mysqlhotcopy).  On a given node, you sacrifice a
> short period of availability (less than 0.5 seconds) to get a full,
> consistent snapshot of your EBS volume that can be sent off to S3 in the
> background, after the filesystem has unlocked and disk activity has
> resumed.  Has anybody tried implementing this with a Cassandra cluster?
> What are the issues you ran into?


Yes, I have implemented this, and it works as one would expect. I would
recommend stopping one Cassandra JVM at a time and failing fast: if anything
happens to one node, the worst case (assuming you have replication set up)
is that you have to swap out that one node. That said, I haven't run into
any issues. With the instance taken down, freezing the filesystem is a
precautionary measure in case it accidentally gets restarted, etc.
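
The per-node sequence described above might look like this (the init script
path, mount point, and volume ID are all assumptions, not a prescribed
procedure):

```shell
# One node at a time: stop Cassandra, freeze as a precaution, snapshot, undo.
sudo /etc/init.d/cassandra stop            # stop the JVM first
sudo xfs_freeze -f /var/lib/cassandra      # guards against an accidental restart
                                           # writing mid-snapshot
ec2-create-snapshot vol-xxxxxxxx           # placeholder volume ID
sudo xfs_freeze -u /var/lib/cassandra
sudo /etc/init.d/cassandra start
```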



> How did it compare with using Cassandra's "nodetool snapshot"?
>

They both work well but serve different needs. EC2 snapshotting your nodes
makes sense if, for example, you run the entire cluster in the same
availability zone and you want to be able to restore from backup if the data
center has an outage. It's also handy for things like "cloning" your cluster,
since you can create a new cluster of the same size and load the data off
the EC2 snapshots. That being said, EC2 machines can be very unreliable in
terms of network latency, etc. so being tightly coupled to EC2 can be risky.
If you have a rack aware strategy with your cluster partitioned across
different availability zones, nodetool snapshot may be all you need.


>
> I think I could do this on a running node with a 0.5 second timeout.  The
> XFS docs state "Any process attempting to write to the frozen filesystem
> will block waiting for the filesystem to be unfrozen."  Having writes
> block on a node for <0.5s sounds like something Cassandra should handle
> fine.
>

If you just stop one instance at a time, it's a non-issue if you have say
RF=3 and CL=quorum. Reads and writes will just be redirected to other up
nodes while the node is down.


>
> The Cassandra docs state "You can get an eventually consistent backup by
> flushing all nodes and snapshotting; no individual node's backup is
> guaranteed to be consistent but if you restore from that snapshot then
> clients will get eventually consistent behavior as usual."  This led me to
> believe that as long as I snapshot each node in the cluster within a
> reasonable window (say 2 hours), I'd be able to bring the entire cluster
> back with a guarantee that it is consistent up to the point where the
> snapshot window began.
>

Right. One thing to keep in mind is that, depending on how much data you are
storing on each box, the EC2 snapshot may not finish within 2 hours. Also, if
you are using Cassandra's snapshot, you don't want to just keep snapshotting
without removing previous snapshots or moving their data off the disk, since
the disk will fill up quickly.
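
That housekeeping can be a couple of commands per node, sketched here with
example paths (the snapshot directory layout varies by Cassandra version, and
clearsnapshot assumes a version of nodetool that supports it):

```shell
# Copy the snapshot off-box, then reclaim the disk space it occupied.
tar czf /tmp/cassandra-snap.tgz /var/lib/cassandra/data/*/snapshots
                                          # example path; adjust to your data dirs
nodetool -h localhost clearsnapshot       # deletes the hard-linked snapshot dirs
```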