Posted to user@cassandra.apache.org by Sasha Dolgy <sd...@gmail.com> on 2011/03/09 16:27:42 UTC

Amazon EC2 & Cassandra to ebs or not..

Hi Everyone,

Now that I'm past the problems of IP addresses changing ... I am onto the
idea of storage.  Initially I had thought that for each cassandra instance, I
should have an EBS volume to store all the cassandra data / information.
Now I'm starting to wonder if this is duplication and not necessary.  If an
instance dies, I lose anything that's not attached to EBS.  However, if the
cassandra cluster is healthy ... this shouldn't be an issue ... Is this a
correct assumption?

-sd

-- 
Sasha Dolgy
sasha.dolgy@gmail.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Jeremy Hanna <je...@gmail.com>.
I've seen both sides but Cassandra does handle replication and bringing data back is a matter of bootstrapping a node to replace the downed node.  

One thing to consider is availability zones and regions though.  What happens if your entire cluster goes down in the case of a single datacenter going offline?  From what I understand ec2 availability zones are equivalent to physical datacenters so going across availability zones will handle an entire datacenter going down.  Regions are another level of safeguarding against this.  Anyway, just some thoughts.

Some considerations are also found in the Cloud section of this page: http://wiki.apache.org/cassandra/CassandraHardware

On Mar 9, 2011, at 9:57 AM, Sasha Dolgy wrote:

> 
> well, this is what i'm getting at.  why would you want to back it up if the cluster is working properly?  backup is silly.... ; )
> 
> On Wed, Mar 9, 2011 at 4:54 PM, William Oberman <ob...@civicscience.com> wrote:
> I'm considering similar issues right now.  The problem with ephemeral storage is I don't know an easy way to back it up, while on an EBS it's a simple snapshot API call.
> 
> Otherwise, I believe the performance of the ephemeral (certainly in the case of large or greater, where you can RAID0 multiple disks) is way better than EBS.
> 
> will
> 


Re: Amazon EC2 & Cassandra to ebs or not..

Posted by William Oberman <ob...@civicscience.com>.
This is excellent, specific feedback.  Thanks!

Given the relative costs, I was hoping L was the optimal tradeoff vs XL, but
if that's the best option, that's the best option.

will

On Wed, Mar 9, 2011 at 12:04 PM, Erik Onnen <eo...@gmail.com> wrote:

> I'd recommend not storing commit logs or data files on EBS volumes if
> your machines are under any decent amount of load. I say that for
> three reasons.
>
> First, both EBS volumes contend directly for network throughput with
> what appears to be a peer QoS policy to standard packets. In other
> words, if you're saturating a network link, EBS throughput falls. The
> same has not been true of ephemeral volumes in all of our testing,
> ephemeral I/O speeds tend to only take a minor hit under network
> pressure and are consistently faster in raw speed tests.
>
> Second, at some point it's a given that you will encounter misbehaving
> EBS volumes. They won't completely fail, worse they will just get
> really, really slow. Often times this is worse than a total failure
> because the system just back piles reads/writes but doesn't totally
> fall over until the entire cluster becomes overwhelmed. We've never
> had single volume ephemeral problems.
>
> Lastly, I think people have a tendency to bolt a large number of EBS
> volumes to a host and think that because they have disk capacity they
> serve more data from fewer hosts. If you push that too far, you'll
> outstrip the ability of the system to keep effective buffer caches and
> concurrently serve requests for all the data it is responsible for
> managing. IME there is pretty good parity between an EC2 XL and the
> ephemeral disks available relative to how Cassandra uses disk and RAM
> that adding more storage is right at the breaking point of over
> committing your hardware.
>
> If you want protection from AZ failure, split your ring across AZs
> (Cassandra is quite good at this) or copy snapshots to EBS volumes.
>
> -erik
>
> There are a lot of benefits to EBS volumes, I/O throughput and
> reliability are not among those benefits.
>
> On Wed, Mar 9, 2011 at 8:39 AM, William Oberman
> <ob...@civicscience.com> wrote:
> > I thought nodetool snapshot writes the snapshot locally, requiring 2x of
> > expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
> > snapshot).  By that I mean EBS allocation is GB allocated per month costs
> at
> > one rate, and EBS snapshots are delta compressed copies to S3.
> >
> > Can you point the snapshot to an external filesystem?
> >
> > will
> >
> > On Wed, Mar 9, 2011 at 11:31 AM, Sasha Dolgy <sd...@gmail.com> wrote:
> >>
> >> Could you not nodetool snapshot the data into an mounted ebs/s3 bucket
> and
> >> satisfy your development requirement?
> >> -sd
> >>
> >> On Wed, Mar 9, 2011 at 5:23 PM, William Oberman <
> oberman@civicscience.com>
> >> wrote:
> >>>
> >>> For me, to transition production data into a development environment
> for
> >>> real world testing.  Also, backups are never a bad idea, though I agree
> most
> >>> all risk is mitigated due to cassandra's design.
> >>>
> >>> will
> >
> >
> >
> > --
> > Will Oberman
> > Civic Science, Inc.
> > 3030 Penn Avenue., First Floor
> > Pittsburgh, PA 15201
> > (M) 412-480-7835
> > (E) oberman@civicscience.com
> >
>



-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) oberman@civicscience.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Erik Onnen <eo...@gmail.com>.
I'd recommend not storing commit logs or data files on EBS volumes if
your machines are under any decent amount of load. I say that for
three reasons.

First, both EBS volumes contend directly for network throughput with
what appears to be a peer QoS policy to standard packets. In other
words, if you're saturating a network link, EBS throughput falls. The
same has not been true of ephemeral volumes in all of our testing;
ephemeral I/O speeds tend to only take a minor hit under network
pressure and are consistently faster in raw speed tests.

Second, at some point it's a given that you will encounter misbehaving
EBS volumes. They won't completely fail; worse, they will just get
really, really slow. Oftentimes this is worse than a total failure
because the system just backs up reads/writes but doesn't totally
fall over until the entire cluster becomes overwhelmed. We've never
had single volume ephemeral problems.

Lastly, I think people have a tendency to bolt a large number of EBS
volumes to a host and think that because they have disk capacity they
can serve more data from fewer hosts. If you push that too far, you'll
outstrip the ability of the system to keep effective buffer caches and
concurrently serve requests for all the data it is responsible for
managing. IME an EC2 XL and its ephemeral disks are well matched to
how Cassandra uses disk and RAM, so adding more storage puts you right
at the breaking point of overcommitting your hardware.

If you want protection from AZ failure, split your ring across AZs
(Cassandra is quite good at this) or copy snapshots to EBS volumes.
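A rough way to picture splitting the ring across AZs is the sketch below. It is only an illustration under stated assumptions: tokens are spaced evenly over the RandomPartitioner range, zones are interleaved around the ring, and replicas are taken to be the owning node plus the next nodes clockwise (a simple rack-unaware placement). The zone names and the helpers `build_ring` and `replica_zones` are hypothetical.

```python
RING_SIZE = 2 ** 127  # token space of the RandomPartitioner


def build_ring(zones, nodes_per_zone):
    """Return [(token, zone), ...] with evenly spaced tokens and zones interleaved."""
    total = len(zones) * nodes_per_zone
    return [(i * RING_SIZE // total, zones[i % len(zones)]) for i in range(total)]


def replica_zones(ring, owner_index, replication_factor):
    """Zones holding a range: its owner plus the next RF-1 nodes clockwise (assumed placement)."""
    return [ring[(owner_index + k) % len(ring)][1] for k in range(replication_factor)]


# Hypothetical layout: two zones, three nodes each, replication factor 2.
ring = build_ring(["zone-a", "zone-b"], nodes_per_zone=3)

# True when every range has replicas in more than one zone,
# so losing a single zone still leaves one copy of each range.
survives_zone_loss = all(
    len(set(replica_zones(ring, i, replication_factor=2))) > 1
    for i in range(len(ring))
)
print(survives_zone_loss)
```

With zones alternating around the ring, any two neighbouring nodes sit in different zones, which is why the check passes for a replication factor of 2 under this placement model.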

-erik

There are a lot of benefits to EBS volumes, but I/O throughput and
reliability are not among those benefits.

On Wed, Mar 9, 2011 at 8:39 AM, William Oberman
<ob...@civicscience.com> wrote:
> I thought nodetool snapshot writes the snapshot locally, requiring 2x of
> expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
> snapshot).  By that I mean EBS allocation is GB allocated per month costs at
> one rate, and EBS snapshots are delta compressed copies to S3.
>
> Can you point the snapshot to an external filesystem?
>
> will
>
> On Wed, Mar 9, 2011 at 11:31 AM, Sasha Dolgy <sd...@gmail.com> wrote:
>>
>> Could you not nodetool snapshot the data into an mounted ebs/s3 bucket and
>> satisfy your development requirement?
>> -sd
>>
>> On Wed, Mar 9, 2011 at 5:23 PM, William Oberman <ob...@civicscience.com>
>> wrote:
>>>
>>> For me, to transition production data into a development environment for
>>> real world testing.  Also, backups are never a bad idea, though I agree most
>>> all risk is mitigated due to cassandra's design.
>>>
>>> will
>
>
>
> --
> Will Oberman
> Civic Science, Inc.
> 3030 Penn Avenue., First Floor
> Pittsburgh, PA 15201
> (M) 412-480-7835
> (E) oberman@civicscience.com
>

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Dave Viner <da...@gmail.com>.
Sasha,

You might also check out http://coreyhulen.org/category/cassandra/ for speed
tests done by Corey Hulen on different disk configurations (both inside EC2
and on real hardware).

If you write to the ephemeral storage on an EC2 instance, there is no
additional cost for the data written.  EBS is mostly similar: you pay for
the disk size you allocate, plus a tiny additional charge for I/O
(currently $0.10 per 1M I/O requests).
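The arithmetic can be sketched as below. Only the $0.10 per 1M I/O requests figure comes from this message; the per-GB rate is not stated in the thread, so it is a required argument here, and the value in the usage line is a made-up placeholder. The function name `ebs_monthly_cost` is hypothetical.

```python
def ebs_monthly_cost(allocated_gb, io_requests, gb_month_rate, io_rate_per_million=0.10):
    """Estimate a monthly EBS bill: allocated size plus a per-request I/O charge.

    gb_month_rate is an assumption supplied by the caller; the thread gives no figure.
    io_rate_per_million defaults to the $0.10 per 1M requests quoted above.
    """
    storage_charge = allocated_gb * gb_month_rate
    io_charge = io_requests / 1_000_000 * io_rate_per_million
    return storage_charge + io_charge


# I/O charge alone for 50 million requests (storage rate set to zero).
io_only = ebs_monthly_cost(allocated_gb=0, io_requests=50_000_000, gb_month_rate=0.0)

# Placeholder storage rate, purely for illustration.
example_total = ebs_monthly_cost(allocated_gb=100, io_requests=50_000_000, gb_month_rate=0.25)
print(io_only, example_total)
```

The point of separating the two terms is that the allocation charge applies whether or not the volume is busy, while the I/O term scales with load.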

HTH,

Dave Viner


On Wed, Mar 9, 2011 at 8:48 AM, Sasha Dolgy <sd...@gmail.com> wrote:

> Hi Will,
>
> http://wiki.apache.org/cassandra/Operations#Backing_up_data
>
> If the
> snapshot is written to the ephemeral storage ... there isn't a cost. (i need
> to confirm that)
>
> You can then move this to an S3 bucket with RDS if you want or full
> 99.999999999% redundancy and have it available to developers
>
> This is what I had in my head....
> -sd
>
>
> On Wed, Mar 9, 2011 at 5:39 PM, William Oberman <ob...@civicscience.com> wrote:
>
>> I thought nodetool snapshot writes the snapshot locally, requiring 2x of
>> expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
>> snapshot).  By that I mean EBS allocation is GB allocated per month costs at
>> one rate, and EBS snapshots are delta compressed copies to S3.
>>
>> Can you point the snapshot to an external filesystem?
>>
>> will
>>
>>

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Jonathan Ellis <jb...@gmail.com>.
Right, local snapshot is no-cost both from an EC2 pricing standpoint
and from a disk usage standpoint (because it uses hard links).
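To see why a hard-linked snapshot takes no extra space, here is a small sketch. It does not call Cassandra at all; it only shows the filesystem behaviour on a POSIX system, and the file name and the helper `snapshot_by_hard_link` are made up for the illustration.

```python
import os
import tempfile


def snapshot_by_hard_link(data_file, snapshot_dir):
    """Create a hard link to data_file inside snapshot_dir and return its path."""
    os.makedirs(snapshot_dir, exist_ok=True)
    link_path = os.path.join(snapshot_dir, os.path.basename(data_file))
    os.link(data_file, link_path)  # new directory entry, same inode, no data copied
    return link_path


with tempfile.TemporaryDirectory() as root:
    data_file = os.path.join(root, "demo-Data.db")  # hypothetical file name
    with open(data_file, "wb") as f:
        f.write(b"x" * 4096)

    link = snapshot_by_hard_link(data_file, os.path.join(root, "snapshots", "demo"))

    original, linked = os.stat(data_file), os.stat(link)
    same_inode = original.st_ino == linked.st_ino  # both names point at one inode
    link_count = original.st_nlink                 # two names now reference the data

print(same_inode, link_count)
```

Both paths refer to the same blocks on disk, so the snapshot only adds a directory entry until the original file is removed.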

On Wed, Mar 9, 2011 at 10:48 AM, Sasha Dolgy <sd...@gmail.com> wrote:
> Hi Will,
> http://wiki.apache.org/cassandra/Operations#Backing_up_data
> If the snapshot is written to the ephemeral storage ... there isn't a cost.
> (i need to confirm that)
> You can then move this to an S3 bucket with RDS if you want or full
> 99.999999999% redundancy and have it available to developers
> This is what I had in my head....
> -sd
>
> On Wed, Mar 9, 2011 at 5:39 PM, William Oberman <ob...@civicscience.com>
> wrote:
>>
>> I thought nodetool snapshot writes the snapshot locally, requiring 2x of
>> expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
>> snapshot).  By that I mean EBS allocation is GB allocated per month costs at
>> one rate, and EBS snapshots are delta compressed copies to S3.
>>
>> Can you point the snapshot to an external filesystem?
>>
>> will
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by William Oberman <ob...@civicscience.com>.
Based on Erik's email, it sounds like EBS is a no go from the start.  But
given your snapshot feedback, it seems like you have to plan on leaving
slack on every disk, and the % of slack depends on the size of a snapshot
relative to the data (given the snapshot shares the disk with the data, at
least temporarily).
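One way to reason about how much slack a snapshot needs: a hard-linked snapshot file costs nothing while the live file still exists, and only starts to occupy space of its own once the live copy is deleted (for example after compaction). The sketch below is a filesystem-level illustration on a POSIX system, not a Cassandra tool; `pinned_bytes` and the file names are hypothetical.

```python
import os
import tempfile


def pinned_bytes(snapshot_dir):
    """Sum the sizes of snapshot files whose data is no longer shared with a live file."""
    total = 0
    for entry in os.scandir(snapshot_dir):
        if entry.is_file():
            st = os.stat(entry.path)
            if st.st_nlink == 1:  # the snapshot holds the only remaining link
                total += st.st_size
    return total


with tempfile.TemporaryDirectory() as root:
    live = os.path.join(root, "demo-Data.db")
    snap_dir = os.path.join(root, "snapshots", "demo")
    os.makedirs(snap_dir)
    with open(live, "wb") as f:
        f.write(b"x" * 1000)
    os.link(live, os.path.join(snap_dir, "demo-Data.db"))

    before = pinned_bytes(snap_dir)  # data is still shared with the live file
    os.remove(live)                  # stand-in for compaction deleting the original
    after = pinned_bytes(snap_dir)   # the snapshot now keeps the data alive by itself

print(before, after)
```

So the slack to plan for is roughly the size of whatever snapshot data outlives its live counterpart, which grows the longer a snapshot is kept around.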

will

On Wed, Mar 9, 2011 at 11:59 AM, Sasha Dolgy <sd...@gmail.com> wrote:

> Hi will,
>
> Quickly did a snapshot:
>
> nodetool -h 10.0.0.2 -p 8080 snapshot 09032011
>
> The snapshots end up in the data dir for cassandra.  The default is
> /var/lib/cassandra/data/<keyspace>/snapshots/
>
> In this directory i have:  1299689801925-09032011
>
> -sd
>
>
> On Wed, Mar 9, 2011 at 5:54 PM, William Oberman <ob...@civicscience.com> wrote:
>
>> I haven't done backups yet, so I don't know where the data is written.  Is
>> it where the nodetool is run from?  Or local to the instance running
>> cassandra (and there, local to the data directory?).  I assumed it was the
>> latter (not finding docs on that yet), and that would require 2x storage
>> allocated on that instance for 1x data (to have room for the snapshot).  If
>> its the former, then yes, I'd totally run the command from an ephemeral
>> store, and backup to S3.
>>
>> will
>>
>>
>> On Wed, Mar 9, 2011 at 11:48 AM, Sasha Dolgy <sd...@gmail.com> wrote:
>>
>>> Hi Will,
>>>
>>> http://wiki.apache.org/cassandra/Operations#Backing_up_data
>>>
>>> If the
>>> snapshot is written to the ephemeral storage ... there isn't a cost. (i need
>>> to confirm that)
>>>
>>> You can then move this to an S3 bucket with RDS if you want or full
>>> 99.999999999% redundancy and have it available to developers
>>>
>>> This is what I had in my head....
>>> -sd
>>>
>>>
>>> On Wed, Mar 9, 2011 at 5:39 PM, William Oberman <
>>> oberman@civicscience.com> wrote:
>>>
>>>> I thought nodetool snapshot writes the snapshot locally, requiring 2x of
>>>> expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
>>>> snapshot).  By that I mean EBS allocation is GB allocated per month costs at
>>>> one rate, and EBS snapshots are delta compressed copies to S3.
>>>>
>>>> Can you point the snapshot to an external filesystem?
>>>>
>>>> will
>>>>
>>>>
>>
>>
>> --
>> Will Oberman
>> Civic Science, Inc.
>> 3030 Penn Avenue., First Floor
>> Pittsburgh, PA 15201
>> (M) 412-480-7835
>> (E) oberman@civicscience.com
>>
>
>
>
> --
> Sasha Dolgy
> sasha.dolgy@gmail.com
>



-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) oberman@civicscience.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Sasha Dolgy <sd...@gmail.com>.
Hi will,

Quickly did a snapshot:

nodetool -h 10.0.0.2 -p 8080 snapshot 09032011

The snapshots end up in the data dir for cassandra.  The default is
/var/lib/cassandra/data/<keyspace>/snapshots/

In this directory i have:  1299689801925-09032011
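If you want to script against these directories, the name above looks like `<timestamp>-<tag>`. The sketch below parses it on the assumption that the leading number is milliseconds since the Unix epoch; that reading is inferred from this one example and is not confirmed anywhere in the thread. `parse_snapshot_name` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class SnapshotName(NamedTuple):
    taken_at: datetime
    tag: str


def parse_snapshot_name(dirname):
    """Split '<epoch-millis>-<tag>' into a UTC timestamp and the tag (assumed layout)."""
    millis_text, sep, tag = dirname.partition("-")
    if not sep or not millis_text.isdigit():
        raise ValueError(f"unexpected snapshot directory name: {dirname!r}")
    millis = int(millis_text)
    # Integer arithmetic avoids float rounding on the millisecond part.
    taken_at = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    taken_at += timedelta(milliseconds=millis % 1000)
    return SnapshotName(taken_at, tag)


parsed = parse_snapshot_name("1299689801925-09032011")
print(parsed.taken_at.isoformat(), parsed.tag)
```

Under that assumption the timestamp falls on 9 March 2011, which matches the date the snapshot command was run.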

-sd

On Wed, Mar 9, 2011 at 5:54 PM, William Oberman <ob...@civicscience.com> wrote:

> I haven't done backups yet, so I don't know where the data is written.  Is
> it where the nodetool is run from?  Or local to the instance running
> cassandra (and there, local to the data directory?).  I assumed it was the
> latter (not finding docs on that yet), and that would require 2x storage
> allocated on that instance for 1x data (to have room for the snapshot).  If
> its the former, then yes, I'd totally run the command from an ephemeral
> store, and backup to S3.
>
> will
>
>
> On Wed, Mar 9, 2011 at 11:48 AM, Sasha Dolgy <sd...@gmail.com> wrote:
>
>> Hi Will,
>>
>> http://wiki.apache.org/cassandra/Operations#Backing_up_data
>>
>> If the
>> snapshot is written to the ephemeral storage ... there isn't a cost. (i need
>> to confirm that)
>>
>> You can then move this to an S3 bucket with RDS if you want or full
>> 99.999999999% redundancy and have it available to developers
>>
>> This is what I had in my head....
>> -sd
>>
>>
>> On Wed, Mar 9, 2011 at 5:39 PM, William Oberman <oberman@civicscience.com
>> > wrote:
>>
>>> I thought nodetool snapshot writes the snapshot locally, requiring 2x of
>>> expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
>>> snapshot).  By that I mean EBS allocation is GB allocated per month costs at
>>> one rate, and EBS snapshots are delta compressed copies to S3.
>>>
>>> Can you point the snapshot to an external filesystem?
>>>
>>> will
>>>
>>>
>
>
> --
> Will Oberman
> Civic Science, Inc.
> 3030 Penn Avenue., First Floor
> Pittsburgh, PA 15201
> (M) 412-480-7835
> (E) oberman@civicscience.com
>



-- 
Sasha Dolgy
sasha.dolgy@gmail.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by William Oberman <ob...@civicscience.com>.
I haven't done backups yet, so I don't know where the data is written.  Is
it where the nodetool is run from?  Or local to the instance running
cassandra (and there, local to the data directory?).  I assumed it was the
latter (not finding docs on that yet), and that would require 2x storage
allocated on that instance for 1x data (to have room for the snapshot).  If
it's the former, then yes, I'd totally run the command from an ephemeral
store, and back up to S3.

will

On Wed, Mar 9, 2011 at 11:48 AM, Sasha Dolgy <sd...@gmail.com> wrote:

> Hi Will,
>
> http://wiki.apache.org/cassandra/Operations#Backing_up_data
>
> If the
> snapshot is written to the ephemeral storage ... there isn't a cost. (i need
> to confirm that)
>
> You can then move this to an S3 bucket with RDS if you want or full
> 99.999999999% redundancy and have it available to developers
>
> This is what I had in my head....
> -sd
>
>
> On Wed, Mar 9, 2011 at 5:39 PM, William Oberman <ob...@civicscience.com> wrote:
>
>> I thought nodetool snapshot writes the snapshot locally, requiring 2x of
>> expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
>> snapshot).  By that I mean EBS allocation is GB allocated per month costs at
>> one rate, and EBS snapshots are delta compressed copies to S3.
>>
>> Can you point the snapshot to an external filesystem?
>>
>> will
>>
>>


-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) oberman@civicscience.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Frank LoVecchio <fr...@isidorey.com>.
>
> Now that I'm past the problems of IP addresses changing ... I am onto the
> idea of storage.  Initially I had thought that for each cassandra instance, I
> should have an EBS volume to store all the cassandra data / information.
> Now I'm starting to wonder if this is duplication and not necessary.  If an
> instance dies, I lose anything that's not attached to EBS.  However, if the
> cassandra cluster is healthy ... this shouldn't be an issue ... Is this a
> correct assumption?


Correct.  Why not use EBS backed instances?  The ability to reboot comes in
handy.  I have a cluster of 6 nodes, each with an EBS drive of data (EBS
drives can scale, if you need them to - not advised).  Bootstrapping has
always worked better for me than doing any sort of data snapshotting,
allowing nodes to come in and out with proper token management.  You can
attach S3 buckets as drives as well...

On Wed, Mar 9, 2011 at 9:48 AM, Sasha Dolgy <sd...@gmail.com> wrote:

> Hi Will,
>
> http://wiki.apache.org/cassandra/Operations#Backing_up_data
>
> If the
> snapshot is written to the ephemeral storage ... there isn't a cost. (i need
> to confirm that)
>
> You can then move this to an S3 bucket with RDS if you want or full
> 99.999999999% redundancy and have it available to developers
>
> This is what I had in my head....
> -sd
>
>
> On Wed, Mar 9, 2011 at 5:39 PM, William Oberman <ob...@civicscience.com> wrote:
>
>> I thought nodetool snapshot writes the snapshot locally, requiring 2x of
>> expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
>> snapshot).  By that I mean EBS allocation is GB allocated per month costs at
>> one rate, and EBS snapshots are delta compressed copies to S3.
>>
>> Can you point the snapshot to an external filesystem?
>>
>> will
>>
>>


-- 
Frank LoVecchio
isidorey.com | facebook.com/franklovecchio | franklovecchio.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Sasha Dolgy <sd...@gmail.com>.
Hi Will,

http://wiki.apache.org/cassandra/Operations#Backing_up_data

If the snapshot
is written to the ephemeral storage ... there isn't a cost. (i need to
confirm that)

You can then move this to an S3 bucket with RDS if you want or full
99.999999999% redundancy and have it available to developers
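As a rough sketch of the first half of that idea, the snippet below packs a snapshot directory into a single compressed archive that could then be copied off the instance. The upload itself is left out, since no tool for it is named in this thread; `archive_snapshot` and the file names in the demo are hypothetical, and the demo builds a throwaway directory rather than touching a real data dir.

```python
import os
import tarfile
import tempfile


def archive_snapshot(snapshot_dir, archive_path):
    """Pack snapshot_dir into a gzip-compressed tar file and return the archive path."""
    top_level = os.path.basename(os.path.normpath(snapshot_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(snapshot_dir, arcname=top_level)  # keep only the snapshot's own name
    return archive_path


with tempfile.TemporaryDirectory() as root:
    # Stand-in for a snapshot directory; the contents are dummy data.
    snap_dir = os.path.join(root, "snapshots", "1299689801925-09032011")
    os.makedirs(snap_dir)
    with open(os.path.join(snap_dir, "demo-Data.db"), "wb") as f:
        f.write(b"x" * 1024)

    # Write the archive outside the directory being archived.
    archive = archive_snapshot(snap_dir, os.path.join(root, "snapshot.tar.gz"))
    with tarfile.open(archive, "r:gz") as tar:
        archived_names = tar.getnames()

print(archived_names)
```

Note that the archive is a real copy of the data, so unlike the hard-linked snapshot it does need free space (or a different volume) while it is being written.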

This is what I had in my head....
-sd

On Wed, Mar 9, 2011 at 5:39 PM, William Oberman <ob...@civicscience.com> wrote:

> I thought nodetool snapshot writes the snapshot locally, requiring 2x of
> expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
> snapshot).  By that I mean EBS allocation is GB allocated per month costs at
> one rate, and EBS snapshots are delta compressed copies to S3.
>
> Can you point the snapshot to an external filesystem?
>
> will
>
>

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by William Oberman <ob...@civicscience.com>.
I thought nodetool snapshot writes the snapshot locally, requiring 2x of
expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
snapshot).  By that I mean EBS allocation is GB allocated per month costs at
one rate, and EBS snapshots are delta compressed copies to S3.

Can you point the snapshot to an external filesystem?

will

On Wed, Mar 9, 2011 at 11:31 AM, Sasha Dolgy <sd...@gmail.com> wrote:

>
> Could you not nodetool snapshot the data into an mounted ebs/s3 bucket and
> satisfy your development requirement?
> -sd
>
>
> On Wed, Mar 9, 2011 at 5:23 PM, William Oberman <ob...@civicscience.com> wrote:
>
>> For me, to transition production data into a development environment for
>> real world testing.  Also, backups are never a bad idea, though I agree most
>> all risk is mitigated due to cassandra's design.
>>
>> will
>>
>>


-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) oberman@civicscience.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Sasha Dolgy <sd...@gmail.com>.
Could you not nodetool snapshot the data into a mounted ebs/s3 bucket and
satisfy your development requirement?
-sd

On Wed, Mar 9, 2011 at 5:23 PM, William Oberman <ob...@civicscience.com> wrote:

> For me, to transition production data into a development environment for
> real world testing.  Also, backups are never a bad idea, though I agree most
> all risk is mitigated due to cassandra's design.
>
> will
>
>

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by William Oberman <ob...@civicscience.com>.
For me, to transition production data into a development environment for
real world testing.  Also, backups are never a bad idea, though I agree most
all risk is mitigated due to cassandra's design.

will

On Wed, Mar 9, 2011 at 10:57 AM, Sasha Dolgy <sd...@gmail.com> wrote:

>
> well, this is what i'm getting at.  why would you want to back it up if the
> cluster is working properly?  backup is silly.... ; )
>
>
> On Wed, Mar 9, 2011 at 4:54 PM, William Oberman <ob...@civicscience.com> wrote:
>
>> I'm considering similar issues right now.  The problem with ephemeral
>> storage is I don't know an easy way to back it up, while on an EBS it's a
>> simple snapshot API call.
>>
>> Otherwise, I believe the performance of the ephemeral (certainly in the
>> case of large or greater, where you can RAID0 multiple disks) is way better
>> than EBS.
>>
>> will
>>
>>


-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) oberman@civicscience.com

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by Sasha Dolgy <sd...@gmail.com>.
well, this is what i'm getting at.  why would you want to back it up if the
cluster is working properly?  backup is silly.... ; )

On Wed, Mar 9, 2011 at 4:54 PM, William Oberman <ob...@civicscience.com> wrote:

> I'm considering similar issues right now.  The problem with ephemeral
> storage is I don't know an easy way to back it up, while on an EBS it's a
> simple snapshot API call.
>
> Otherwise, I believe the performance of the ephemeral (certainly in the
> case of large or greater, where you can RAID0 multiple disks) is way better
> than EBS.
>
> will
>
>

Re: Amazon EC2 & Cassandra to ebs or not..

Posted by William Oberman <ob...@civicscience.com>.
I'm considering similar issues right now.  The problem with ephemeral
storage is I don't know an easy way to back it up, while on an EBS it's a
simple snapshot API call.

Otherwise, I believe the performance of the ephemeral (certainly in the case
of large or greater, where you can RAID0 multiple disks) is way better than
EBS.

will

On Wed, Mar 9, 2011 at 10:27 AM, Sasha Dolgy <sd...@gmail.com> wrote:

> Hi Everyone,
>
> Now that I'm past the problems of IP addresses changing ... I am onto the
> idea of storage.  Initially I had thought that for each cassandra instance, I
> should have an EBS volume to store all the cassandra data / information.
> Now I'm starting to wonder if this is duplication and not necessary.  If an
> instance dies, I lose anything that's not attached to EBS.  However, if the
> cassandra cluster is healthy ... this shouldn't be an issue ... Is this a
> correct assumption?
>
> -sd
>
> --
> Sasha Dolgy
> sasha.dolgy@gmail.com
>



-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) oberman@civicscience.com