Posted to user@hbase.apache.org by Peter Haidinyak <ph...@local.com> on 2011/02/07 20:25:25 UTC

Amazon EC2

Hi,
	We are looking at moving our cluster to Amazon's EC2 solution. Has anybody out there already done this, or tried to, and do you have any recommendations or warnings?

Thanks

-Pete

Re: Amazon EC2

Posted by Mark Kerzner <ma...@gmail.com>.
It worked fine for me, with the Cloudera distribution, about a year ago.

I would keep the prepared private AMI, with whatever additions go into it,
and start the cluster from that. The AMI served as my release version. The
cluster had occasional problems starting, at least back then, due to
networking, but once it was up, it worked great. Of course, you need a
provision to store the HDFS data when the cluster goes down - most likely in
S3, or just always store it in S3, depending on your requirements.

Mark

On Mon, Feb 7, 2011 at 1:25 PM, Peter Haidinyak <ph...@local.com> wrote:

> Hi,
>        We are looking at moving our cluster to Amazon's EC2 solution. Has
> anybody out there already done this or tried and would you have any
> recommendations/warning?
>
> Thanks
>
> -Pete
>

Re: Amazon EC2

Posted by Gary Helmling <gh...@gmail.com>.
I think Jon and Ryan have covered the key points here.

I just want to reiterate that the really valuable aspect of EC2 is the
"elastic" part of the name.  It's great for spinning up a cluster for
testing or batch data processing, without dedicated hardware.  Having the
ability to launch 100 servers on demand is very powerful!  And in these
cases, the economics of EC2 pricing work greatly in your favor.

Where EC2 makes less sense, though, is when you're running an always-on,
24x7 cluster (probably the most frequent scenario for HBase deployments).
You still avoid the up-front capital expenditure for hardware, but the
monthly cost (especially for HBase where you need to use the larger and more
costly instance types) will quickly overtake the cost of the hardware.  And
at the same time you'll be incurring a performance penalty due to the
virtualized IO and contention for resources with other subscribers.

So do some up-front cost calculations based on your expected usage and
service lifetime.  And be aware of the performance penalty and additional
operational complications.
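
As a rough illustration of that kind of up-front calculation, here is a
minimal sketch in Python. The function names and every number in it are
made-up placeholders, not real EC2 or hardware prices; plug in your own
quotes. It also ignores the performance penalty, staffing, and hardware
refresh cycles, so treat the result as a first approximation only.

```python
def ec2_monthly_cost(nodes, hourly_rate, hours_per_day=24.0, days_per_month=30.0):
    """Estimated monthly EC2 bill for `nodes` instances at `hourly_rate` each.

    hours_per_day and days_per_month describe how much of the month the
    cluster is actually running (the defaults model an always-on cluster).
    """
    return nodes * hourly_rate * hours_per_day * days_per_month


def breakeven_months(hardware_cost, hosting_per_month, ec2_per_month):
    """Months after which cumulative EC2 spend overtakes owning the hardware.

    hardware_cost is the one-time purchase price; hosting_per_month is the
    recurring cost of running owned hardware (power, rack space, bandwidth).
    Returns None when EC2 is not more expensive per month, i.e. it never
    overtakes the owned setup under this simple model.
    """
    monthly_gap = ec2_per_month - hosting_per_month
    if monthly_gap <= 0:
        return None
    return hardware_cost / monthly_gap


if __name__ == "__main__":
    # Placeholder inputs: 10 nodes at a hypothetical 1.00 per instance-hour.
    always_on = ec2_monthly_cost(nodes=10, hourly_rate=1.0)
    print("always-on monthly cost:", always_on)  # 7200.0

    # Hypothetical owned cluster: 50000 up front, 2200 per month to host.
    print("break-even (months):", breakeven_months(50000, 2200, always_on))  # 10.0

    # Elastic use: 100 nodes for 2 hours a day, 5 days a month.
    burst = ec2_monthly_cost(nodes=100, hourly_rate=1.0,
                             hours_per_day=2, days_per_month=5)
    print("burst monthly cost:", burst)  # 1000.0
```

With these placeholder inputs the always-on cluster overtakes the owned
hardware after 10 months, while the occasional 100-node burst stays far
cheaper than either, which matches the pattern described above.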

--gh

On Mon, Feb 7, 2011 at 11:39 AM, Ryan Rawson <ry...@gmail.com> wrote:

> There are other virtualized environments that offer better perf/$,
> such as SoftLayer, Rackspace Cloud, and more.
>
> EC2 is popular... and hence oversubscribed.  People complain about IO
> perf, and while it's not as bad as some people claim, you have to be
> aware that EC2 isn't some magical land where things work great. There
> are lots of gotchas, slower machines, cluster, etc. Running a high
> performance database on low performance systems will end up with a low
> performance database, so you might want to check those expectations at
> the door.
>
> Good luck!
> -ryan
>
> On Mon, Feb 7, 2011 at 11:35 AM, Jonathan Gray <jg...@fb.com> wrote:
> > There are others who have had far more experience than I have with HBase
> > + EC2, so will let them chime in.  But I personally recommend against this
> > direction if you expect to have a consistent cluster size and/or a
> > significant amount of load.
> >
> > EC2 is great at quickly scaling up/down, but is usually not cost
> > effective if you're running a cluster of a fixed set of nodes 24/7.
> >
> > EC2 also generally experiences far worse IO performance than dedicated
> > hardware, so with any significant load, performance suffers on EC2.
> >
> > In addition, EC2 presents its own operational pains and availability
> > issues.  Users on EC2 generally have more problems than those with their own
> > setups.
> >
> > JG
> >
> >> -----Original Message-----
> >> From: Peter Haidinyak [mailto:phaidinyak@local.com]
> >> Sent: Monday, February 07, 2011 11:25 AM
> >> To: user@hbase.apache.org
> >> Subject: Amazon EC2
> >>
> >> Hi,
> >>       We are looking at moving our cluster to Amazon's EC2 solution. Has
> >> anybody out there already done this or tried and would you have any
> >> recommendations/warning?
> >>
> >> Thanks
> >>
> >> -Pete
> >
>

Re: Amazon EC2

Posted by Ryan Rawson <ry...@gmail.com>.
There are other virtualized environments that offer better perf/$,
such as SoftLayer, Rackspace Cloud, and more.

EC2 is popular... and hence oversubscribed.  People complain about IO
perf, and while it's not as bad as some people claim, you have to be
aware that EC2 isn't some magical land where things work great. There
are lots of gotchas, slower machines, cluster, etc. Running a high
performance database on low performance systems will end up with a low
performance database, so you might want to check those expectations at
the door.

Good luck!
-ryan

On Mon, Feb 7, 2011 at 11:35 AM, Jonathan Gray <jg...@fb.com> wrote:
> There are others who have had far more experience than I have with HBase + EC2, so will let them chime in.  But I personally recommend against this direction if you expect to have a consistent cluster size and/or a significant amount of load.
>
> EC2 is great at quickly scaling up/down, but is usually not cost effective if you're running a cluster of a fixed set of nodes 24/7.
>
> EC2 also generally experiences far worse IO performance than dedicated hardware, so with any significant load, performance suffers on EC2.
>
> In addition, EC2 presents its own operational pains and availability issues.  Users on EC2 generally have more problems than those with their own setups.
>
> JG
>
>> -----Original Message-----
>> From: Peter Haidinyak [mailto:phaidinyak@local.com]
>> Sent: Monday, February 07, 2011 11:25 AM
>> To: user@hbase.apache.org
>> Subject: Amazon EC2
>>
>> Hi,
>>       We are looking at moving our cluster to Amazon's EC2 solution. Has
>> anybody out there already done this or tried and would you have any
>> recommendations/warning?
>>
>> Thanks
>>
>> -Pete
>

RE: Amazon EC2

Posted by Jonathan Gray <jg...@fb.com>.
There are others who have had far more experience than I have with HBase + EC2, so will let them chime in.  But I personally recommend against this direction if you expect to have a consistent cluster size and/or a significant amount of load.

EC2 is great at quickly scaling up/down, but is usually not cost effective if you're running a cluster of a fixed set of nodes 24/7.

EC2 also generally experiences far worse IO performance than dedicated hardware, so with any significant load, performance suffers on EC2.

In addition, EC2 presents its own operational pains and availability issues.  Users on EC2 generally have more problems than those with their own setups.

JG

> -----Original Message-----
> From: Peter Haidinyak [mailto:phaidinyak@local.com]
> Sent: Monday, February 07, 2011 11:25 AM
> To: user@hbase.apache.org
> Subject: Amazon EC2
> 
> Hi,
> 	We are looking at moving our cluster to Amazon's EC2 solution. Has
> anybody out there already done this or tried and would you have any
> recommendations/warning?
> 
> Thanks
> 
> -Pete