Posted to user@storm.apache.org by Software Dev <st...@gmail.com> on 2014/05/01 00:57:42 UTC

Machine specs

What kind of specs are we looking at for

1) Nimbus
2) Workers

Any recommendations?

Re: Machine specs

Posted by "Cody A. Ray" <co...@gmail.com>.

I hate to give this answer, but I think it really depends on your
application. If you're doing distributed machine learning or video
compression or something else that's CPU-heavy, then you'll be CPU-bound. If
you're doing pre-aggregation or rolling windows or other CPU-light
analysis, you're more likely to be memory- or network-bound.

Or people might just be scaling horizontally across a lot of cheap worker
nodes rather than fewer nodes with a lot of CPUs. :)

-Cody


On Thu, May 1, 2014 at 11:57 AM, Software Dev <st...@gmail.com> wrote:

> Seems like all of these setups involve a small number of CPUs. Does
> Storm typically require more RAM than CPU, i.e., which is usually the
> bottleneck?



-- 
Cody A. Ray, LEED AP
cody.a.ray@gmail.com
215.501.7891

Re: Machine specs

Posted by Software Dev <st...@gmail.com>.

Seems like all of these setups involve a small number of CPUs. Does
Storm typically require more RAM than CPU, i.e., which is usually the
bottleneck?

On Wed, Apr 30, 2014 at 8:54 PM, Michael Rose <mi...@fullcontact.com> wrote:
> In AWS, we're fans of c1.xlarges, m3.xlarges, and c3.2xlarges, but have seen
> Storm successfully run on cheaper hardware.
>
> Our Nimbus server is usually bored on an m1.large.
>
> Michael Rose (@Xorlev)
> Senior Platform Engineer, FullContact
> michael@fullcontact.com

Re: Machine specs

Posted by Michael Rose <mi...@fullcontact.com>.

In AWS, we're fans of c1.xlarges, m3.xlarges, and c3.2xlarges, but have
seen Storm successfully run on cheaper hardware.

Our Nimbus server is usually bored on an m1.large.

Michael Rose (@Xorlev <https://twitter.com/xorlev>)
Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
michael@fullcontact.com


On Wed, Apr 30, 2014 at 9:48 PM, Cody A. Ray <co...@gmail.com> wrote:

> We use m1.larges in EC2 <http://aws.amazon.com/ec2/instance-types/> for
> both Nimbus and supervisor machines (though the m1 family has been
> deprecated in favor of m3). Our use case is to do some pre-aggregation
> before persisting the data in a store. (The main bottleneck in this setup
> is the downstream datastore, but memory is the primary constraint on the
> worker machines due to the in-memory cache which wraps the Trident state.)
>
> For what it's worth, Infochimps suggests
> <https://github.com/infochimps-labs/big_data_for_chimps/blob/master/25-storm%2Btrident-tuning.asciidoc>
> c1.xlarge or m3.xlarge machines.
>
> Using the Amazon cloud machines as a reference, we like to use either the
> c1.xlarge machines (7 GB RAM, 8 cores, $424/month, giving the highest
> CPU-performance-per-dollar) or the m3.xlarge machines (15 GB RAM, 4 cores,
> $365/month, the best balance of CPU-per-dollar and RAM-per-dollar). You
> shouldn’t use fewer than four worker machines in production, so if your
> needs are modest feel free to downsize the hardware accordingly.
>
> Not sure what others would recommend.
>
> -Cody
>

Re: Machine specs

Posted by "Cody A. Ray" <co...@gmail.com>.

We use m1.larges in EC2 <http://aws.amazon.com/ec2/instance-types/> for
both Nimbus and supervisor machines (though the m1 family has been
deprecated in favor of m3). Our use case is to do some pre-aggregation
before persisting the data in a store. (The main bottleneck in this setup
is the downstream datastore, but memory is the primary constraint on the
worker machines due to the in-memory cache which wraps the Trident state.)

For what it's worth, Infochimps suggests
<https://github.com/infochimps-labs/big_data_for_chimps/blob/master/25-storm%2Btrident-tuning.asciidoc>
c1.xlarge or m3.xlarge machines.

Using the Amazon cloud machines as a reference, we like to use either the
c1.xlarge machines (7 GB RAM, 8 cores, $424/month, giving the highest
CPU-performance-per-dollar) or the m3.xlarge machines (15 GB RAM, 4 cores,
$365/month, the best balance of CPU-per-dollar and RAM-per-dollar). You
shouldn’t use fewer than four worker machines in production, so if your
needs are modest feel free to downsize the hardware accordingly.
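
As a rough check on the per-dollar comparison above, here is a minimal
Python sketch that normalizes the quoted figures to what $100/month buys.
The numbers are the ones cited in this message; the dictionary layout and
the per_100_dollars helper are illustrative names, and the sketch assumes a
core on one instance family is comparable to a core on the other, which
may not hold in practice.

```python
# Figures as quoted in the thread (2014 pricing); not current AWS rates.
instances = {
    "c1.xlarge": {"ram_gb": 7, "cores": 8, "usd_per_month": 424},
    "m3.xlarge": {"ram_gb": 15, "cores": 4, "usd_per_month": 365},
}


def per_100_dollars(spec):
    """Return (cores, GB of RAM) that 100 USD per month buys for one spec."""
    scale = 100 / spec["usd_per_month"]
    return spec["cores"] * scale, spec["ram_gb"] * scale


for name, spec in instances.items():
    cores, ram = per_100_dollars(spec)
    print(f"{name}: {cores:.2f} cores and {ram:.2f} GB RAM per $100/month")
```

On these figures the c1.xlarge buys roughly 1.7 times the cores per dollar,
while the m3.xlarge buys roughly 2.5 times the RAM per dollar, which fits
the "highest CPU-per-dollar" versus "best balance" framing.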

Not sure what others would recommend.

-Cody

On Wed, Apr 30, 2014 at 5:57 PM, Software Dev <st...@gmail.com> wrote:

> What kind of specs are we looking at for
>
> 1) Nimbus
> 2) Workers
>
> Any recommendations?
>

-- 
Cody A. Ray, LEED AP
cody.a.ray@gmail.com
215.501.7891