Posted to user@storm.apache.org by Sa Li <sa...@gmail.com> on 2015/02/06 19:46:44 UTC

How many workers can I config

Hi, all

My Storm dev cluster has 3 nodes, and each node is configured to run 4
workers, which is the default:

supervisor.slots.ports: For each worker machine, you configure how many
workers run on that machine with this config. Each worker uses a single
port for receiving messages, and this setting defines which ports are open
for use. If you define five ports here, then Storm will allocate up to five
workers to run on this machine. If you define three ports, Storm will only
run up to three. By default, this setting is configured to run 4 workers on
the ports 6700, 6701, 6702, and 6703.
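
As a rough illustration of the description above, a supervisor's storm.yaml
with five slots might look like the sketch below. Port 6704 is an assumed
extra port chosen only for this example; any free port would do.

```yaml
# storm.yaml on one supervisor node (illustrative sketch)
# Each listed port is one slot, i.e. room for one worker JVM.
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
    - 6704   # assumed fifth port, added only for this example
```

With 3 nodes configured this way, the cluster would offer up to 15 worker
slots in total (3 nodes x 5 ports).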

I think I can allocate more workers to each node. What is the maximum
number of workers per node that will not hurt performance?


thanks

AL

Re: How many workers can I config

Posted by Luke Rohde <ro...@gmail.com>.
Well, that's true about GC only if you're setting a large heap, which you
should only do if you need to keep a lot of persistent state (i.e. state
that's not being GCed or flushed to a db). I'd be willing to bet that
this covers most use cases. But yeah, point taken that there are exceptions.

On Fri Feb 06 2015 at 3:26:46 PM Nathan Leung <nc...@gmail.com> wrote:

> I would say it depends on what you are trying to do and how your hardware
> is configured.  If you have a lot of memory, you might want more workers so
> that GC does not take as long (this can be a problem if you have 32GB or
> more RAM in your VM, depending on how your application behaves and how your
> GC is tuned).  If you have your topology split across more workers, you
> have to do more serialization, but if a worker fails, you will lose less of
> the topology.
>
> If you have smaller host machines, then fewer workers make sense.
>
> One worker per node will not affect how many tasks or executors you
> have. They will just be distributed among fewer workers.

Re: How many workers can I config

Posted by Nathan Leung <nc...@gmail.com>.
I would say it depends on what you are trying to do and how your hardware
is configured.  If you have a lot of memory, you might want more workers so
that GC does not take as long (this can be a problem if you have 32GB or
more RAM in your VM, depending on how your application behaves and how your
GC is tuned).  If you have your topology split across more workers, you
have to do more serialization, but if a worker fails, you will lose less of
the topology.
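
One way to act on the heap-size point is to cap each worker JVM's heap
through worker.childopts in storm.yaml. This is only a sketch: the 2 GB
figure is an assumed value for illustration, not a recommendation, and the
right number depends on the application and on how GC is tuned.

```yaml
# storm.yaml (sketch). worker.childopts passes JVM options to every
# worker process launched by this supervisor.
# "-Xmx2g" is an assumed, illustrative heap cap, not a tuned value.
worker.childopts: "-Xmx2g"
```

Each worker is a separate JVM, so the heap cap multiplied by the number of
slots on a node should fit comfortably within that node's memory.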

If you have smaller host machines, then fewer workers make sense.

One worker per node will not affect how many tasks or executors you
have. They will just be distributed among fewer workers.
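
To make that last point concrete: the number of workers a topology asks for
is a separate setting from the slots a supervisor offers. A minimal sketch,
assuming the 3-node cluster from this thread with one slot per node (the
value 3 is an assumption for this example; the same setting can also be
supplied per topology at submit time):

```yaml
# Number of worker processes requested for a topology (sketch).
# 3 is an assumed value: one worker for each of the 3 nodes in this thread.
topology.workers: 3
```

Executors and tasks come from the parallelism hints in the topology code,
so lowering this number only packs the same executors into fewer JVMs.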

On Fri, Feb 6, 2015 at 3:20 PM, Sa Li <sa...@gmail.com> wrote:

> Thank you very much, Luke. Are you saying to set one worker per node, like this:
>
> nimbus.host: "nimbus"
> supervisor.slots.ports:
>  - 6700
>  # - 6701
>  # - 6702
>  # - 6703
>
> So comment out 6701-6703 and leave just one worker up? My question now:
> won't having only one worker per node affect parallelism?
>
> thanks
>
> AL

Re: How many workers can I config

Posted by Sa Li <sa...@gmail.com>.
Thank you very much, Luke. Are you saying to set one worker per node, like this:

nimbus.host: "nimbus"
supervisor.slots.ports:
 - 6700
 # - 6701
 # - 6702
 # - 6703

So comment out 6701-6703 and leave just one worker up? My question now:
won't having only one worker per node affect parallelism?

thanks

AL

On Fri, Feb 6, 2015 at 11:18 AM, Luke Rohde <ro...@gmail.com> wrote:

> You're probably better off using just one worker per node, unless you have
> a specific reason that you want to have more JVM instances. Keeping
> processing within a single JVM on a node allows tasks running on the same
> node to avoid serialization.

Re: How many workers can I config

Posted by Luke Rohde <ro...@gmail.com>.
You're probably better off using just one worker per node, unless you have a
specific reason that you want to have more JVM instances. Keeping
processing within a single JVM on a node allows tasks running on the same
node to avoid serialization.

On Fri Feb 06 2015 at 1:48:42 PM Sa Li <sa...@gmail.com> wrote:

> Hi, all
>
> My Storm dev cluster has 3 nodes, and each node is configured to run 4
> workers, which is the default:
>
> supervisor.slots.ports: For each worker machine, you configure how many
> workers run on that machine with this config. Each worker uses a single
> port for receiving messages, and this setting defines which ports are open
> for use. If you define five ports here, then Storm will allocate up to five
> workers to run on this machine. If you define three ports, Storm will only
> run up to three. By default, this setting is configured to run 4 workers on
> the ports 6700, 6701, 6702, and 6703.
>
> I think I can allocate more workers to each node. What is the maximum
> number of workers per node that will not hurt performance?
>
>
> thanks
>
> AL
>