Posted to user@storm.apache.org by Srinath C <sr...@gmail.com> on 2014/06/01 03:25:30 UTC

Re: Storm performance

Hi Shaikh,
    You may want to tune Storm's internal message buffers. See this blog -
http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/
    Increase the parallelism of your spouts and bolts depending on the
system configuration. This is not simple and will require repeated tests to
figure out the right numbers. The Storm metrics will help you figure this
out. I would suggest starting with a fairly high value and then reducing it
if required.
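As a minimal sketch, the internal-buffer settings described in the blog post above can be collected into a config map before submitting the topology. The key names follow Storm 0.9.x as described in that post; the values here are illustrative starting points, not recommendations, and on a real topology they would be put on a backtype.storm.Config instance:

```java
import java.util.HashMap;
import java.util.Map;

public class BufferTuning {
    // Illustrative internal-buffer settings (keys per the Storm 0.9.x
    // docs and the linked blog post; tune the values for your cluster).
    static Map<String, Object> internalBufferConf() {
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.executor.receive.buffer.size", 16384); // per-executor incoming queue (power of 2)
        conf.put("topology.executor.send.buffer.size", 16384);    // per-executor outgoing queue (power of 2)
        conf.put("topology.receiver.buffer.size", 8);             // batches buffered by the worker receive thread
        conf.put("topology.transfer.buffer.size", 32);            // batches in the worker transfer queue
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(internalBufferConf().size());
    }
}
```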
    In our case, we had 2-CPU systems running the workers, so we set the
netty threads to 2 and ran only a single worker per supervisor.
    2500 messages per second should be easy given the number of servers
you have available to run the workers.
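For reference, a storm.yaml fragment matching the setup described above might look like the following. This is a sketch only - the netty thread settings assume Storm 0.9.x with the netty transport enabled, and a single slot per supervisor gives one worker per machine:

```yaml
# One worker slot per supervisor (single worker per machine).
supervisor.slots.ports:
    - 6700

# Match netty worker threads to the number of CPUs on the host (2 here).
storm.messaging.netty.server_worker_threads: 2
storm.messaging.netty.client_worker_threads: 2
```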

Regards,
Srinath.




On Sun, Jun 1, 2014 at 3:21 AM, Shaikh Riyaz <sh...@gmail.com> wrote:

> Hi All,
>
> We have set up our Storm cluster to process 2500 messages per second.
> Unfortunately, we are not able to achieve the expected throughput.
>
> Most of the time we get either "GC overhead limit exceeded" or "Too many
> failed tuples".
>
> Below are the server configuration and the Storm configuration.
> 1. One Nimbus server.
> 2. 5 supervisors with 2 slots (workers) each.
>
> *Storm Configuration:*
> Config conf = new Config();
> conf.setNumWorkers(10);
> conf.setMaxSpoutPending(80000);
> conf.setMaxTaskParallelism(6);
> conf.put(RichSpoutBatchExecutor.MAX_BATCH_SIZE_CONF, 64 * 1024);
>
> *Kafka spout configuration:*
> kafkaConfig.bufferSizeBytes = 1024*1024*4;
> kafkaConfig.fetchSizeBytes = 1024*1024*4;
> kafkaConfig.forceFromStart = false;
>
> The Kafka cluster is running with 2 partitions.
>
> *Topology Configuration:*
> Spout:  With parallelism_hint 5
> Bolt1:   With parallelism_hint 6
> Bolt2:   With parallelism_hint 5
> Bolt3, Bolt4, Bolt5 and Bolt6: With parallelism_hint 3
>
>
> Storm.yaml:
>
> supervisor.slots.ports:
>     - 6700
>     - 6701
> supervisor.childopts: "-Xmx1024m"
> worker.childopts: "-Xmx2048m"
> topology.message.timeout.secs: 30
>
> Please help me to solve this problem.
>
> Thanks in advance.
>  --
> Regards,
>
> Riyaz
>
>

Re: Storm performance

Posted by Nathan Leung <nc...@gmail.com>.
If you are GCing too much and failing a lot of tuples (which may be partly
due to the GCs), it is quite possible that you are out of RAM, and you
should increase the amount allocated to each worker.
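As a sketch of what that change might look like in storm.yaml: with the posted settings, 2 workers per supervisor at -Xmx2048m means about 4 GB of worker heap per machine, so raising the per-worker heap only helps if the hosts have the physical RAM to back it. The value below is illustrative, not a recommendation:

```yaml
# Illustrative only: 2 workers per host at -Xmx4096m needs roughly
# 8 GB of free RAM per supervisor machine.
worker.childopts: "-Xmx4096m"
```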


On Sat, May 31, 2014 at 9:25 PM, Srinath C <sr...@gmail.com> wrote:

> Hi Shaikh,
>     You may want to configure your internal buffers. See this blog -
> http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/
>     Increase the parallelism of your spouts and bolts depending on the
> system configuration. This is not simple and will require repeated tests to
> figure out the right numbers. The storm metrics will help you figure this
> out. I would say set it to a considerably high value and then reduce if
> required.
>     In our case, we had 2 CPU systems to run the workers so we set netty
> threads to 2 and ran only a single worker per supervisor.
>     2500 messages per second should be easy considering the number of
> servers you have to run the workers.
>
> Regards,
> Srinath.
>
>
>
>
> On Sun, Jun 1, 2014 at 3:21 AM, Shaikh Riyaz <sh...@gmail.com> wrote:
>
>> Hi All,
>>
>> We have setup our storm cluster to process 2500 messaged per second.
>> Unfortunately, we are not to able the throughput as expected.
>>
>> Most of time either we get "GC overhead limit exceeded"  or "Too many
>> failed tuples".
>>
>> Below the server configuration and Storm configuration.
>> 1. One Nimbus server.
>> 2. 5 supervisor with 2 slots (workers) each.
>>
>> *Storm Configuration:*
>> Config conf = new Config();
>> conf.setNumWorkers(10);
>> conf.setMaxSpoutPending(80000);
>> conf.setMaxTaskParallelism(6);
>> conf.put(RichSpoutBatchExecutor.MAX_BATCH_SIZE_CONF, 64 * 1024);
>>
>> *Kafka spout configuration:*
>> kafkaConfig.bufferSizeBytes = 1024*1024*4;
>> kafkaConfig.fetchSizeBytes = 1024*1024*4;
>> kafkaConfig.forceFromStart = false;
>>
>> Kafka cluster is running with partition of 2.
>>
>> *Topology Configuration:*
>> Spout:  With parallelism_hint 5
>> Bolt1:   With parallelism_hint 6
>> Bolt2:   With parallelism_hint 5
>> Bolt3, Bolt4, Bolt5 and Bolt5: With parallelism_hint 3
>>
>>
>> Storm.yaml:
>>
>>
>> supervisor.slots.ports:
>>     - 6700
>>
>>     - 6701
>> supervisor.childopts: "-Xmx1024m"
>>
>>
>>
>> worker.childopts: "-Xmx2048m"
>> topology.message.timeout.secs: 30
>>
>> Please help me to solve this problem.
>>
>> Thanks in advance.
>>  --
>> Regards,
>>
>> Riyaz
>>
>>
>