Posted to user@storm.apache.org by Victor Kovrizhkin <vi...@gmail.com> on 2016/05/30 08:43:30 UTC

Storm 0.10.0 topology performance

Hi everybody,

I have a DRPC topology (not Trident) which executes several requests to Cassandra, performs some business logic and returns the result back to the client.
Topology looks like this: 


Bolts #2 and #3 execute some queries to Cassandra and aggregate the results.
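For context, the wiring looks roughly like this (a sketch only, against the Storm 0.10 API – the bolt class below is a placeholder, the real bolts and their Cassandra logic are not shown here):

```java
import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.drpc.LinearDRPCTopologyBuilder;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;

public class DrpcTopologySketch {

    // Placeholder bolt: stands in for the real bolts #2/#3 that query Cassandra.
    public static class CassandraQueryBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            // ... query Cassandra, aggregate, emit (id, result) ...
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("id", "result"));
        }
    }

    public static void main(String[] args) throws Exception {
        // "my-drpc-function" is a placeholder for the real DRPC function name.
        LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("my-drpc-function");
        builder.addBolt(new CassandraQueryBolt(), 8);                   // bolt #2
        builder.addBolt(new CassandraQueryBolt(), 8).shuffleGrouping(); // bolt #3

        Config conf = new Config();
        conf.setNumWorkers(3);
        StormSubmitter.submitTopology("drpc-topology", conf, builder.createRemoteTopology());
    }
}
```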
I have 3 supervisors with 4 CPU cores and 16 GB of memory each, one DRPC node with 4 CPU cores and 16 GB of memory, and one instance running Nimbus and ZooKeeper.
Each supervisor runs just one worker.
There are also 5 separate Cassandra nodes.

I test my topology with JMeter and it is able to do 36 req/sec with an average response time of 4-5 sec. This generates 20k requests/sec to Cassandra.
The process latency of each bolt is quite small – for example, bolts #2 and #3, which execute all the business logic, have quite low process latencies of 68 and 10 ms respectively. The complete latency is higher than the sum of all process latencies, but still much less than the average response time in JMeter.
CPU utilisation of the supervisor nodes is below 60%, of the DRPC node – 20%. If I increase the load (the number of threads in JMeter), the average response time becomes higher but throughput doesn't increase.
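Flat throughput with rising latency is what Little's law (L = λ × W) predicts at saturation; a quick check of the in-flight request count from the observed numbers (plain Java, numbers from the JMeter run above):

```java
public class LittlesLaw {
    public static void main(String[] args) {
        // Little's law: in-flight requests L = throughput (lambda) x latency (W)
        double throughputReqPerSec = 36.0; // observed throughput in JMeter
        double avgLatencySec = 4.5;        // observed average response time
        double inFlight = throughputReqPerSec * avgLatencySec;
        System.out.println(inFlight);      // prints 162.0
    }
}
```

So roughly 160 requests are in flight somewhere in the pipeline at saturation even though the CPUs are mostly idle, which points at queueing rather than compute.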

Cassandra doesn't seem to be the bottleneck because it is able to handle more than 60k read requests/second (tested via a JMeter plugin).
The network channels between Cassandra and Storm, and between Storm DRPC and JMeter, are not a bottleneck either (checked via iperf3).

It looks like there is some internal queue in Storm, but I don't know how to check it.
I have the following cluster settings:
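One way I could imagine inspecting the internal queues (a sketch, assuming Storm 0.10's built-in LoggingMetricsConsumer, which ships the built-in executor metrics – including __sendqueue/__receive queue population and capacity – to metrics.log on each worker):

```java
import backtype.storm.Config;
import backtype.storm.metric.LoggingMetricsConsumer;

Config conf = new Config();
// Register one consumer instance; built-in per-executor metrics such as
// __sendqueue and __receive (population vs. capacity) then appear in
// the workers' metrics.log, showing whether a disruptor queue is filling up.
conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
```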

supervisor.childopts: "-Xmx2g -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true"

worker.childopts: "-Xms8g -Xmx8g -XX:NewSize=2g -XX:MaxNewSize=2g -XX:MaxTenuringThreshold=1 -XX:SurvivorRatio=6 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true"
nimbus.thrift.threads: 64
nimbus.childopts: "-Xms2g -Xmx2g -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true"

drpc.port: 3772
drpc.http.port: 3774
drpc.request.timeout.secs: 600
drpc.childopts: "-Xms2g -Xmx2g -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true"
drpc.queue.size: 1024
drpc.worker.threads: 64
drpc.invocations.threads: 64
And topology settings: 

topology.workers: 3 
topology.acker.executors: 3
topology.max.spout.pending: 1000
topology.executor.receive.buffer.size: 1024
topology.executor.send.buffer.size: 1024
topology.transfer.buffer.size: 1024
topology.worker.receiver.thread.count: 1
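For reference, the same topology settings expressed through the submission-time Config (a sketch; constant names as in Storm 0.10's backtype.storm.Config):

```java
import backtype.storm.Config;

Config conf = new Config();
conf.setNumWorkers(3);        // topology.workers
conf.setNumAckers(3);         // topology.acker.executors
conf.setMaxSpoutPending(1000);// topology.max.spout.pending
conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 1024);
conf.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 1024);
conf.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 1024);
conf.put(Config.TOPOLOGY_WORKER_RECEIVER_THREAD_COUNT, 1);
```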
I've tried to play with these settings, setting them to higher and lower values, but it doesn't help. Maybe I'm missing something?
Please help
Victor