Posted to users@kafka.apache.org by Emmanuel <el...@msn.com> on 2015/03/20 20:32:09 UTC

Kafka-Storm: troubleshooting low R/W throughput

Kafka on test cluster:
2 Kafka nodes, 2GB, 2CPUs
3 Zookeeper nodes, 2GB, 2CPUs

Storm:
3 nodes, 3CPUs each, on the same Zookeeper cluster as Kafka.

1 topic, 5 partitions, replication x2
Whether I use 1 slot for the Kafka Spout or 5 slots (=#partitions), the throughput seems about the same.
I can't seem to read much more than 7000 events/sec.
The same goes for writing: I set up a generator spout and write to Kafka on 1 topic / 5 partitions with a KafkaBolt with a parallelism of 5, and I can't seem to write much more than 7000 events/sec either.
Meanwhile, none of CPU, IO or memory seems to be a bottleneck:
In Storm UI the bolts all show capacities <50%, sometimes much less (in the single-digit %)
Top shows CPUs being used at ~30% max
We have another process moving data from Kafka to Cassandra and it gives similar throughput, so it seems related to Kafka more than Storm.

What could be wrong? Sorry for the generic question but I would appreciate any hint on where to start to troubleshoot.
Thanks
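
A note on the "1 slot vs 5 slots" observation: the sketch below shows how partitions might be dealt out to spout tasks, assuming "slots" here means spout tasks and assuming a simple round-robin assignment. The function names are hypothetical and this is an illustration, not the storm-kafka source. Under these assumptions a single task reads all 5 partitions, 5 tasks read one each, and any task beyond the fifth has nothing to read.

```python
def partitions_for_task(task_index, total_tasks, num_partitions):
    """Round-robin (assumed): task i takes partitions i, i + total_tasks, ..."""
    return list(range(task_index, num_partitions, total_tasks))


def assignment(total_tasks, num_partitions):
    """Map every spout task index to the partitions it would read."""
    return {
        task: partitions_for_task(task, total_tasks, num_partitions)
        for task in range(total_tasks)
    }


# 5 partitions as in the setup above; 8 tasks is a hypothetical over-provisioned case.
for tasks in (1, 5, 8):
    print(tasks, "task(s):", assignment(tasks, 5))
```

If throughput stays flat between 1 and 5 tasks, as reported, the limit is probably not the number of readers, which is consistent with looking at the brokers or the producer side next.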

Re: Kafka-Storm: troubleshooting low R/W throughput

Posted by Manu Zhang <ow...@gmail.com>.
Hi Emmanuel,

You can first run a Kafka producer perf test
(bin/kafka-producer-perf-test.sh) with your Storm consumers, and a Kafka
consumer perf test (bin/kafka-consumer-perf-test.sh) with your own
producers, to see whether the bottleneck is really in Kafka.

Thanks,
Manu Zhang
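
The isolation idea above can be sketched as a small timing harness. This is generic Python, not the Kafka perf scripts: measure_throughput and slowest_stage are hypothetical helpers, and the rates in the last lines are made-up placeholders, not measurements from this cluster. The point is only to measure each stage on its own and compare the numbers.

```python
import time


def measure_throughput(process_event, events):
    """Call process_event on every event and return events per second."""
    start = time.perf_counter()
    count = 0
    for event in events:
        process_event(event)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def slowest_stage(rates):
    """Return the name of the stage with the lowest measured rate."""
    return min(rates, key=rates.get)


# Stand-in sink: replace with a real producer send or consumer poll when testing.
sink = []
rate = measure_throughput(sink.append, range(100_000))
print(f"in-memory baseline: {rate:.0f} events/sec")

# Hypothetical numbers, for illustration only.
print(slowest_stage({"kafka_only": 50000.0, "full_topology": 7000.0}))
```

If the Kafka-only measurement is far above 7000 events/sec, the limit is more likely in the Storm topology or the client configuration than in the brokers.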

On Mon, Mar 23, 2015 at 6:31 AM Harsha <ha...@fastmail.fm> wrote:

> Hi Emmanuel,
>        Can you post your Kafka server.properties? Also, in your producer,
> are you distributing your messages across all of the topic's partitions?
>
> --
> Harsha

Re: Kafka-Storm: troubleshooting low R/W throughput

Posted by Harsha <ha...@fastmail.fm>.
Hi Emmanuel,
       Can you post your Kafka server.properties? Also, in your producer, are you distributing your messages across all of the topic's partitions?

-- 
Harsha
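
To illustrate why the partition question matters, here is a minimal sketch of key-based partitioning, assuming the producer picks a partition as hash(key) modulo the partition count. CRC32 is a stand-in hash chosen only because it is deterministic; it is not Kafka's partitioner, and partition_for and distribution are hypothetical names. With one constant key every message lands in the same partition, so only one of the 5 partitions (and one broker leader) does any work.

```python
import zlib
from collections import Counter


def partition_for(key: bytes, num_partitions: int) -> int:
    """Pick a partition from the key (CRC32 is a stand-in, not Kafka's hash)."""
    return zlib.crc32(key) % num_partitions


def distribution(keys, num_partitions):
    """Count how many messages each partition would receive."""
    counts = Counter(partition_for(key, num_partitions) for key in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Constant key: everything goes to a single partition.
print(distribution([b"same-key"] * 1000, 5))

# Distinct keys: messages spread over several partitions.
print(distribution([f"event-{i}".encode() for i in range(1000)], 5))
```

Comparing per-partition message counts or offsets on the real topic would show whether the generator spout is in the first situation.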

