Posted to users@kafka.apache.org by David Luu <ma...@gmail.com> on 2015/09/20 07:14:32 UTC

Suggestions on load testing a system that uses kafka as underlying message bus?

I'd like to generate load against a system of ours that uses Kafka as the
message bus. We have a custom JSON message format, and to load test the
system properly, each set of messages for a particular scenario (i.e. user)
needs to have a unique identifier, as it normally does in production.

I'm thinking of using a record-and-playback technique: capture the messages
that correspond to a few users, then play those messages back to generate
load. To make the simulation realistic and to scale up the load, I would:

* reuse the captured user set to simulate additional users, scaling up the
number of users hitting the system

* for the original set of users, and for the additional users when scaling
beyond it, dynamically replace the identifier in the captured messages with
a unique one generated at runtime for each user, and also replace anything
else that needs to change, such as timestamps.
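
To make the rewrite step above concrete, here is a minimal Python sketch. It
is only an illustration under assumptions: the field names "user_id" and
"timestamp", the millisecond timestamps, and the helper rewrite_for_new_user
are hypothetical, since the real message format isn't shown here. It does not
talk to Kafka at all; it only prepares the messages to be sent.

```python
import copy
import json
import time
import uuid

# Hypothetical field names; the real JSON schema is not given in this post.
ID_FIELD = "user_id"
TS_FIELD = "timestamp"


def rewrite_for_new_user(captured, new_id=None, now_ms=None):
    """Return copies of one user's captured messages with a fresh
    identifier and timestamps shifted to start at now_ms."""
    if not captured:
        return []
    new_id = new_id or uuid.uuid4().hex
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    first_ts = captured[0][TS_FIELD]
    replayed = []
    for msg in captured:
        out = copy.deepcopy(msg)  # leave the recording untouched
        out[ID_FIELD] = new_id
        # Keep the original spacing between messages.
        out[TS_FIELD] = now_ms + (msg[TS_FIELD] - first_ts)
        replayed.append(out)
    return replayed


# A tiny made-up recording for one user, one JSON document per line.
captured = [json.loads(line) for line in (
    '{"user_id": "u-1", "timestamp": 1000, "action": "login"}',
    '{"user_id": "u-1", "timestamp": 1250, "action": "search"}',
)]

simulated = rewrite_for_new_user(captured)
```

Calling rewrite_for_new_user once per simulated user gives each one its own
identifier while reusing the same recorded scenario.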

As such, this would have to be a scripted solution. I don't think there is
an existing Kafka-centric tool to assist with such testing, is there?

If not, I'd likely have to build my own. In that case, my question is which
technology stack would be most suitable for generating the highest load
with the fewest load-generating machines, whether that means threads,
processes, or something else: Node.js, Python, Scala, Ruby, Java, .NET,
etc.
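
To illustrate the threads option, here is a rough Python sketch of fanning a
scripted scenario out over a pool of worker threads. Everything in it is an
assumption made for illustration: send() is a stand-in that only records
messages rather than calling a real Kafka producer, and the names
simulate_user and run_load are hypothetical. It says nothing about which
stack would actually generate the most load per machine.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

sent = []
sent_lock = threading.Lock()


def send(message):
    """Stand-in for a real producer call; it only records the message."""
    with sent_lock:
        sent.append(message)


def simulate_user(user_number, script):
    """Replay one scripted scenario for a single simulated user."""
    for step in script:
        send({"user_id": f"load-user-{user_number}", "action": step})
    return len(script)


def run_load(num_users, script, workers=8):
    """Replay the script for num_users users across worker threads and
    return the total number of messages sent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda n: simulate_user(n, script), range(num_users))
        return sum(counts)


total = run_load(num_users=100, script=["login", "search", "logout"])
```

Threads tend to fit work that mostly waits on network I/O; if rewriting the
JSON turned out to be CPU-heavy, multiple processes might be the better fit.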

Thoughts and suggestions appreciated,
David

Re: Suggestions on load testing a system that uses kafka as underlying message bus?

Posted by David Luu <ma...@gmail.com>.
Thanks, Otis, I'll review the info there.

Also, after posting the original message, I came across some Kafka
extensions to popular open-source load testing tools:

https://github.com/BrightTag/kafkameter

https://github.com/mnogu/gatling-kafka

They might be useful for tests that require customization (rather than
standard benchmark tests).

On Sun, Sep 20, 2015 at 7:02 AM, Otis Gospodnetić <
otis.gospodnetic@gmail.com> wrote:

> Hi,
>
> A few pointers are in this Kafka user ML thread:
> http://search-hadoop.com/m/uyzND177HP92xnm4e
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>

Re: Suggestions on load testing a system that uses kafka as underlying message bus?

Posted by Otis Gospodnetić <ot...@gmail.com>.
Hi,

A few pointers are in this Kafka user ML thread:
http://search-hadoop.com/m/uyzND177HP92xnm4e

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

