Posted to users@kafka.apache.org by Saloni Vithalani <sa...@thoughtworks.com> on 2018/01/19 17:23:08 UTC

How many kafka streams app is recommended to run on single machine in production ?

In our architecture, we plan to run roughly three JVM processes on one
machine, and each JVM process can host up to 15 kafka-stream apps.

And if I am not wrong each kafka-stream app spawns one java thread. So,
this seems like an awkward architecture to have with around 45 kafka-stream
apps running on a single machine.

So, I have a question in three parts:

1) Is my understanding correct that each kafka-stream app spawns one java
thread ? Also, each kafka-stream starts a new tcp connection with
kafka-broker ?

2) Is there a way to share one tcp connection for multiple kafka-streams ?

3) Is it difficult (or not recommended) to run 45 streams on a single machine?
The answer to this is definitely NO unless there is a real use case in
production.

Regards,
Saloni Vithalani
Developer
Email saloniv@thoughtworks.com
Telephone +91 8552889571 <8552889571>

Re: How many kafka streams app is recommended to run on single machine in production ?

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Multiple answers:

- a KafkaStreams instance starts one *processing* thread by default (you
can configure more processing threads, too)

- internally, KafkaStreams uses two KafkaConsumers and one KafkaProducer
(if you turn on exactly-once semantics (EOS), it uses even more
KafkaProducers): a KafkaConsumer starts a background heartbeat thread and
a KafkaProducer starts a background sender thread => you get 4 threads in
total (processing, 2x heartbeat, sender); if you configure two processing
threads, you end up with 8 threads in total, etc.

- there is more than one TCP connection as the consumer and the producer
(and the restore consumer, if you enable StandbyTasks) connect to the
cluster

- it's not possible to share any TCP connections atm (this would require
a major rewrite of consumers and producers)

- how many threads you can run efficiently depends on your hardware and
workload... monitor your CPU utilization and see how busy your machine is...
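
To make the thread arithmetic above concrete, here is a minimal sketch in plain Java (no Kafka jars needed). It assumes the count given above, 4 threads per processing thread with EOS off, and simply multiplies it out. The class name ThreadBudget, the helper names, the application id "example-app" and the broker address are made up for illustration; "num.stream.threads", "application.id" and "bootstrap.servers" are the real Streams config keys. The last lines print the live JVM thread count and system load, as one rough way to watch how busy the machine is.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;
import java.util.Properties;

public class ThreadBudget {

    // Assumption taken from the breakdown above: each processing thread comes
    // with two consumer heartbeat threads and one producer sender thread
    // (EOS disabled), i.e. 4 threads in total.
    public static final int THREADS_PER_PROCESSING_THREAD = 4;

    /** Rough thread estimate for a number of KafkaStreams instances. */
    public static int estimateThreads(int instances, int processingThreadsPerInstance) {
        return instances * processingThreadsPerInstance * THREADS_PER_PROCESSING_THREAD;
    }

    /** Streams settings as plain string keys, so the sketch has no Kafka dependency. */
    public static Properties streamsConfig(String applicationId, int processingThreads) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("num.stream.threads", Integer.toString(processingThreads));
        return props;
    }

    public static void main(String[] args) {
        Properties props = streamsConfig("example-app", 2); // hypothetical application id
        int configured = Integer.parseInt(props.getProperty("num.stream.threads"));

        System.out.println("1 instance, 1 processing thread:   " + estimateThreads(1, 1));
        System.out.println("1 instance, 2 processing threads:  " + estimateThreads(1, configured));
        System.out.println("45 instances, 1 processing thread: " + estimateThreads(45, 1));

        // Standard JMX beans: what this JVM is actually running right now.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        System.out.println("live JVM threads:     " + threads.getThreadCount());
        System.out.println("available processors: " + os.getAvailableProcessors());
        // Returns a negative value on platforms where the load average is unavailable.
        System.out.println("system load average:  " + os.getSystemLoadAverage());
    }
}
```

Under that assumption the estimates come out as 4, 8 and 180 threads. The 180 for 45 single-threaded instances is only a back-of-the-envelope figure; the measured thread count and CPU load on your own machine are what matter.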


Hope this helps

-Matthias


