Posted to users@kafka.apache.org by Joe San <co...@gmail.com> on 2016/04/25 15:34:47 UTC

Using Multiple Kafka Producers for a single Kafka Topic

I have an application that is currently running and is using Rx Streams to
move data. Now in this application, I have a couple of streams whose
messages I would like to write to a single Kafka topic. Given this, I have
say Streams 1 to 5 as below:

Stream1 - Takes in DataType A
Stream2 - Takes in DataType B and so on

Where these Streams are Rx Observers. All these data types that I get out
of the stream are converted to a common JSON structure. I want this JSON
structure to be pushed to a single Kafka topic.

Now the questions are:

   1. Should I create one KafkaProducer for each of those Streams (or
      rather, Rx Observer instances)?

   2. What happens if multiple threads each use their own instance of a
      KafkaProducer to write to the same topic?

Re: Using Multiple Kafka Producers for a single Kafka Topic

Posted by Tom Crayford <tc...@heroku.com>.
Generally Kafka isn't super great with a giant number of topics. I'd
recommend designing your system around a smaller number than 10k. There's
an upper limit enforced on the total number of partitions by zookeeper
anyway, somewhere around 29k.

I'd recommend having just a single producer per JVM, to reuse TCP
connections and maximize batching. There's no real benefit to having more
producers except slightly reduced lock contention. However, the limiting
factor in most Kafka-based apps isn't usually anything like lock contention
on the producer - I'd expect the network to be the real limiter here.
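
A minimal sketch of the one-producer-per-JVM pattern, written in plain Java
so it runs without a broker. RecordSink, CountingSink and SharedProducerDemo
are hypothetical stand-ins, not Kafka classes. In a real application the
shared object would presumably be a single KafkaProducer (which its Javadoc
describes as thread safe), and send would wrap a ProducerRecord for the
given topic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the one producer method this sketch needs.
interface RecordSink {
    void send(String topic, String value);
}

// Thread-safe fake that only counts records per topic (no network involved).
final class CountingSink implements RecordSink {
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    @Override
    public void send(String topic, String value) {
        counts.computeIfAbsent(topic, t -> new AtomicInteger()).incrementAndGet();
    }

    int count(String topic) {
        AtomicInteger c = counts.get(topic);
        return c == null ? 0 : c.get();
    }

    int topicCount() {
        return counts.size();
    }
}

final class SharedProducerDemo {
    // Runs `threads` workers that all share ONE sink and write to `topics` topics.
    static CountingSink run(int threads, int topics, int recordsPerTopic) {
        CountingSink shared = new CountingSink(); // the single per-JVM instance
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int w = 0; w < threads; w++) {
            final int worker = w;
            pool.submit(() -> {
                for (int t = 0; t < topics; t++) {
                    for (int i = 0; i < recordsPerTopic; i++) {
                        shared.send("topic-" + t, "worker-" + worker + "-msg-" + i);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared;
    }

    public static void main(String[] args) {
        CountingSink sink = run(4, 100, 10);
        System.out.println(sink.topicCount() + " topics, "
                + sink.count("topic-0") + " records in topic-0");
    }
}
```

The point of the sketch is only the shape: the topic is chosen per record,
so the number of topics does not force the number of producer instances.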

Thanks

Tom Crayford
Heroku Kafka

On Wednesday, 25 May 2016, Joe San <co...@gmail.com> wrote:

> I do not mind the ordering as I have a Timestamp in all my messages and all
> my messages land in a Timeseries database. So I understand that it is
> better to have just one Producer instance per JVM and use that to write to
> n number of topics. I mean even if I have 10,000 topics, I can just get
> away with a single Producer instance per JVM?

Re: Using Multiple Kafka Producers for a single Kafka Topic

Posted by Joe San <co...@gmail.com>.
I do not mind the ordering as I have a Timestamp in all my messages and all
my messages land in a Timeseries database. So I understand that it is
better to have just one Producer instance per JVM and use that to write to
n number of topics. I mean even if I have 10,000 topics, I can just get
away with a single Producer instance per JVM?

On Wed, May 25, 2016 at 8:41 AM, Ewen Cheslack-Postava <ew...@confluent.io>
wrote:

> On Mon, Apr 25, 2016 at 6:34 AM, Joe San <co...@gmail.com> wrote:
>
> > I have an application that is currently running and is using Rx Streams
> > to move data. Now in this application, I have a couple of streams whose
> > messages I would like to write to a single Kafka topic. Given this, I
> > have say Streams 1 to 5 as below:
> >
> > Stream1 - Takes in DataType A
> > Stream2 - Takes in DataType B and so on
> >
> > Where these Streams are Rx Observers. All these data types that I get out
> > of the stream are converted to a common JSON structure. I want this JSON
> > structure to be pushed to a single Kafka topic.
> >
> > Now the questions are:
> >
> >    1. Should I create one KafkaProducer for each of those Streams (or
> >       rather, Rx Observer instances)?
> >
>
> A single producer instance is fine. In fact, it may be better since you
> share TCP connections and requests to produce data can be batched together.
>
>
> >    2. What happens if multiple threads each use their own instance of a
> >       KafkaProducer to write to the same topic?
> >
>
> They can all write to the same topic, but their data will be arbitrarily
> interleaved since there's no ordering guarantee across these producers.
>
>
>
> --
> Thanks,
> Ewen
>

Re: Using Multiple Kafka Producers for a single Kafka Topic

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
On Mon, Apr 25, 2016 at 6:34 AM, Joe San <co...@gmail.com> wrote:

> I have an application that is currently running and is using Rx Streams to
> move data. Now in this application, I have a couple of streams whose
> messages I would like to write to a single Kafka topic. Given this, I have
> say Streams 1 to 5 as below:
>
> Stream1 - Takes in DataType A
> Stream2 - Takes in DataType B and so on
>
> Where these Streams are Rx Observers. All these data types that I get out
> of the stream are converted to a common JSON structure. I want this JSON
> structure to be pushed to a single Kafka topic.
>
> Now the questions are:
>
>    1. Should I create one KafkaProducer for each of those Streams (or
>       rather, Rx Observer instances)?
>

A single producer instance is fine. In fact, it may be better since you
share TCP connections and requests to produce data can be batched together.
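
To make the batching point concrete, here is a toy model in plain Java.
BatchingSink and BatchingDemo are hypothetical names, and the fixed-size
grouping is an assumption made for illustration: it is not Kafka's actual
accumulator, which batches per partition and is governed by settings such
as batch.size and linger.ms. The sketch only compares how many "requests"
are needed when several streams share one sink versus owning one each.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a producer that groups records into fixed-size requests.
final class BatchingSink {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private int requestsSent = 0;

    BatchingSink(int batchSize) {
        this.batchSize = batchSize;
    }

    // Synchronized so several streams could call it from different threads.
    synchronized void send(String value) {
        pending.add(value);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is pending as one request (here it is only counted).
    synchronized void flush() {
        if (!pending.isEmpty()) {
            requestsSent++;
            pending.clear();
        }
    }

    synchronized int requestsSent() {
        return requestsSent;
    }
}

final class BatchingDemo {
    // All streams write through one sink; returns the number of requests made.
    static int sharedSink(int streams, int recordsPerStream, int batchSize) {
        BatchingSink sink = new BatchingSink(batchSize);
        for (int s = 0; s < streams; s++) {
            for (int i = 0; i < recordsPerStream; i++) {
                sink.send("stream-" + s + "-msg-" + i);
            }
        }
        sink.flush();
        return sink.requestsSent();
    }

    // Each stream owns its own sink; returns the total number of requests made.
    static int sinkPerStream(int streams, int recordsPerStream, int batchSize) {
        int total = 0;
        for (int s = 0; s < streams; s++) {
            BatchingSink sink = new BatchingSink(batchSize);
            for (int i = 0; i < recordsPerStream; i++) {
                sink.send("stream-" + s + "-msg-" + i);
            }
            sink.flush();
            total += sink.requestsSent();
        }
        return total;
    }

    public static void main(String[] args) {
        // 5 streams, 3 records each, up to 10 records per request.
        System.out.println("shared: " + sharedSink(5, 3, 10) + " requests");       // shared: 2 requests
        System.out.println("per stream: " + sinkPerStream(5, 3, 10) + " requests"); // per stream: 5 requests
    }
}
```

In this model the shared sink fills its batches from all five streams, while
the per-stream sinks each send a small, partly empty request.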


>    2. What happens if multiple threads each use their own instance of a
>       KafkaProducer to write to the same topic?
>

They can all write to the same topic, but their data will be arbitrarily
interleaved since there's no ordering guarantee across these producers.
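
The interleaving can be imitated without a broker. The sketch below is a
simulation under stated assumptions, not Kafka code: InterleavingDemo is a
hypothetical class, a synchronized in-memory list stands in for a single
topic partition, and each thread stands in for an independent producer.
The exact order of the resulting log varies from run to run, which is the
point; the only thing the helper checks is that each writer's own records
stay in the order that writer appended them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class InterleavingDemo {
    // Simulates `producers` independent writers appending to one shared log.
    // Each entry has the form "p<producer>:<sequence>".
    static List<String> run(int producers, int recordsEach) {
        List<String> log = Collections.synchronizedList(new ArrayList<String>());
        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            final int id = p;
            Thread t = new Thread(() -> {
                for (int seq = 0; seq < recordsEach; seq++) {
                    log.add("p" + id + ":" + seq);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return new ArrayList<>(log);
    }

    // True if every producer's records appear in increasing sequence order,
    // regardless of how the producers are interleaved with each other.
    static boolean perProducerOrderKept(List<String> log, int producers) {
        int[] next = new int[producers];
        for (String entry : log) {
            String[] parts = entry.substring(1).split(":");
            int id = Integer.parseInt(parts[0]);
            int seq = Integer.parseInt(parts[1]);
            if (seq != next[id]) {
                return false;
            }
            next[id]++;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = run(3, 5);
        System.out.println(log); // order across producers differs between runs
        System.out.println("per-producer order kept: " + perProducerOrderKept(log, 3));
    }
}
```

If the consumer needs a global order, it has to come from the data itself,
for example a timestamp carried in each message.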



-- 
Thanks,
Ewen