Posted to users@kafka.apache.org by Patricio Echagüe <pa...@gmail.com> on 2012/01/04 20:22:09 UTC

Producer thread-safety and performance

Hi all, I'm trying to get familiar with the Kafka code base and Scala. I
haven't been able to answer this question myself yet.

I have a bunch of threads (managed by a thread pool executor in Java) and
would like a recommendation on thread safety on the producer side.

Should I have one producer object per thread, or one shared by all threads?

I read in an old email that the producer has proper synchronization, but
it's not clear whether it follows the concurrent style of the
java.util.concurrent package or uses simple Java synchronization. My
goal is to not slow down message generation.

Thanks in advance.
Patricio

Re: Producer thread-safety and performance

Posted by Neha Narkhede <ne...@gmail.com>.
>> Whether or not you will see throughput advantages from having multiple
>> producer instances depends on your usage; you will have to try it out.

It depends on the network and disk bandwidth too. For example, when we
ran some performance tests at LinkedIn a few months ago, we could
saturate the network (1Gb link) with 8 producer threads. Note that disk
was not the bottleneck in this case, since the server was configured to
flush after ~100K messages or so, and we used 6 SATA drives in RAID 10
on the Kafka cluster.

On the other hand, if you configure your servers to flush too often,
the disk becomes the bottleneck. In that case, it would not matter
whether you use one producer thread or several.

Thanks,
Neha
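
A rough sketch of how one might "try it out": time a fixed number of
sends from 1, 2, 4 and 8 threads and compare messages per second. The
class name, the 200-byte payload and the `Consumer<byte[]>` send function
are all hypothetical placeholders for this illustration, not part of
Kafka. The no-op send only exercises the harness; meaningful numbers
would need the real producer call and a real broker.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;

class ProducerThreadBenchmark {
    /** Outcome of one run: messages sent and wall-clock time taken. */
    static final class Result {
        final long messages;
        final long elapsedNanos;

        Result(long messages, long elapsedNanos) {
            this.messages = messages;
            this.elapsedNanos = elapsedNanos;
        }

        double messagesPerSecond() {
            return messages / (Math.max(elapsedNanos, 1L) / 1e9);
        }
    }

    /** Sends messagesPerThread payloads from each of `threads` workers through `send`. */
    static Result run(int threads, int messagesPerThread, int messageBytes, Consumer<byte[]> send) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);      // releases all workers at once
        CountDownLatch done = new CountDownLatch(threads); // signals that every worker finished
        LongAdder sent = new LongAdder();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                byte[] payload = new byte[messageBytes];
                try {
                    start.await();
                    for (int i = 0; i < messagesPerThread; i++) {
                        send.accept(payload);
                        sent.increment();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        long begin = System.nanoTime();
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
        return new Result(sent.sum(), System.nanoTime() - begin);
    }

    public static void main(String[] args) {
        // Placeholder: swap in the real producer's send call here.
        Consumer<byte[]> noOpSend = payload -> { };
        for (int threads : new int[] {1, 2, 4, 8}) {
            Result r = run(threads, 100_000, 200, noOpSend);
            System.out.printf("%d thread(s): %d messages, %.0f msg/s%n",
                    threads, r.messages, r.messagesPerSecond());
        }
    }
}
```

If throughput stops rising as threads are added, the limit is probably
somewhere other than the producer threads (network or broker disk, as
described above).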


Re: Producer thread-safety and performance

Posted by Patricio Echagüe <pa...@gmail.com>.
Thanks for the insight, Jay.


Re: Producer thread-safety and performance

Posted by Jay Kreps <ja...@gmail.com>.
The producer is thread safe.

Whether or not you will see throughput advantages from having multiple
producer instances depends on your usage; you will have to try it out. For
larger messages, or when using the producer in async mode, one producer can
pretty easily saturate the network, and adding more producers won't add much
in the way of throughput. For small messages sent in sync mode, having more
than one producer may improve throughput.

-Jay
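
A minimal sketch of the "one for all" option from the question: a single
sender instance shared by every task in a thread pool. `MessageSender`,
`CountingSender` and the topic name are hypothetical stand-ins written
for this illustration, not the Kafka producer API. The sketch assumes,
as stated above, that the real producer is thread safe.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class SharedSenderSketch {
    /** Hypothetical stand-in for a thread-safe producer; not a Kafka interface. */
    interface MessageSender {
        void send(String topic, String message);
    }

    /** Thread-safe stub that only counts messages, so the sketch runs without a broker. */
    static final class CountingSender implements MessageSender {
        private final AtomicLong sent = new AtomicLong();

        @Override
        public void send(String topic, String message) {
            sent.incrementAndGet();
        }

        long sentCount() {
            return sent.get();
        }
    }

    /** Submits `tasks` tasks to a pool of `threads` threads; all of them use one sender. */
    static long runShared(int threads, int tasks, int messagesPerTask) {
        CountingSender shared = new CountingSender(); // created once, never per thread
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < tasks; t++) {
            final int taskId = t;
            pool.submit(() -> {
                for (int i = 0; i < messagesPerTask; i++) {
                    shared.send("test-topic", "task " + taskId + " message " + i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return shared.sentCount();
    }

    public static void main(String[] args) {
        System.out.println("messages sent: " + runShared(8, 8, 1000));
    }
}
```

Switching to one sender per thread would only mean constructing the
sender inside each task instead of once up front, which makes the two
options easy to compare under the same workload.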
