Posted to dev@kafka.apache.org by "Jun Guo -X (jungu - CIIC at Cisco)" <ju...@cisco.com> on 2013/01/09 05:32:28 UTC

kafka 0.8 producer throughput

According to the official Kafka documentation, producer throughput is about 50MB/s. But in my test, producer throughput is only about 2MB/s. The test environment matches what the documentation describes: one producer, one broker, and one ZooKeeper, each on its own machine. Message size is 100 bytes, batch size is 200, and flush interval is 600 messages. The environment and the configuration are the same, so why is there such a big gap between my test result and what the documentation says?

Re: kafka 0.8 producer throughput

Posted by Jay Kreps <ja...@gmail.com>.
Some folks came up with a cool hack in 0.8 that makes acks=0 send no
response. This changes the performance for small message sends to be
equivalent to 0.7. This is proposed for inclusion in 0.8. It would
obviously be less useful for the java/scala client in 0.9 if we are able to
properly pipeline requests, but it would still be a valid option for
non-java clients who don't want to deal with the complexity of request
pipelining. JIRA is here for discussion:
  https://issues.apache.org/jira/browse/KAFKA-736

-Jay


Re: kafka 0.8 producer throughput

Posted by Jay Kreps <ja...@gmail.com>.
We haven't done a ton of performance work on 0.8 yet.

Regardless, requiring the ack will certainly reduce per-producer
throughput, but it is too early to say by how much. Obviously this won't
impact broker throughput (so if you have many producers you may not notice).

The plan to fix this is just to make the produce request non-blocking. This
will allow the same kind of throughput we had before but still allow us to
give you back an error response if you want it. The hope would be to make
this change in 0.9.

-Jay
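
As a rough illustration of why a non-blocking produce request could restore throughput while still returning acks, the sketch below models a single producer on a virtual clock. Everything in it is an assumption made for illustration: the function name simulate_send, the 1 ms send cost, the 9 ms ack delay, and the in-flight window are hypothetical and come neither from Kafka's code nor from any measurement.

```python
from collections import deque


def simulate_send(num_requests, send_ms, ack_delay_ms, max_in_flight):
    """Return the virtual time (ms) at which the last ack arrives.

    send_ms       -- time the producer spends writing one request
    ack_delay_ms  -- time from the end of a write until its ack arrives
    max_in_flight -- how many unacknowledged requests are allowed at once;
                     1 models a blocking producer, larger values model
                     a pipelined (non-blocking) producer
    """
    clock = 0.0
    ack_times = deque()  # ack arrival times, oldest first
    for _ in range(num_requests):
        if len(ack_times) == max_in_flight:
            # Window is full: wait until the oldest request is acked.
            clock = max(clock, ack_times.popleft())
        clock += send_ms
        ack_times.append(clock + ack_delay_ms)
    # Ack times only grow, so the last entry is the final ack.
    return ack_times[-1] if ack_times else clock


blocking_ms = simulate_send(100, send_ms=1, ack_delay_ms=9, max_in_flight=1)
pipelined_ms = simulate_send(100, send_ms=1, ack_delay_ms=9, max_in_flight=10)
print("blocking:", blocking_ms, "ms")
print("pipelined:", pipelined_ms, "ms")
```

In this toy model the blocking producer pays the full ack delay on every request, while the pipelined producer pays it roughly once, and every request is still acknowledged in both cases.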


Re: kafka 0.8 producer throughput

Posted by Jun Rao <ju...@gmail.com>.
In 0.8, ack is always required. The ack returns an error code that indicates
the reason if a produce request fails (e.g., the request is sent to a
broker that's not a leader). It also returns the offset of the produced
messages. However, the producer can choose when to receive the acks (e.g.,
when data reaches 1 replica or all replicas). If the ack indicates an
error, the client can choose to retry. The retry logic is built into our
high level producer.

Thanks,

Jun
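
The ack-and-retry behaviour described above can be sketched roughly as follows. This is not the high level producer's actual code: FakeBroker, Ack, send_with_retry, and the error code values are hypothetical names invented for the example, and trying the next broker in a list is a simplification chosen here, not a description of how the real client locates the leader.

```python
from dataclasses import dataclass

NO_ERROR = 0
NOT_LEADER = 1  # hypothetical value, not Kafka's real error numbering


@dataclass
class Ack:
    error_code: int  # NO_ERROR on success
    offset: int      # offset of the first produced message, -1 on failure


class FakeBroker:
    """In-memory stand-in for a broker; only the leader accepts writes."""

    def __init__(self, is_leader):
        self.is_leader = is_leader
        self.log = []

    def produce(self, messages):
        if not self.is_leader:
            return Ack(NOT_LEADER, -1)
        first_offset = len(self.log)
        self.log.extend(messages)
        return Ack(NO_ERROR, first_offset)


def send_with_retry(brokers, messages, max_retries=3):
    """Send messages, moving to the next broker whenever the ack is an error."""
    for attempt in range(max_retries + 1):
        broker = brokers[attempt % len(brokers)]
        ack = broker.produce(messages)
        if ack.error_code == NO_ERROR:
            return ack
    raise RuntimeError("produce request failed after %d retries" % max_retries)


brokers = [FakeBroker(is_leader=False), FakeBroker(is_leader=True)]
first = send_with_retry(brokers, ["a", "b"])
second = send_with_retry(brokers, ["c"])
print(first, second)
```

The first attempt in each call reaches the non-leader and gets an error ack; the retry reaches the leader, which returns the offset assigned to the messages.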

Re: kafka 0.8 producer throughput

Posted by S Ahmed <sa...@gmail.com>.
What's the ack for? If it fails, will it try another broker? Can this be
disabled, or is it a major design change?


Re: kafka 0.8 producer throughput

Posted by Jun Rao <ju...@gmail.com>.
The 50MB/s number is for 0.7. We haven't carefully measured the performance
in 0.8 yet. We do expect the throughput that a single producer can drive in
0.8 to be less. This is because the 0.8 producer needs to wait for an RPC
response from the broker while in 0.7, there is no ack for the producer.
Nevertheless, 2MB/s seems low. Could you try increasing flush interval to
something bigger, like 20000?

Thanks,

Jun
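
A back-of-the-envelope calculation shows what the reported numbers would imply if the producer sends one batch per request and waits for each response before sending the next. Both of those are assumptions made for this sketch, as is taking 1 MB to mean 10^6 bytes; the function names are hypothetical, and the thread does not confirm how requests were actually formed.

```python
MESSAGE_BYTES = 100    # message size from the original question
BATCH_MESSAGES = 200   # batch size from the original question


def requests_per_second(throughput_bytes_per_sec):
    """Requests per second, assuming one full batch per request."""
    return throughput_bytes_per_sec / (MESSAGE_BYTES * BATCH_MESSAGES)


def implied_ms_per_request(throughput_bytes_per_sec):
    """Average time per request if requests are sent strictly one at a time."""
    return 1000.0 / requests_per_second(throughput_bytes_per_sec)


observed = 2 * 1000 * 1000     # 2 MB/s measured on 0.8
documented = 50 * 1000 * 1000  # 50 MB/s documented for 0.7

print(requests_per_second(observed), "requests/s at 2 MB/s")
print(implied_ms_per_request(observed), "ms per request at 2 MB/s")
print(requests_per_second(documented), "requests/s at 50 MB/s")
```

Under these assumptions, 2 MB/s corresponds to 100 requests per second, or about 10 ms per request, which would be consistent with a producer that spends most of its time waiting for responses.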
