Posted to dev@kafka.apache.org by Dong Lin <li...@gmail.com> on 2018/04/12 03:50:08 UTC

[DISCUSS] KIP-286: producer.send() should not block on metadata update

Hi all,

I have created KIP-286: producer.send() should not block on metadata
update. See
https://cwiki.apache.org/confluence/display/KAFKA/KIP-286%3A+producer.send%28%29+should+not+block+on+metadata+update
.

The KIP intends to improve the user experience of producer.send() when
metadata is not available. It is related to, but different from, the previous
discussion in KAFKA-3539 in that the user still has the option of letting
producer.send() block when the producer queue is full.
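
To make the distinction concrete, here is a toy Python model of the intended
behaviour. It is only a sketch under stated assumptions, not Kafka client
code: the class ToyProducer, the block_on_full flag, and the pending/ready
buffers are hypothetical names chosen for illustration. It assumes send()
parks a record when the topic's metadata is unknown instead of waiting, while
a full buffer can still block (or fail fast) depending on a user option.

```python
import threading
from collections import deque


class BufferFullError(Exception):
    """Raised when the buffer is full and the caller did not (or could not) wait."""


class ToyProducer:
    """Toy model: send() never waits for metadata, only for buffer space."""

    def __init__(self, capacity, block_on_full=True):
        self.capacity = capacity
        self.block_on_full = block_on_full   # hypothetical user option
        self.known_topics = set()            # topics whose metadata has arrived
        self.pending = deque()               # records waiting for metadata
        self.ready = deque()                 # records that could be sent now
        self._cond = threading.Condition()

    def _size(self):
        return len(self.pending) + len(self.ready)

    def send(self, topic, value, timeout=None):
        with self._cond:
            if self._size() >= self.capacity:
                if not self.block_on_full:
                    raise BufferFullError(topic)
                # Block until drain() frees space or the timeout expires.
                has_room = self._cond.wait_for(
                    lambda: self._size() < self.capacity, timeout)
                if not has_room:
                    raise BufferFullError(topic)
            # Missing metadata never blocks: the record is simply parked.
            if topic in self.known_topics:
                self.ready.append((topic, value))
            else:
                self.pending.append((topic, value))

    def on_metadata(self, topic):
        """Metadata for `topic` arrived: release its parked records in order."""
        with self._cond:
            self.known_topics.add(topic)
            still_pending = deque()
            for record in self.pending:
                target = self.ready if record[0] == topic else still_pending
                target.append(record)
            self.pending = still_pending

    def drain(self):
        """Stand-in for the IO thread: take all sendable records, free space."""
        with self._cond:
            sent = list(self.ready)
            self.ready.clear()
            self._cond.notify_all()
            return sent
```

With block_on_full=False a full buffer raises immediately; with the default,
send() waits for space up to the given timeout. In neither case does unknown
metadata make send() wait.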

Comments are welcome!

Thanks,
Dong

Re: [DISCUSS] KIP-286: producer.send() should not block on metadata update

Posted by Ted Yu <yu...@gmail.com>.
Dong:
Yes, that answers my question.

Thanks

On Thu, Apr 12, 2018 at 1:41 AM, Dong Lin <li...@gmail.com> wrote:

> Hey Ted,
>
> Thanks for your comments. With the proposed solution in the KIP, the memory
> is only allocated once for a given message, which is the same as in the
> existing implementation. The serialized message will be moved from the
> per-topic queue to the per-partition queue without incurring additional
> memory overhead. Does this address your question?
>
> Thanks,
> Dong

Re: [DISCUSS] KIP-286: producer.send() should not block on metadata update

Posted by Dong Lin <li...@gmail.com>.
I am going to drop this KIP. Thinking about this more, the benefit of not
having to wait for metadata does not seem to be worth the complexity this KIP
adds to the producer. Assuming that the Kafka cluster is available, which
should be the case, waiting for the first metadata should be fast. After the
first metadata arrives, the producer will most likely not have to wait for
metadata to send messages.

On Fri, Apr 13, 2018 at 11:34 PM, Dong Lin <li...@gmail.com> wrote:

> Hey Becket,
>
> Good point! Thanks for the comment.
>
> I have updated the KIP to move the compression to the user thread in the
> common case. Basically, the user thread can be responsible for compressing
> and moving messages from the per-topic queue to the per-partition queue once
> the metadata is available. Only if the IO thread has nothing to do (e.g. all
> messages in the per-partition queue have been sent) can the IO thread try to
> compress and move some messages to the per-partition queue. Can you see if
> the latest solution in the KIP addresses the problem?
>
> Also, I have added a new section to analyze how the changes proposed in
> this KIP would change the performance of the producer.
>
> Thanks,
> Dong

Re: [DISCUSS] KIP-286: producer.send() should not block on metadata update

Posted by Dong Lin <li...@gmail.com>.
Hey Becket,

Good point! Thanks for the comment.

I have updated the KIP to move the compression to the user thread in the
common case. Basically, the user thread can be responsible for compressing and
moving messages from the per-topic queue to the per-partition queue once the
metadata is available. Only if the IO thread has nothing to do (e.g. all
messages in the per-partition queue have been sent) can the IO thread try to
compress and move some messages to the per-partition queue. Can you see if the
latest solution in the KIP addresses the problem?
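
As a rough illustration of that division of labour, here is a single-threaded
Python toy. It is a sketch only, not the KIP's actual design or Kafka code:
ToyBatcher, io_poll, and the limit of one record per idle poll are assumptions
made up for this example, and real code would need locking between the two
threads. Each compressed record is tagged with who compressed it so the split
is visible.

```python
import zlib
from collections import deque


class ToyBatcher:
    """Toy model of who compresses: user thread normally, IO thread when idle."""

    def __init__(self):
        self.per_topic = deque()       # uncompressed records awaiting metadata
        self.per_partition = deque()   # (compressed bytes, compressed_by)
        self.metadata_ready = False

    def _compress_and_move(self, who, limit=None):
        """Compress up to `limit` records and move them to the partition queue."""
        moved = 0
        while self.per_topic and (limit is None or moved < limit):
            record = self.per_topic.popleft()
            self.per_partition.append((zlib.compress(record), who))
            moved += 1
        return moved

    def send(self, record):
        """Runs on the user thread."""
        self.per_topic.append(record)
        if self.metadata_ready:
            # Common case: the user thread does the compression work.
            self._compress_and_move("user")

    def io_poll(self):
        """Runs on the IO thread; returns the batch it would put on the wire."""
        if self.per_partition:
            batch = list(self.per_partition)
            self.per_partition.clear()
            return batch
        if self.metadata_ready:
            # Nothing to send: the idle IO thread helps with a bounded amount.
            self._compress_and_move("io", limit=1)
        return []
```

The bounded limit in io_poll is one possible way to keep the IO thread from
being tied up in compression for long; the right bound would have to be
measured.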

Also, I have added a new section to analyze how the changes proposed in
this KIP would change the performance of the producer.

Thanks,
Dong


On Fri, Apr 13, 2018 at 11:01 AM, Becket Qin <be...@gmail.com> wrote:

> Thanks for the KIP, Dong.
>
> In the current threading model, compression is done by the user threads,
> therefore the producer sender thread can focus on IO. With the proposed
> changes, does that mean the producer sender thread will have to do all the
> compression as well? Would this become a performance bottleneck?
>
> Thanks,
>
> Jiangjie (Becket) Qin

Re: [DISCUSS] KIP-286: producer.send() should not block on metadata update

Posted by Becket Qin <be...@gmail.com>.
Thanks for the KIP, Dong.

In the current threading model, compression is done by the user threads,
therefore the producer sender thread can focus on IO. With the proposed
changes, does that mean the producer sender thread will have to do all the
compression as well? Would this become a performance bottleneck?

Thanks,

Jiangjie (Becket) Qin

On Thu, Apr 12, 2018 at 1:41 AM, Dong Lin <li...@gmail.com> wrote:

> Hey Ted,
>
> Thanks for your comments. With the proposed solution in the KIP, the memory
> is only allocated once for a given message, which is the same as in the
> existing implementation. The serialized message will be moved from the
> per-topic queue to the per-partition queue without incurring additional
> memory overhead. Does this address your question?
>
> Thanks,
> Dong

Re: [DISCUSS] KIP-286: producer.send() should not block on metadata update

Posted by Dong Lin <li...@gmail.com>.
Hey Ted,

Thanks for your comments. With the proposed solution in the KIP, the memory
is only allocated once for a given message, which is the same as in the
existing implementation. The serialized message will be moved from the
per-topic queue to the per-partition queue without incurring additional memory
overhead. Does this address your question?
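
A small Python toy may help show what "moved without additional memory" means
here. This is a sketch under assumptions, not Kafka code: ToyAccumulator, the
key-sum partitioner, and the "key=value" serialization are hypothetical. The
point is only that the serialized payload is allocated once in append() and
the same object is later handed from one queue to the other.

```python
from collections import defaultdict, deque


class ToyAccumulator:
    """Toy model: one serialized payload per message, moved between queues."""

    def __init__(self):
        self.per_topic = defaultdict(deque)       # topic -> (key, payload)
        self.per_partition = defaultdict(deque)   # (topic, partition) -> payload

    def append(self, topic, key, value):
        # The single allocation for this message: its serialized bytes.
        payload = bytes(key) + b"=" + bytes(value)
        self.per_topic[topic].append((key, payload))
        return payload

    def on_metadata(self, topic, num_partitions):
        """Partition count is now known: move payloads, do not copy them."""
        queue = self.per_topic.pop(topic, deque())
        while queue:
            key, payload = queue.popleft()
            partition = sum(key) % num_partitions   # toy partitioner
            # The same bytes object changes queues; nothing is re-serialized.
            self.per_partition[(topic, partition)].append(payload)
```

For example, after acc.append("t", b"k1", b"v1") followed by
acc.on_metadata("t", 3), the object sitting in the per-partition queue is the
very same bytes object that append() returned.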

Thanks,
Dong


On Wed, Apr 11, 2018 at 9:36 PM, Ted Yu <yu...@gmail.com> wrote:

> Looks like a per-topic queue is introduced.
>
> In terms of memory consumption, how does the KIP allocate memory between
> the per-topic queue and the per-partition queue?
>
> Thanks

Re: [DISCUSS] KIP-286: producer.send() should not block on metadata update

Posted by Ted Yu <yu...@gmail.com>.
Looks like a per-topic queue is introduced.

In terms of memory consumption, how does the KIP allocate memory between
the per-topic queue and the per-partition queue?

Thanks

On Wed, Apr 11, 2018 at 8:50 PM, Dong Lin <li...@gmail.com> wrote:

> Hi all,
>
> I have created KIP-286: producer.send() should not block on metadata
> update. See
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-286%3A+producer.send%28%29+should+not+block+on+metadata+update
> .
>
> The KIP intends to improve the user experience of producer.send() when
> metadata is not available. It is related to, but different from, the previous
> discussion in KAFKA-3539 in that the user still has the option of letting
> producer.send() block when the producer queue is full.
>
> Comments are welcome!
>
> Thanks,
> Dong
>