Posted to users@kafka.apache.org by Jigar Shah <ji...@gmail.com> on 2022/01/17 07:54:56 UTC

Re: Huge latency at consumer side ,testing performance for production and consumption

Hello again,
I ran a few more tests on the producer and consumer, and I observed a
pattern in which the Kafka producer introduces large latency.
Could you please confirm whether my understanding of the producer
protocol is correct?

The configurations are the same as in my first message (quoted below).
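
For reference, the client settings from the first message could be assembled in Java roughly as follows. This is only a sketch using plain java.util.Properties: the bootstrap address, the group id, and the String (de)serializers are placeholders that were not part of the original configuration.

```java
import java.util.Properties;

// Sketch of the client settings listed in the first message of this thread.
// ASSUMPTIONS (not in the original mail): the bootstrap address, the group id,
// and the String (de)serializers are placeholders chosen for illustration.
class LatencyTestConfig {

    static Properties producerProps() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "broker1:9092"); // placeholder address
        p.setProperty("security.protocol", "PLAINTEXT");
        p.setProperty("compression.type", "gzip");
        p.setProperty("acks", "all");
        p.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        p.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return p;
    }

    static Properties consumerProps() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "broker1:9092"); // placeholder address
        p.setProperty("group.id", "latency-test");          // placeholder group id
        p.setProperty("security.protocol", "PLAINTEXT");
        p.setProperty("enable.auto.commit", "true");
        p.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        p.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return p;
    }

    public static void main(String[] args) {
        // Print both property sets so they can be compared with the mail.
        producerProps().forEach((k, v) -> System.out.println("producer: " + k + "=" + v));
        consumerProps().forEach((k, v) -> System.out.println("consumer: " + k + "=" + v));
    }
}
```

With the kafka-clients library on the classpath, these objects can be passed to the KafkaProducer and KafkaConsumer constructors that accept a Properties argument.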

The producer continuously produces messages to a Kafka topic using the
default producer partitioner, so messages go to random partitions of the topic.

The workflow of the protocol, according to my understanding, is:
1. The producer first connects to one broker (1 out of 3) in the cluster
to fetch metadata.
2. If the partition to produce to is located on that same broker, then
   a. Re-use the existing connection to produce messages.
3. Else, if the partition to produce to is located on one of the other
brokers, then
   a. Create a new connection.
   b. Fetch the metadata again.
   c. Produce the message using the new connection.
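
The workflow as described can be captured in a small toy model. This is only a sketch of the assumed behaviour, not Kafka's actual client code: the class name, method names, and millisecond costs are all invented for illustration, and whether step 3.b really happens is exactly the open question.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the workflow described above; it is NOT Kafka's real client code.
// ASSUMPTIONS: the millisecond costs and all names are made up, purely to show
// why only the first send to each broker would be slow under this workflow.
class LazyConnectionModel {
    static final long CONNECT_MS = 50;  // hypothetical cost of step 3.a
    static final long METADATA_MS = 30; // hypothetical cost of step 3.b
    static final long SEND_MS = 5;      // hypothetical cost of one produce request

    private final Set<Integer> connectedBrokers = new HashSet<>();

    LazyConnectionModel(int bootstrapBroker) {
        // Step 1: the connection used for the first metadata fetch already exists.
        connectedBrokers.add(bootstrapBroker);
    }

    /** Returns the modelled latency of one send to the given leader broker. */
    long send(int leaderBroker) {
        long latency = SEND_MS;
        // Set.add returns true only if this broker was not connected before.
        if (connectedBrokers.add(leaderBroker)) {
            latency += CONNECT_MS + METADATA_MS; // steps 3.a and 3.b
        }
        return latency;
    }

    public static void main(String[] args) {
        LazyConnectionModel producer = new LazyConnectionModel(0);
        int[] leaders = {0, 1, 1, 2, 0, 2, 1};
        for (int leader : leaders) {
            System.out.println("broker " + leader + ": " + producer.send(leader) + " ms");
        }
    }
}
```

In this model, with three brokers, at most two sends pay the extra cost, and every later send re-uses an open connection.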

After analysis, I assume the latency is caused by steps *3.a and 3.b* when
the selected partition is on one of the other two brokers. Such peaks are
observed only during the initial part of the test.
[image: image.png]
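
One way to quantify such initial peaks is to summarise the latencies twice: once over all samples and once after skipping the first few. The sketch below assumes the per-message latencies are already collected in an array; the class name, the warm-up count, and the sample values are illustrative only and are not measurements from this thread.

```java
import java.util.Arrays;

// Sketch: summarise latencies over all samples and over the steady state only.
// ASSUMPTIONS: the class name, the warm-up count, and the sample values are
// illustrative; they are not measurements from this thread.
class LatencySummary {
    final long min;
    final long max;
    final double average;

    LatencySummary(long[] latenciesMs) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        this.min = Arrays.stream(latenciesMs).min().getAsLong();
        this.max = Arrays.stream(latenciesMs).max().getAsLong();
        this.average = Arrays.stream(latenciesMs).average().getAsDouble();
    }

    /** Summary of the samples that remain after skipping the first warmUpCount entries. */
    static LatencySummary steadyState(long[] latenciesMs, int warmUpCount) {
        return new LatencySummary(
                Arrays.copyOfRange(latenciesMs, warmUpCount, latenciesMs.length));
    }

    public static void main(String[] args) {
        long[] samples = {2000, 900, 130, 125, 140, 135}; // made-up values, in ms
        LatencySummary all = new LatencySummary(samples);
        LatencySummary steady = steadyState(samples, 2);
        System.out.println("all:    min=" + all.min + " max=" + all.max + " avg=" + all.average);
        System.out.println("steady: min=" + steady.min + " max=" + steady.max + " avg=" + steady.average);
    }
}
```

Reporting both summaries side by side separates the one-off start-up cost from the latency the system sustains afterwards.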
Thank you in advance for your feedback.

*Regards,*
*Jigar*


On Wed, 15 Dec 2021 at 10:53, Jigar Shah <ji...@gmail.com> wrote:

> Hello,
> I agree that the consumer initialization processes take time.
> However, my test already accounts for that: I wait for the consumer to
> be initialized and only then start the producer, to discount the
> initialization delay.
> So, are there any other processes happening during the consumer's poll
> for the first few messages?
>
> Thank you
>
> On Mon, 13 Dec 2021 at 18:33, Luke Chen <sh...@gmail.com> wrote:
>
>> Hi Jigar,
>>
>> As Liam mentioned, those are necessary consumer initialization processes.
>> So, I don't think you can speed it up by altering some timeouts/interval
>> properties.
>> Is there any reason why you need to care about the initial delay?
>> If, like you said, the delay won't happen later on, I think the cost will
>> be amortized.
>>
>>
>> Thank you.
>> Luke
>>
>>
>> On Mon, Dec 13, 2021 at 4:59 PM Jigar Shah <ji...@gmail.com>
>> wrote:
>>
>> > Hello,
>> > To answer your first mail: yes, I am using consumer groups via
>> > group.id; I must have forgotten to include it in the properties I listed.
>> > Also, thank you for the information about the internal processes
>> > involved in creating a KafkaConsumer.
>> > I agree that those steps add latency during initial connection
>> > creation. But can it somehow be optimised (reduced) by altering some
>> > timeout/interval properties? If so, could you please suggest which ones?
>> >
>> > Thank you
>> >
>> > On Mon, 13 Dec 2021 at 12:05, Liam Clarke-Hutchinson <lclarkeh@redhat.com> wrote:
>> >
>> > > I realise that's a silly question, you must be if you're using auto
>> > > commit.
>> > >
>> > > When a consumer starts, it needs to do a few things.
>> > >
>> > > 1) Connect to a bootstrap server
>> > >
>> > > 2) Join an existing consumer group, or create a new one if it doesn't
>> > > exist. This may cause a stop-the-world rebalance as partitions are
>> > > reassigned within the group.
>> > >
>> > > 3) Acquire metadata - which brokers are the leaders for my assigned
>> > > partitions? And which offsets am I consuming from?
>> > >
>> > > 4) Establish the long-lived connections to those brokers.
>> > >
>> > > 5) Send fetch requests
>> > >
>> > > (I might not have the order correct)
>> > >
>> > > So yeah, this is why you're seeing that initial delay before consuming
>> > > records.
>> > >
>> > > Kind regards,
>> > >
>> > > Liam Clarke-Hutchinson
>> > >
>> > > On Mon, 13 Dec 2021, 7:19 pm Liam Clarke-Hutchinson, <lclarkeh@redhat.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I'm assuming you're using consumer groups? E.g., group.id=X
>> > > >
>> > > > Cheers,
>> > > >
>> > > > Liam
>> > > >
>> > > > On Mon, 13 Dec 2021, 6:30 pm Jigar Shah, <ji...@gmail.com> wrote:
>> > > >
>> > > >> Hello,
>> > > >> I am trying to test the latency between message production and
>> > > >> message consumption using the Java Kafka client library *(2.7.2)*.
>> > > >> The cluster consists of 3 Kafka brokers *(2.7.2, Scala 2.13)* and
>> > > >> 3 ZooKeeper nodes *(3.5.9)*.
>> > > >> Here is the pattern I have observed.
>> > > >> Reference:
>> > > >>  ConsumerReadTimeStamp: timestamp when the record is received in the
>> > > >>    Kafka consumer
>> > > >>  ProducerTimeStamp: timestamp added just before calling producer.send
>> > > >>    for the record
>> > > >>  RecordTimeStamp: the CreateTime timestamp inside the record, as
>> > > >>    obtained at the consumer
>> > > >>
>> > > >> [image: kafka1.png]
>> > > >>
>> > > >> *For 100 messages:*
>> > > >>
>> > > >>            ConsumerReadTimeStamp-    ConsumerReadTimeStamp-
>> > > >>            ProducerTimeStamp (ms)    RecordTimeStamp (ms)
>> > > >> Average    252.56                    238.85
>> > > >> Max        2723                      2016
>> > > >> Min        125                       125
>> > > >>
>> > > >>
>> > > >> On the consumer side, the first few messages take too long, but
>> > > >> later on the latency is quite consistent.
>> > > >> I have run the same test for various numbers of messages
>> > > >> (100, 1000, 10000, etc.), and the pattern appears to be the same.
>> > > >> Here are the configurations; most properties are left at their defaults.
>> > > >> Topic:
>> > > >>   partitions=16
>> > > >>   min.insync.replicas=2
>> > > >>   replication.factor=3
>> > > >>
>> > > >>
>> > > >> Consumer:
>> > > >>   security.protocol=PLAINTEXT
>> > > >>   enable.auto.commit=true
>> > > >>
>> > > >>
>> > > >> Producer:
>> > > >>   security.protocol=PLAINTEXT
>> > > >>   compression.type=gzip
>> > > >>   acks=all
>> > > >>
>> > > >>
>> > > >> Is there any reason for the huge latency at the beginning, when a
>> > > >> consumer is created?
>> > > >> Also, could you please suggest a way to optimise the configuration to
>> > > >> get more consistent results?
>> > > >>
>> > > >> Thank you in advance for your feedback.
>> > > >>
>> > > >
>> > >
>> >
>>
>

Re: Huge latency at consumer side ,testing performance for production and consumption

Posted by Jigar Shah <ji...@gmail.com>.
Hello Liam,
Here is the image. I hope it is accessible now.


*Regards,*

*Jigar*


On Fri, 28 Jan 2022 at 15:04, Liam Clarke-Hutchinson <lc...@redhat.com>
wrote:

> Hi Jigar,
>
> Your image attachment didn't come through again.
>
> Thanks,
>
> Liam
>
> On Fri, 28 Jan 2022, 5:35 pm Jigar Shah, <ji...@gmail.com> wrote:
>
> > Hello again,
> > Could someone please provide feedback on these findings?
> > Thank you in advance for your feedback.
> >
> > *Regards,*
> > *Jigar*

Re: Huge latency at consumer side ,testing performance for production and consumption

Posted by Liam Clarke-Hutchinson <lc...@redhat.com>.
Hi Jigar,

Your image attachment didn't come through again.

Thanks,

Liam

On Fri, 28 Jan 2022, 5:35 pm Jigar Shah, <ji...@gmail.com> wrote:

> Hello again,
> Could someone please provide feedback on these findings?
> Thank you in advance for your feedback.
>
> *Regards,*
> *Jigar*

Re: Huge latency at consumer side ,testing performance for production and consumption

Posted by Jigar Shah <ji...@gmail.com>.
Hello again,
Could someone please provide feedback on these findings?
Thank you in advance for your feedback.

*Regards,*
*Jigar*


