Posted to users@kafka.apache.org by "Tauzell, Dave" <Da...@surescripts.com> on 2019/11/08 15:18:29 UTC

Re: [External] AW: Consumer Lags and receive no records anymore

A null key results in the client sending to partitions in a round-robin order.  Use a key if you want to ensure that specific messages end up on the same partition.

-Dave
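
A minimal sketch of the two cases described above, written as a plain simulation rather than against a real Kafka client. The partition count of 2 comes from this thread; the key names and the CRC32 hash are assumptions for illustration (the Java client hashes keys with murmur2, and its round-robin counter does not necessarily start at 0). Either way, each record is written to exactly one partition, never to all of them, so within one consumer group each record is handled by only one consumer.

```python
import itertools
import zlib
from collections import Counter

NUM_PARTITIONS = 2  # the topic in this thread has two partitions


def keyed_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition; the same key always maps to the same partition.

    CRC32 is a stand-in hash chosen because it is in the standard library.
    """
    return zlib.crc32(key) % num_partitions


def round_robin_partitions(num_partitions: int):
    """Yield partition numbers in rotation, as for records without a key."""
    return itertools.cycle(range(num_partitions))


# Keyless records: each record goes to exactly one partition, in rotation.
rr = round_robin_partitions(NUM_PARTITIONS)
keyless = [next(rr) for _ in range(6)]

# Keyed records: every record with the same key lands on the same partition.
# The key names are hypothetical.
keyed = {
    key: {keyed_partition(key, NUM_PARTITIONS) for _ in range(3)}
    for key in (b"order-1", b"order-2", b"order-3")
}

print("keyless:", keyless)  # [0, 1, 0, 1, 0, 1]
print("records per partition:", Counter(keyless))
print("partitions seen per key:", keyed)
```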

On 11/8/19, 1:06 AM, "Oliver Eckle" <ie...@gmx.de> wrote:

    Hi,

    Don’t get me wrong, I just want to understand what's going on.
    So how do I figure out how many partitions are required? Trial and error?
    And as far as I understand, if I have null as key for the record, the record is stored in all partitions.
    Is it then not also processed by each consumer, even if I have more than one consumer?
    So could you explain why the consumer stops getting data?

    Thx

    -----Original Message-----
    From: M. Manna <ma...@gmail.com>
    Sent: Friday, 8 November 2019 00:51
    To: Kafka Users <us...@kafka.apache.org>
    Subject: Re: Consumer Lags and receive no records anymore

    Hi again,

    On Thu, 7 Nov 2019 at 23:40, Oliver Eckle <ie...@gmx.de> wrote:

    > Hi,
    >
    > Slow consumers - that could be the case. But why is that an issue? I
    > am trying to use Kafka for exactly that, and for its ability to recover.
    > E.g. in a burst scenario where a lot of data arrives and has to be
    > processed, a "slow consumer" will be the default case.
    > I could understand that constantly slow consumers would be an issue,
    > e.g. if the topic is compacted or data is deleted before it has been
    > read.
    >
    > This is what I mean by the "lagging topic".
    > The scenario is like that:
    >
    > Producer --- Topic C ---> Consumer --- Processing ---> external REST
    > Endpoint
    >
    > Sending a record to the external REST endpoint takes around 300ms.
    > So in the "burst scenario" I mentioned above, there is maybe a
    > lag of 1000-2000 records.
    > The consumer pulls 500 records and processes them, which means it takes
    > around 150s for the app to process one batch.
    > This could cause some timeouts, I guess ... which is why I am trying
    > to lower max.poll.records to e.g. 50, because then it takes only 15s
    > until the batch is committed.
    >
    > Yeah, having a liveness probe sounds pretty elegant .. I will give
    > that a try ...
    > Anyway, I need to understand why this is happening in order to deal
    > with the scenario the correct way. Killing the consumer after it stops
    > consuming messages seems more like a workaround to me.
    >
    > Regards
    >
    As per your previous replies, if you have 2 partitions on that topic, you can distribute all data between 2 consumers in your consumer group and process it in parallel. But given your data burst case, I would advise you to increase your number of partitions and spread the burst across them. Just like any other tool, Kafka requires a certain level of configuration to achieve what you want. I would recommend you increase both your partitions and your consumers to spread the load.

    Regards,
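
The figures discussed above can be checked with some quick arithmetic. This is only a sketch under stated assumptions: records are processed one after another at the 300ms per record quoted in the thread, the lag is 2000 records, and the poll budget is 300000 ms, the documented default of `max.poll.interval.ms` in the Java consumer (the poster's actual setting is not given). The function names are hypothetical.

```python
import math

REST_CALL_MS = 300               # per-record latency quoted in the thread
MAX_POLL_INTERVAL_MS = 300_000   # assumed: Java consumer default for max.poll.interval.ms


def batch_ms(max_poll_records: int) -> int:
    """Time to process one full poll() batch sequentially."""
    return max_poll_records * REST_CALL_MS


def drain_ms(lag: int, consumers: int, partitions: int) -> int:
    """Time to work off a backlog spread evenly over the active consumers.

    A consumer group cannot have more active consumers than partitions,
    so extra consumers beyond the partition count sit idle.
    """
    active = min(consumers, partitions)
    return math.ceil(lag / active) * REST_CALL_MS


for records in (500, 50):
    t = batch_ms(records)
    print(f"max.poll.records={records}: {t / 1000:.0f}s per batch, "
          f"within poll budget: {t < MAX_POLL_INTERVAL_MS}")

for consumers, partitions in ((1, 2), (2, 2), (4, 2), (4, 4)):
    t = drain_ms(2000, consumers, partitions)
    print(f"consumers={consumers}, partitions={partitions}: "
          f"{t / 1000:.0f}s to drain 2000 records")
```

Under these assumptions a 500-record batch takes 150s and a 50-record batch 15s, both below the assumed default budget, and adding consumers only shortens the drain time while there are partitions left to assign to them.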

    >
    > -----Original Message-----
    > From: M. Manna <ma...@gmail.com>
    > Sent: Friday, 8 November 2019 00:24
    > To: users@kafka.apache.org
    > Subject: Re: Consumer Lags and receive no records anymore
    >
    > Hi,
    >
    > > On 7 Nov 2019, at 22:39, Oliver Eckle <ie...@gmx.de> wrote:
    > >
    > > Have a consumer group with one consumer for the topic .. due to a
    > > misunderstanding I have two partitions on the topic ..
    > > Since no key is set for the records, I think having several
    > > consumers makes no sense, or am I wrong?
    > >
    > I am not sure why that would be an issue. If you have 1 consumer in your
    > consumer group, yes, all the topic partitions will be assigned to that
    > consumer. Slow consumer means your consumers aren’t consuming messages
    > as fast as you are producing them (or not fast enough).
    > > Is there any possibility to work around that?
    > > Because, for example, one lagging topic is put to an external REST
    > > service, which takes around 300ms to be handled.
    > What do you mean by “Lagging topic is put to an external REST service”?
    > > So is lowering the max.poll.records an option?
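    > poll() blocks until at least fetch.min.bytes of records are available
    > or the wait time runs out. Note that since client version 0.10.1
    > heartbeats are sent from a background thread rather than per call of
    > poll(); what matters for slow processing is that poll() is called
    > again within max.poll.interval.ms.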
    > > Anyhow, I could probably not avoid situations like that. Sounds to
    > > me like a pretty common scenario?
    > > So how to deal with them? Having a health check that crashes the app
    > > if no data is appearing anymore?
    > In the K8s world, you can tie this to a liveness probe: if your consumers
    > aren’t live, you may choose to destroy the pod and bring it back up.
    > Provided that your offset commits adhere to your technical
    > requirements, you should be able to recover from the last committed
    > offset. Try that and see how it goes.
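
One way such a liveness check could be sketched: the consumer loop records when it last made progress, and a health endpoint or probe command reports failure once that timestamp is too old. The class name, the 60-second threshold, and the injectable clock are all assumptions for illustration. Note that this only detects a loop that has stopped making progress; a consumer that keeps polling but receives nothing would additionally need a comparison against the group lag.

```python
import time


class ProgressTracker:
    """Remember when the consumer loop last made progress."""

    def __init__(self, max_idle_seconds: float, clock=time.monotonic) -> None:
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._last_progress = clock()

    def mark_progress(self) -> None:
        """Call after each batch has been processed and committed."""
        self._last_progress = self._clock()

    def is_alive(self) -> bool:
        """True while the last recorded progress is recent enough."""
        return self._clock() - self._last_progress <= self._max_idle


# Demonstration with a fake clock so that no real waiting is needed.
now = [0.0]
tracker = ProgressTracker(max_idle_seconds=60, clock=lambda: now[0])

print(tracker.is_alive())   # True: just started
now[0] = 61.0
print(tracker.is_alive())   # False: no progress for more than 60s
tracker.mark_progress()
print(tracker.is_alive())   # True: progress was recorded again
```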
    > >
    > > Regards
    > >
    > > -----Original Message-----
    > > From: M. Manna <ma...@gmail.com>
    > > Sent: Thursday, 7 November 2019 23:35
    > > To: users@kafka.apache.org
    > > Subject: Re: Consumer Lags and receive no records anymore
    > >
    > > Consuming not fast/frequent enough is one of the most common reasons
    > > for it. Have you checked how fast/how many messages you’re churning
    > > out vs. how many consumers you have in the group to handle the
    > > workload?
    > >
    > > Also, what is your partition setup for the consumer groups?
    > >
    > >
    > > Regards,
    > >
    > > On Thu, 7 Nov 2019 at 22:03, Oliver Eckle <ie...@gmx.de> wrote:
    > >
    > >> Using kafka-consumer-groups.sh --bootstrap-server localhost:9092
    > >> --describe --group my-app ..
    > >> I put the output within the logs .. also it's pretty obvious, because
    > >> no data flows anymore
    > >>
    > >> Regards
    > >>
    > >> -----Original Message-----
    > >> From: M. Manna <ma...@gmail.com>
    > >> Sent: Thursday, 7 November 2019 22:10
    > >> To: users@kafka.apache.org
    > >> Subject: Re: Consumer Lags and receive no records anymore
    > >>
    > >> Have you checked your Kafka consumer group status ? How did you
    > >> determine that your consumers are lagging ?
    > >>
    > >> Thanks,
    > >>
    > >> On Thu, 7 Nov 2019 at 20:55, Oliver Eckle <ie...@gmx.de> wrote:
    > >>
    > >>> Hi there,
    > >>>
    > >>>
    > >>>
    > >>> I am seeing some pretty strange behaviour, which I already asked
    > >>> about here: https://stackoverflow.com/q/58650416/7776688
    > >>>
    > >>> As you can see from the logs (https://pastebin.com/yrSytSHD), at a
    > >>> specific point the client stops receiving records.
    > >>>
    > >>> I have a strong suspicion that it relates to performance in
    > >>> handling the records - so that I run into some kind of timeout.
    > >>>
    > >>> What seems strange is that the client does not recover, even though
    > >>> heartbeats are processed successfully.
    > >>>
    > >>> The consumer is even still listed when inspecting the consumer group.
    > >>> Any ideas? The Kafka log has no errors in it.
    > >>>
    > >>> Running a cluster with 3 brokers inside a Kubernetes cluster, using
    > >>> the bitnami helm chart.
    > >>>
    > >>>
    > >>>
    > >>> Kind Regards
    > >>>
    > >>> Oliver
    > >>>
    > >>>
    > >>>
    > >>>
    > >>>
    > >>>
    > >>
    > >>
    > >
    >
    >
    >



This e-mail and any files transmitted with it are confidential, may contain sensitive information, and are intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error, please notify the sender by reply e-mail immediately and destroy all copies of the e-mail and any attachments.

Re: [External] AW: Consumer Lags and receive no records anymore

Posted by "Tauzell, Dave" <Da...@surescripts.com>.
I believe the behavior has changed over time.  There is a way to explicitly set a partitioner, and they provide one: https://github.com/axbaretto/kafka/blob/master/clients/src/main/java/org/apache/kafka/clients/producer/RoundRobinPartitioner.java
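
To illustrate what an explicitly configured round-robin partitioner does, here is a rough Python analogue. It is not the linked Java class, only a sketch of the same idea. The call signature (key bytes, all partitions, available partitions) is an assumption modelled on how Python clients such as kafka-python accept a partitioner callable; in the Java producer the equivalent is selected through the `partitioner.class` setting.

```python
import itertools
from typing import Optional, Sequence


class RoundRobinPartitioner:
    """Pick partitions in strict rotation, ignoring the record key."""

    def __init__(self) -> None:
        self._counter = itertools.count()

    def __call__(self, key_bytes: Optional[bytes],
                 all_partitions: Sequence[int],
                 available_partitions: Sequence[int]) -> int:
        # Prefer partitions that are currently available; fall back to all.
        candidates = available_partitions or all_partitions
        return candidates[next(self._counter) % len(candidates)]


partitioner = RoundRobinPartitioner()
chosen = [partitioner(None, [0, 1, 2], [0, 1, 2]) for _ in range(6)]
print(chosen)  # [0, 1, 2, 0, 1, 2]
```

Because the key is ignored, records with the same key are no longer guaranteed to land on the same partition, so this only suits workloads that do not depend on per-key ordering.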

On 11/10/19, 5:45 AM, "Oliver Eckle" <ie...@gmx.de> wrote:

    Hi Dave,
    
    Thank you. I saw a tutorial that said otherwise .. which confused me a little.
    If it's done round-robin .. my "world view" makes sense again 😊
    
    Oliver
    
    

AW: [External] AW: Consumer Lags and receive no records anymore

Posted by Oliver Eckle <ie...@gmx.de>.
Hi Dave,

Thank you. I saw a tutorial that said otherwise .. which confused me a little.
If it's done round-robin .. my "world view" makes sense again 😊

Oliver

