Posted to users@kafka.apache.org by Jens Rantil <je...@tink.se> on 2016/01/24 14:10:51 UTC

Dealing with diverse consumption speeds

Hi,

How are you dealing with slow consumers in Kafka? In the best of worlds,
each consumer would have the exact same specs and the exact same workload.
But unfortunately that's rarely true: virtual machines share hardware with
other VMs, some Kafka tasks take longer to process, some partition keys
occasionally make the Kafka cluster unbalanced, etc.

From a broader perspective, it might be nice if a consumer group could
occasionally rebalance its consumers based on lag.
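
To make the lag-based rebalancing idea concrete, here is a minimal sketch of
what such an assignment could look like. It is not an existing Kafka feature
or API: the function name assign_by_lag, the input shape (a mapping from
partition to current lag), and the greedy largest-lag-first heuristic are all
assumptions chosen for illustration.

```python
import heapq


def assign_by_lag(partition_lag, consumers):
    """Hypothetical helper: spread partitions so summed lag per consumer is roughly even.

    partition_lag: dict mapping partition id -> current lag (in offsets)
    consumers:     list of consumer names
    """
    # Min-heap of (lag assigned so far, consumer name); the least-loaded
    # consumer is always at the top.
    heap = [(0, c) for c in sorted(consumers)]
    heapq.heapify(heap)
    assignment = {c: [] for c in consumers}

    # Hand out the most-lagging partitions first (greedy heuristic).
    ordered = sorted(partition_lag.items(), key=lambda kv: (-kv[1], kv[0]))
    for partition, lag in ordered:
        total, consumer = heapq.heappop(heap)
        assignment[consumer].append(partition)
        heapq.heappush(heap, (total + lag, consumer))
    return assignment


# Example: four partitions with uneven lag, two consumers.
example = assign_by_lag({0: 50, 1: 10, 2: 10, 3: 30}, ["a", "b"])
```

In this example consumer "a" gets only partition 0 (lag 50) while "b" gets
partitions 3, 1 and 2 (lag 30 + 10 + 10), so both end up with the same total
lag. Lag is only a snapshot, so a real implementation would probably want to
smooth it over time before triggering a rebalance.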

Cheers,
Jens

-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


Re: Dealing with diverse consumption speeds

Posted by Marcos Juarez <mj...@gmail.com>.
I'm also interested in knowing if other people have run into this problem
of different consumption speeds across consumers, and how they've dealt
with it.  I've run into this in 0.7, 0.8, both beta and release, and now
0.9.0.1.  It doesn't seem to be partition-specific, but consumer-specific.
In our case, all 24 partitions in one consumer are lagging by ~50M offsets,
while all 24 partitions in another consumer are fully caught up.  The box
has plenty of capacity available, so the consumers were never starved of
CPU or RAM.  The offsets for the group id keep changing on all partitions,
so none of them are stuck, but the current rate of consumption seems to be
only as fast as the non-lagging consumer.  It's almost as if all lagging
consumers are mimicking the consumption rate of the fully caught up
consumer, and won't go any faster.

What I'm thinking right now to mitigate the issue is adding some context to
consumers (they all live within a single app), so that we'll be able to
pause consumption if lag becomes too high, and let the other consumers
catch up.
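
The pausing idea above could be sketched roughly as follows. This is one
possible reading of it, not a tested solution: the function name
consumers_to_pause, the max_spread threshold, and the assumption that the
application can collect a total lag figure per consumer are all hypothetical.
The sketch only decides who should pause; actually pausing depends on the
client in use (the new Java consumer in 0.9 has pause and resume methods, the
older high-level consumer does not).

```python
def consumers_to_pause(lag_by_consumer, max_spread):
    """Hypothetical helper: pick consumers that are too far ahead of the worst laggard.

    lag_by_consumer: dict mapping consumer name -> total lag (in offsets)
    max_spread:      largest tolerated difference between the most-lagging
                     consumer and any other consumer
    """
    if not lag_by_consumer:
        return set()
    worst = max(lag_by_consumer.values())
    # A consumer is paused when it is more than max_spread offsets ahead
    # of the most-lagging consumer.
    return {
        name
        for name, lag in lag_by_consumer.items()
        if worst - lag > max_spread
    }


# Example shaped like the situation described: one consumer ~50M behind,
# another fully caught up.
paused = consumers_to_pause(
    {"consumer-1": 50_000_000, "consumer-2": 0},
    max_spread=1_000_000,
)
```

Here only "consumer-2" would be paused. Calling the function again on each
lag check and resuming any consumer that is no longer in the returned set
keeps the logic stateless.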

Any thoughts/suggestions on that?

Thanks,

Marcos Juarez
