Posted to users@kafka.apache.org by Jef G <je...@dataminr.com> on 2018/03/07 22:06:45 UTC

reading from multiple topics at the same rate with one KafkaConsumer

Hello friends.

Say I have a bunch of consumer jobs in the same consumer group. They want
to read topics A and B, and they want these topics to be co-partitioned
(the same partition numbers of A and B go to the same job). So each
consumer job creates one KafkaConsumer subscribed to both topics.
Everyone's happy.

Now say these consumer jobs fall behind on topics A and B. There are plenty
of messages available to read in both topics. I believe KafkaConsumer
doesn't make any promises about which topics the messages returned by a
single poll() come from. Is that right? Does that mean that topic A could
"starve out" topic B until A is caught up? Is there any way to guarantee
reading from both topics "at the same rate" (or to give the caller control
over which topic to prioritize per poll)?
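
To make the "caller control" idea concrete, here is a minimal sketch of one
possible approach, not a confirmed answer. KafkaConsumer does expose
pause(Collection<TopicPartition>) and resume(Collection<TopicPartition>),
which stop and restart fetching for specific partitions without triggering a
rebalance. The helper below is hypothetical (the TopicBalancer class, its
method names, and the maxLead tolerance are all made up for illustration),
and it assumes that "same rate" means "similar number of records consumed
per topic", which may not be the right measure. It has no Kafka dependency;
it only decides which topics are too far ahead.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical helper (not part of Kafka): tracks how many records have been
 * consumed per topic and reports which topics have run too far ahead.
 */
final class TopicBalancer {
    // Assumed tolerance: how many records a topic may lead the slowest one by.
    private final long maxLead;
    private final Map<String, Long> consumed = new HashMap<>();

    TopicBalancer(long maxLead, Set<String> topics) {
        this.maxLead = maxLead;
        for (String topic : topics) {
            consumed.put(topic, 0L); // every tracked topic starts at zero
        }
    }

    /** Call after each poll with the number of records received for a topic. */
    void record(String topic, long count) {
        consumed.merge(topic, count, Long::sum);
    }

    /** Topics whose consumed count exceeds the slowest topic's by more than maxLead. */
    Set<String> topicsToPause() {
        Set<String> ahead = new HashSet<>();
        if (consumed.isEmpty()) {
            return ahead;
        }
        long slowest = Long.MAX_VALUE;
        for (long n : consumed.values()) {
            slowest = Math.min(slowest, n);
        }
        for (Map.Entry<String, Long> e : consumed.entrySet()) {
            if (e.getValue() - slowest > maxLead) {
                ahead.add(e.getKey());
            }
        }
        return ahead;
    }
}
```

In a poll loop, the idea would be to call record() for each topic after every
poll, then call consumer.resume(consumer.assignment()) followed by
consumer.pause(...) on the assigned partitions whose topic() is in
topicsToPause(). Whether this interacts well with rebalances is something I
have not verified.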

An alternative is to create a separate KafkaConsumer per topic, but then we
lose co-partitioning unless the caller manages partition assignment itself.
Automatic partition management is a major convenience of Kafka, so having
to re-implement it would be a big loss.

Thanks!

-- 
Jef G
Senior Data Scientist | Dataminr | dataminr.com
6 East 32nd Street Floor 2 | New York, NY 10016
jefg@dataminr.com