Posted to users@kafka.apache.org by Yardena Meymann <ya...@feature.fm> on 2016/07/04 14:50:03 UTC

Question on partitions while consuming multiple topics

Hi,

We have several topics, same number of partitions for each, same key used
for all topics.
We also have several processes consuming the topics (one consumer group).
What we wish would happen is that messages with the same key would end up
consumed by the same process, regardless of the topic.
Can it be achieved with Kafka? What is needed for that?

Thanks in advance,
  Yardena

Re: Question on partitions while consuming multiple topics

Posted by Michael Noll <mi...@confluent.io>.
PS:  The example links I shared previously point to the latest `trunk`
version of Kafka.  If you want to use the latest official release instead
(Kafka 0.10.0.0), which is most probably what you want, then please use the
following links to these examples.  Note the `kafka-0.10.0.0-cp-3.0.0`
branch identifier in the URLs, which stands for the Kafka 0.10.0.0 release
and the Confluent Platform 3.0.0 release.

https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/main/java/io/confluent/examples/streams/PageViewRegionLambdaExample.java

https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/JoinLambdaIntegrationTest.java

-Michael



On Wed, Jul 6, 2016 at 11:49 AM, Michael Noll <mi...@confluent.io> wrote:

> Snehal beat me to it, as my suggestion would have also been to take a look
> at Kafka Streams. :-)  Kafka Streams should be the easiest way to achieve
> what you're describing.  Snehal's links are good starting points.
>
> Further pointers are:
>
>
> https://github.com/confluentinc/examples/blob/master/kafka-streams/src/main/java/io/confluent/examples/streams/PageViewRegionLambdaExample.java
>
>
> https://github.com/confluentinc/examples/blob/master/kafka-streams/src/test/java/io/confluent/examples/streams/JoinLambdaIntegrationTest.java
>
> Both of these examples demonstrate how to work on topics that have the
> same key (here: a user id).
>
> -Michael


-- 
Best regards,
Michael Noll



*Michael G. Noll | Product Manager | Confluent | +1 650.453.5860
Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*

Re: Question on partitions while consuming multiple topics

Posted by Michael Noll <mi...@confluent.io>.
Snehal beat me to it, as my suggestion would have also been to take a look
at Kafka Streams. :-)  Kafka Streams should be the easiest way to achieve
what you're describing.  Snehal's links are good starting points.

Further pointers are:

https://github.com/confluentinc/examples/blob/master/kafka-streams/src/main/java/io/confluent/examples/streams/PageViewRegionLambdaExample.java

https://github.com/confluentinc/examples/blob/master/kafka-streams/src/test/java/io/confluent/examples/streams/JoinLambdaIntegrationTest.java

Both of these examples demonstrate how to work on topics that have the same
key (here: a user id).

-Michael


On Wed, Jul 6, 2016 at 8:44 AM, Snehal Nagmote <na...@gmail.com>
wrote:

> Just an update: as I was reading about Kafka Streams, I found that this
> functionality is supported by default by the Kafka Streams library.
> The following links are really helpful:
>
> http://docs.confluent.io/3.0.0/streams/developer-guide.html#partition-grouper
>
> https://github.com/apache/kafka/blob/0.10.0/streams/src/main/java/org/apache/kafka/streams/processor/DefaultPartitionGrouper.java
>
> (Kafka Streams is available as of Kafka 0.10.0.)
>
> Thanks,
> Snehal

Re: Question on partitions while consuming multiple topics

Posted by Snehal Nagmote <na...@gmail.com>.
Just an update: as I was reading about Kafka Streams, I found that this
functionality is supported by default by the Kafka Streams library.
The following links are really helpful:

http://docs.confluent.io/3.0.0/streams/developer-guide.html#partition-grouper
https://github.com/apache/kafka/blob/0.10.0/streams/src/main/java/org/apache/kafka/streams/processor/DefaultPartitionGrouper.java

(Kafka Streams is available as of Kafka 0.10.0.)
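
The grouping idea behind the default partition grouper can be illustrated
without the library: partition i of every input topic goes into the same
task, so records with the same key (and hence the same partition number) are
processed together. The sketch below is a simplification under the
assumption that all topics are partitioned the same way; the class name,
method name, and topic names are hypothetical, and this is not the code from
DefaultPartitionGrouper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the Kafka Streams API.
final class PartitionGroupingSketch {

    // Puts partition p of every topic into task p.
    // Input: topic name -> number of partitions.
    // Output: task id -> "topic-partition" labels handled by that task.
    static Map<Integer, List<String>> groupByPartitionNumber(Map<String, Integer> partitionCounts) {
        Map<Integer, List<String>> tasks = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : partitionCounts.entrySet()) {
            for (int p = 0; p < entry.getValue(); p++) {
                tasks.computeIfAbsent(p, k -> new ArrayList<>())
                     .add(entry.getKey() + "-" + p);
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        // Assumed example topics, each with three partitions.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("clicks", 3);
        counts.put("views", 3);
        // prints {0=[clicks-0, views-0], 1=[clicks-1, views-1], 2=[clicks-2, views-2]}
        System.out.println(groupByPartitionNumber(counts));
    }
}
```

Each task is then run by exactly one process, which is what gives the
"same key, same process, regardless of topic" behavior.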

Thanks,
Snehal

On 5 July 2016 at 11:47, Snehal Nagmote <na...@gmail.com> wrote:

> Hello Yardena,
>
> You may want to take a look at the section on manual partition assignment
> in this tutorial:
>
> http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client
>
> I have not tried this with multiple topics, but looking at the API, it
> should be doable.
>
> You have to use the same partitioning method that the producer used, so
> that each consumer process can determine the correct partitions to read
> across the topics.
>
> Note that you get no ordering guarantee across topics with this approach,
> since Kafka only guarantees ordering within a partition of a single topic.
>
> Thanks,
> Snehal

Re: Question on partitions while consuming multiple topics

Posted by Snehal Nagmote <na...@gmail.com>.
Hello Yardena,

You may want to take a look at the section on manual partition assignment
in this tutorial:
http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client

I have not tried this with multiple topics, but looking at the API, it
should be doable.

You have to use the same partitioning method that the producer used, so
that each consumer process can determine the correct partitions to read
across the topics.

Note that you get no ordering guarantee across topics with this approach,
since Kafka only guarantees ordering within a partition of a single topic.
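
A minimal sketch of the bookkeeping this approach needs, in plain Java
without the Kafka client: each process works out which partition numbers it
owns and then claims that partition number in every topic. The class,
method, and topic names are hypothetical, the hash in partitionFor is only a
stand-in for whatever partitioner the producers really use, and the
round-robin split between processes is an assumption. In a real consumer the
resulting pairs would be passed as TopicPartition objects to
KafkaConsumer.assign(), which bypasses the group's automatic rebalancing, so
the processes have to agree on their indexes themselves.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the Kafka API.
final class CoPartitionSketch {

    // ASSUMPTION: stand-in for the producer's partitioner. It must be replaced
    // by the same function the producers use. What matters here is that it is
    // deterministic, so one key maps to one partition number in every topic
    // that has the same partition count.
    static int partitionFor(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Partition numbers owned by one process: partition p belongs to
    // process (p % processCount).
    static List<Integer> partitionsOwnedBy(int processIndex, int processCount, int numPartitions) {
        List<Integer> owned = new ArrayList<>();
        for (int p = processIndex; p < numPartitions; p += processCount) {
            owned.add(p);
        }
        return owned;
    }

    // "topic-partition" labels the process should assign itself: every owned
    // partition number, in every topic.
    static List<String> assignmentFor(List<String> topics, int processIndex,
                                      int processCount, int numPartitions) {
        List<String> assignment = new ArrayList<>();
        for (int p : partitionsOwnedBy(processIndex, processCount, numPartitions)) {
            for (String topic : topics) {
                assignment.add(topic + "-" + p);
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Assumed setup: two topics with 6 partitions each, 3 consumer processes.
        List<String> topics = Arrays.asList("clicks", "views");
        // Process 1 of 3 prints [clicks-1, views-1, clicks-4, views-4]
        System.out.println(assignmentFor(topics, 1, 3, 6));
    }
}
```
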

Thanks,
Snehal


On 4 July 2016 at 07:50, Yardena Meymann <ya...@feature.fm> wrote:

> Hi,
>
> We have several topics, same number of partitions for each, same key used
> for all topics.
> We also have several processes consuming the topics (one consumer group).
> What we wish would happen is that messages with the same key would end up
> consumed by the same process, regardless of the topic.
> Can it be achieved with Kafka? What is needed for that?
>
> Thanks in advance,
>   Yardena
>