Posted to users@kafka.apache.org by Zijing Guo <al...@yahoo.com.INVALID> on 2015/02/04 12:43:32 UTC

kafka consumers parallel consuming at consumer level or thread level?

Hi,

I have some questions about how Kafka consumers achieve parallel consumption for one topic. Say topic1 has 2 partitions and I have a consumer group A. Now:

1: If no consumer in consumer group A subscribes to topic1, then no messages will be delivered to this consumer group.
2: If only 1 consumer in consumer group A subscribes to topic1, will this consumer consume all the data from both partitions of topic1?
3: The high-level consumer example https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example says:
   "- if you provide more threads than there are partitions on the topic, some threads will never see a message
    - if you have more partitions than you have threads, some threads will receive data from multiple partitions"

So, back to my example: if I create 2 threads under one consumer instance, will each thread correspond to a specific partition of topic1? How does that differ from creating 2 consumers, each with only 1 thread, in the same consumer group and consuming topic1? Is Kafka's parallel consumption really down to the thread level, whether with 1 consumer instance or with multiple consumer instances that each have 1 thread?

Thanks,
Edwin

Re: kafka consumers parallel consuming at consumer level or thread level?

Posted by Zijing Guo <al...@yahoo.com.INVALID>.
Hi Guozhang,

Thanks for the clarification, it's starting to make sense now.

Edwin


Re: kafka consumers parallel consuming at consumer level or thread level?

Posted by Guozhang Wang <wa...@gmail.com>.
Hi Edwin,

1. Yes.
2. Yes.
3. Yes; and there is no difference in terms of parallelism.

In the new consumer client
(org.apache.kafka.clients.consumer.KafkaConsumer) that is going to be out
in the next release, each consumer instance is single-threaded, i.e. it
will only have one thread for fetching data.
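
To make answer 3 concrete, here is a minimal sketch of how partitions might be split among consumer threads. It is a simplified model of a range-style assignment, not Kafka's actual code: the class name PartitionAssignmentSketch, the method assign, and thread ids such as "consumerA-0" are all hypothetical, and the sketch assumes partitions are numbered from 0 and that thread ids are sorted before the split. Under those assumptions, the unit that gets a partition is the thread, regardless of which consumer instance owns it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not a Kafka class.
public class PartitionAssignmentSketch {

    // Split partitions 0..numPartitions-1 into contiguous ranges, one range
    // per consumer thread. Thread ids are sorted first so that the result
    // does not depend on the order in which the ids are passed in.
    public static Map<String, List<Integer>> assign(int numPartitions, List<String> threadIds) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (threadIds.isEmpty()) {
            return result; // no threads in the group, so nothing is assigned
        }
        List<String> sorted = new ArrayList<>(threadIds);
        Collections.sort(sorted);

        int perThread = numPartitions / sorted.size(); // base share per thread
        int extra = numPartitions % sorted.size();     // first 'extra' threads get one more

        for (int i = 0; i < sorted.size(); i++) {
            int start = perThread * i + Math.min(i, extra);
            int count = perThread + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int p = start; p < start + count; p++) {
                owned.add(p);
            }
            result.put(sorted.get(i), owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // One consumer instance with two threads: one partition per thread.
        System.out.println(assign(2, List.of("consumerA-0", "consumerA-1")));
        // Two consumer instances with one thread each: same split across threads.
        System.out.println(assign(2, List.of("consumerA-0", "consumerB-0")));
        // Three threads, two partitions: the last thread gets an empty list.
        System.out.println(assign(2, List.of("consumerA-0", "consumerA-1", "consumerA-2")));
        // One thread, two partitions: that thread owns both partitions.
        System.out.println(assign(2, List.of("consumerA-0")));
    }
}
```

In this model the first two calls give each thread exactly one partition, which is why the two layouts offer the same parallelism; they differ only in whether the threads share a process.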

-- 
-- Guozhang