Posted to users@kafka.apache.org by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com> on 2016/11/22 19:03:06 UTC

Kafka consumers are not equally distributed

Hi there,

We are running a load test against Kafka at 25 TPS. For the first 9 hours it went fine, with almost 80K messages processed per hour; after that we saw a lot of lag and stopped the incoming load.

Currently about 15K messages per hour are being processed. We have 40 consumer instances with concurrency 4 and 2 topics, each with 160 partitions, so each consumer thread has its own partition.

We found that some of the partitions are sitting idle while others are overloaded, and this is really slowing down message processing.

Why is rebalancing not happening, and why are the existing messages not distributed equally among the instances? We tried restarting the app, but the pace is still the same. Any idea what could be the reason?

Thanks
Achintya


RE: Kafka consumers are not equally distributed

Posted by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com>.
Thank you, Guozhang.

I see a lot of this exception:

org.apache.kafka.clients.consumer.internals.ConsumerCoordinator::Offset commit failed. : TimeoutException: The request timed out.

We are committing offsets manually in async mode, session.timeout.ms is 5 minutes, and the poll timeout is 10 seconds, yet we still see a lot of this exception. Could you please let us know what could be causing it? Here is our consumer configuration:

enable.auto.commit=false
session.timeout.ms=299999
request.timeout.ms=300000
auto.offset.reset=earliest
kafka.consumer.concurreny=40
max.partitions_fetch_bytes=20485760
kafka.consumer.poll.timeout=10000
kafka.consumer.syncCommits=false

Thanks
Achintya


Re: Kafka consumers are not equally distributed

Posted by Guozhang Wang <wa...@gmail.com>.
You can take a look at this FAQ wiki:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified?

And even if you are using the new Java producer, if you specify a key and
the key distribution is not even, then the data will not be evenly
distributed across partitions.

Guozhang
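
To illustrate the point about uneven keys, here is a minimal sketch in Python. It is not Kafka code: zlib.crc32 stands in for the producer's real key hash, and the key names and message counts are made up for the example. It only shows that when most messages share a few keys, a few partitions end up holding most of the data.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 160  # partition count mentioned in the thread


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition by hashing it.

    zlib.crc32 is a stand-in (assumption): the Java producer uses its own
    hash function, but any deterministic hash shows the same effect.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def load_per_partition(keys):
    """Count how many messages land on each partition."""
    return Counter(partition_for(k) for k in keys)


# Hypothetical workload 1: 10,000 messages, each with a distinct key.
even_keys = [f"account-{i}" for i in range(10_000)]
# Hypothetical workload 2: 10,000 messages, 90% of them on 5 "hot" keys.
skewed_keys = [f"hot-{i % 5}" for i in range(9_000)] + [
    f"account-{i}" for i in range(1_000)
]

even_load = load_per_partition(even_keys)
skewed_load = load_per_partition(skewed_keys)

print("distinct keys: busiest partition holds", max(even_load.values()))
print("hot keys:      busiest partition holds", max(skewed_load.values()))
```

Every message with the same key goes to the same partition, so each hot key pins at least 1,800 messages onto a single partition no matter how many partitions or consumers exist.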


RE: Kafka consumers are not equally distributed

Posted by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com>.
So what are our options for getting the messages evenly distributed from this point on? I mean, is there any other way to speed up the consumers?

Thanks
Achintya


Re: Kafka consumers are not equally distributed

Posted by Guozhang Wang <wa...@gmail.com>.
Note that the consumer's fetching parallelism is per partition, i.e., one
partition is fetched by only a single consumer instance within a group, so
even if some partitions have a heavy load, other idle consumers will not
come to share the messages.

If you observed that some partitions have no messages while others have a
lot, then the producing load on the partitions is not evenly distributed.
As I mentioned in my previous reply, it is not a consumer issue but a
producer issue.


Guozhang
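
A rough back-of-the-envelope sketch of why this matters, in Python. The numbers are assumptions for illustration: 500 messages per hour per consumer thread is simply the thread's 80K/hr figure divided by 160 threads, and the skewed backlog is invented. Because a partition has a single reader, the time to drain a backlog is set by the fullest partition, not by the total.

```python
def drain_hours(backlog_per_partition, rate_per_consumer):
    """Hours until every partition is empty when each partition is read by
    exactly one consumer: the fullest partition sets the finish time."""
    return max(backlog_per_partition) / rate_per_consumer


def ideal_drain_hours(backlog_per_partition, rate_per_consumer):
    """Hypothetical best case if idle consumers could share the backlog.

    They cannot (a partition has one reader within a group); this is only
    a baseline for comparison.
    """
    consumers = len(backlog_per_partition)
    return sum(backlog_per_partition) / (consumers * rate_per_consumer)


RATE = 500  # assumed messages/hour per consumer thread (80,000 / 160)

balanced = [500] * 160             # 80,000 messages spread evenly
skewed = [8_000] * 10 + [0] * 150  # the same 80,000 on only 10 partitions

balanced_hours = drain_hours(balanced, RATE)
skewed_hours = drain_hours(skewed, RATE)
shared_hours = ideal_drain_hours(skewed, RATE)

print("balanced backlog drains in", balanced_hours, "hours")
print("skewed backlog drains in  ", skewed_hours, "hours")
print("if sharing were possible: ", shared_hours, "hours")
```

With the skewed backlog, 150 of the 160 threads have nothing to do while 10 threads carry all the work, which matches the "some idle, some overloaded" symptom described in this thread.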


RE: Kafka consumers are not equally distributed

Posted by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com>.
Thank you Guozhang.

Let me clarify: by "some of the partitions are sitting idle and some of are overloaded" I mean that we stopped the load after 9 hours because we saw that messages were being processed very slowly. At that time we observed that some partitions had a lot of messages while others were sitting idle. So my question is why the messages were not shared out when some partitions are overloaded and some have 0 messages. We even restarted the Kafka servers and the application servers, but nothing changed: processing was still very slow and the messages were not redistributed. So we would like to know what we should do in this kind of situation to make the consumers faster.

Thanks
Achintya


Re: Kafka consumers are not equally distributed

Posted by Guozhang Wang <wa...@gmail.com>.
The default partition assignment strategy is range assignment (the
RangeAssignor). Note that it is applied per topic, so if you use the default
assignor then in your case the 160 partitions of each topic will be assigned
to the first 160 consumer instances, each getting two partitions, one from
each topic. So the load should be balanced on a per-consumer-instance basis.

I'm not sure what you meant by "some of the partitions are sitting idle and
some of are overloaded". Do you mean that some partitions do not have new
data coming in while others keep getting so much traffic produced to them
that the consumer cannot keep up? In that case it is not the consumer's
issue, but the producer not producing in a balanced manner.




Guozhang
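
The per-topic range assignment described above can be sketched in a few lines of Python. This is an illustrative re-implementation of the idea, not the Kafka source; the consumer ids and topic names are hypothetical.

```python
def range_assign(consumers, partitions_per_topic):
    """Range-style assignment, computed independently for each topic.

    consumers: list of consumer ids (hypothetical names).
    partitions_per_topic: dict of topic name -> partition count.
    Returns a dict of consumer id -> list of (topic, partition).
    """
    members = sorted(consumers)
    assignment = {m: [] for m in members}
    for topic in sorted(partitions_per_topic):
        count = partitions_per_topic[topic]
        per_consumer, extra = divmod(count, len(members))
        start = 0
        for i, member in enumerate(members):
            # The first `extra` consumers take one additional partition.
            length = per_consumer + (1 if i < extra else 0)
            assignment[member].extend(
                (topic, p) for p in range(start, start + length)
            )
            start += length
    return assignment


topics = {"topic-a": 160, "topic-b": 160}

# 40 instances x concurrency 4 = 160 consumer threads, as in the thread.
threads = [f"consumer-{i:03d}" for i in range(160)]
assignment = range_assign(threads, topics)
print(assignment["consumer-000"])

# With more consumers than partitions, only the first 160 get any work.
oversized = range_assign([f"consumer-{i:03d}" for i in range(200)], topics)
idle = [m for m, parts in oversized.items() if not parts]
print(len(idle), "consumers receive no partitions")
```

Note that the assignment balances the number of partitions per consumer, not the number of messages. If the partitions themselves hold uneven amounts of data, the consumers will be unevenly loaded even though each owns the same number of partitions.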





RE: Kafka consumers are not equally distributed

Posted by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com>.
Java consumer. 0.9.1

Thanks
Achintya

-----Original Message-----
From: Guozhang Wang [mailto:wangguoz@gmail.com] 
Sent: Thursday, November 24, 2016 8:28 PM
To: users@kafka.apache.org
Subject: Re: Kafka consumers are not equally distributed

Which version of Kafka are you using with your consumer? Is it Scala or Java consumers?


Guozhang


On Wed, Nov 23, 2016 at 6:38 AM, Ghosh, Achintya (Contractor) < Achintya_Ghosh@comcast.com> wrote:

> No, that is not the reason. Initially all the partitions were assigned
> messages; some processed theirs very quickly and now sit idle even though
> other partitions still have a lot of messages to be processed.
> So I was under the impression that a rebalance would be triggered and the
> messages would be redistributed equally again.
>
> Thanks
> Achintya
>
> -----Original Message-----
> From: Sharninder [mailto:sharninder@gmail.com]
> Sent: Wednesday, November 23, 2016 12:33 AM
> To: users@kafka.apache.org
> Cc: dev@kafka.apache.org
> Subject: Re: Kafka consumers are not equally distributed
>
> Could it be because of the partition key?
>
> On Wed, Nov 23, 2016 at 12:33 AM, Ghosh, Achintya (Contractor) <
> Achintya_Ghosh@comcast.com> wrote:
>
> > Hi there,
> >
> > We are running a load test on Kafka at 25 tps. For the first 9 hours
> > it went fine, with almost 80K messages processed per hour; after that
> > we saw a lot of lag, so we stopped the incoming load.
> >
> > Currently we see 15K messages per hour being processed. We have 40
> > consumer instances with concurrency 4, and 2 topics with 160
> > partitions each, so each consumer has its own partition.
> >
> > We found that some of the partitions are sitting idle while others are
> > overloaded, and it is really slowing down consumer message processing.
> >
> > Why is rebalancing not happening, and why are the existing messages
> > not distributed equally among the instances? We tried restarting the
> > app, but the pace is still the same. Any idea what could be the reason?
> >
> > Thanks
> > Achintya
> >
> >
>
>
> --
> --
> Sharninder
>



--
-- Guozhang

Re: Kafka consumers are not equally distributed

Posted by Guozhang Wang <wa...@gmail.com>.
Which version of Kafka are you using with your consumer? Is it the Scala or
the Java consumer?


Guozhang


On Wed, Nov 23, 2016 at 6:38 AM, Ghosh, Achintya (Contractor) <
Achintya_Ghosh@comcast.com> wrote:

> No, that is not the reason. Initially all the partitions were assigned
> messages; some of those were processed very quickly and now sit idle,
> even though other partitions still have a lot of messages to process.
> So I was under the impression that a rebalance would be triggered and
> the messages would be redistributed equally again.
>
> Thanks
> Achintya
>
> -----Original Message-----
> From: Sharninder [mailto:sharninder@gmail.com]
> Sent: Wednesday, November 23, 2016 12:33 AM
> To: users@kafka.apache.org
> Cc: dev@kafka.apache.org
> Subject: Re: Kafka consumers are not equally distributed
>
> Could it be because of the partition key?
>
> On Wed, Nov 23, 2016 at 12:33 AM, Ghosh, Achintya (Contractor) <
> Achintya_Ghosh@comcast.com> wrote:
>
> > Hi there,
> >
> > We are running a load test on Kafka at 25 tps. For the first 9 hours
> > it went fine, with almost 80K messages processed per hour; after that
> > we saw a lot of lag, so we stopped the incoming load.
> >
> > Currently we see 15K messages per hour being processed. We have 40
> > consumer instances with concurrency 4, and 2 topics with 160
> > partitions each, so each consumer has its own partition.
> >
> > We found that some of the partitions are sitting idle while others are
> > overloaded, and it is really slowing down consumer message processing.
> >
> > Why is rebalancing not happening, and why are the existing messages
> > not distributed equally among the instances? We tried restarting the
> > app, but the pace is still the same. Any idea what could be the reason?
> >
> > Thanks
> > Achintya
> >
> >
>
>
> --
> --
> Sharninder
>



-- 
-- Guozhang

RE: Kafka consumers are not equally distributed

Posted by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com>.
No, that is not the reason. Initially all the partitions were assigned messages; some of those were processed very quickly and now sit idle, even though other partitions still have a lot of messages to process.
So I was under the impression that a rebalance would be triggered and the messages would be redistributed equally again.
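
For context on the expectation above: a consumer-group rebalance reassigns partitions to consumers; it does not move messages that were already written to one partition onto another. The sketch below is a simplified illustration of that point, not Kafka's assignor code. The function `assign_round_robin`, the eight partitions, and the lag figures are all hypothetical.

```python
def assign_round_robin(partitions, consumers):
    """Spread partitions over consumers by count only (backlog is ignored)."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


# Hypothetical backlog (unconsumed messages) per partition: two hot, rest empty.
lag = {p: 0 for p in range(8)}
lag[2] = 50_000
lag[5] = 30_000

consumers = [f"consumer-{i}" for i in range(8)]
assignment = assign_round_robin(sorted(lag), consumers)

# The backlog a consumer faces is simply the lag of the partitions it owns.
backlog = {c: sum(lag[p] for p in parts) for c, parts in assignment.items()}
idle = [c for c, n in backlog.items() if n == 0]

print("idle consumers:", len(idle), "of", len(consumers))
print("largest backlog on one consumer:", max(backlog.values()))
```

With one consumer per partition, re-running the assignment only changes which consumer owns the hot partitions; the idle consumers cannot help drain them.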

Thanks
Achintya 

-----Original Message-----
From: Sharninder [mailto:sharninder@gmail.com] 
Sent: Wednesday, November 23, 2016 12:33 AM
To: users@kafka.apache.org
Cc: dev@kafka.apache.org
Subject: Re: Kafka consumers are not equally distributed

Could it be because of the partition key?

On Wed, Nov 23, 2016 at 12:33 AM, Ghosh, Achintya (Contractor) < Achintya_Ghosh@comcast.com> wrote:

> Hi there,
>
> We are running a load test on Kafka at 25 tps. For the first 9 hours
> it went fine, with almost 80K messages processed per hour; after that
> we saw a lot of lag, so we stopped the incoming load.
>
> Currently we see 15K messages per hour being processed. We have 40
> consumer instances with concurrency 4, and 2 topics with 160
> partitions each, so each consumer has its own partition.
>
> We found that some of the partitions are sitting idle while others are
> overloaded, and it is really slowing down consumer message processing.
>
> Why is rebalancing not happening, and why are the existing messages
> not distributed equally among the instances? We tried restarting the
> app, but the pace is still the same. Any idea what could be the reason?
>
> Thanks
> Achintya
>
>


--
--
Sharninder

Re: Kafka consumers are not equally distributed

Posted by Sharninder <sh...@gmail.com>.
Could it be because of the partition key?
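
To illustrate why the key matters: a keyed message goes to the partition chosen by hashing its key, so a key with few distinct values can only ever reach a few partitions, however many exist. The sketch below is a minimal illustration under stated assumptions. It uses CRC32 as a stand-in hash (Kafka's Java producer hashes the key bytes with murmur2), and the key names and message counts are hypothetical.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 160  # partitions per topic, as described in the thread


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash.

    CRC32 is a stand-in here; Kafka's Java producer uses murmur2.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_load(keys):
    """Count how many messages land on each partition for the given keys."""
    return Counter(partition_for(k) for k in keys)


# Hypothetical workloads of 10,000 messages each.
many_keys = [f"account-{i}" for i in range(10_000)]    # high-cardinality key
few_keys = [f"region-{i % 5}" for i in range(10_000)]  # only 5 distinct keys

even = partition_load(many_keys)
skewed = partition_load(few_keys)

print("partitions used with many keys:", len(even))
print("partitions used with few keys: ", len(skewed))
print("busiest partition with few keys:", max(skewed.values()), "messages")
```

If the load test reuses a small set of keys, a pattern like the second case would leave most of the 160 partitions idle while a handful carry all the traffic.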

On Wed, Nov 23, 2016 at 12:33 AM, Ghosh, Achintya (Contractor) <
Achintya_Ghosh@comcast.com> wrote:

> Hi there,
>
> We are running a load test on Kafka at 25 tps. For the first 9 hours
> it went fine, with almost 80K messages processed per hour; after that
> we saw a lot of lag, so we stopped the incoming load.
>
> Currently we see 15K messages per hour being processed. We have 40
> consumer instances with concurrency 4, and 2 topics with 160
> partitions each, so each consumer has its own partition.
>
> We found that some of the partitions are sitting idle while others are
> overloaded, and it is really slowing down consumer message processing.
>
> Why is rebalancing not happening, and why are the existing messages
> not distributed equally among the instances? We tried restarting the
> app, but the pace is still the same. Any idea what could be the reason?
>
> Thanks
> Achintya
>
>


-- 
--
Sharninder
