Posted to users@kafka.apache.org by "Pal, Satarupa" <Sa...@intuit.com> on 2018/08/29 05:47:38 UTC

Query related to Kafka Consumer Limit

Hi,

I am from Intuit. We want to use Kafka as a message bus where a single producer publishes messages and 1 million consumers listen for them.


Requirements –


  1.  A single producer, 1 million consumers, and one particular topic carrying the messages.
  2.  A message pushed through the producer should be received by all consumers.
  3.  Consumers can be added or removed at any time.

Queries –


  1.  Can I use a single consumer group for the above requirement?
  2.  Do I need to configure 1 million partitions manually, one for each consumer? Or will Kafka do the load balancing automatically?
  3.  Does a consumer need to subscribe every time it listens?
  4.  Or should a consumer assign itself to the particular topic?
  5.  Can all consumers connect to the same host on port 9092 of ZooKeeper?

I need help finalizing my design. So far I have only done a POC with one topic and one consumer.

Thank you,
Satarupa


Re: Query related to Kafka Consumer Limit

Posted by "Pal, Satarupa" <Sa...@intuit.com>.
Hi Ryanne,

Yes, using a WebSocket/REST API call is another way to achieve this, as you mentioned.
But we wanted to check whether Kafka could also be considered here.

We wanted a PUSH model, where the consumer receives a message the moment it is published, and since Kafka can do load balancing automatically, we thought it would be worth checking.

Could you please let me know what drawbacks you see in using Kafka here rather than a traditional pub-sub model?

Thank you,
Satarupa



Re: Query related to Kafka Consumer Limit

Posted by Ryanne Dolan <ry...@gmail.com>.
Satarupa,

Glad I could help, and thanks for the additional context. It doesn't sound
like this use-case requires real-time notification. Why not just poll a web
service periodically, say every 5 minutes?

If you need something more real-time, I'd suggest using a more traditional
publish-subscribe model, not Kafka. Start with a web service that just
returns the most recent message(s). Then have clients connect to a central
"hub" which broadcasts notifications to all connected clients. That way,
clients can poll the web server periodically and/or connect to the hub to
be notified when the message changes.
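
The poll-or-subscribe arrangement described above could be sketched roughly as follows. This is only an in-memory illustration, not a networked service: the class name `NotificationHub` and its methods are hypothetical, and a real deployment would put `latest()` behind an HTTP endpoint and `connect()` behind something like a WebSocket connection.

```python
class NotificationHub:
    """In-memory stand-in for the web service plus the broadcast hub (hypothetical)."""

    def __init__(self):
        self._latest = None      # most recent message, served to polling clients
        self._listeners = []     # callbacks for currently connected clients

    def latest(self):
        """What the polling endpoint would return."""
        return self._latest

    def connect(self, callback):
        """Register a connected client; returns a function that disconnects it."""
        self._listeners.append(callback)
        return lambda: self._listeners.remove(callback)

    def publish(self, message):
        """Store the message and push it to every connected client."""
        self._latest = message
        for callback in list(self._listeners):
            callback(message)


hub = NotificationHub()
received = []
disconnect = hub.connect(received.append)

hub.publish("critical fix 1 available")
disconnect()                               # client goes offline
hub.publish("critical fix 2 available")    # missed by the push channel

print(received)       # ['critical fix 1 available']
print(hub.latest())   # critical fix 2 available
```

A client that was disconnected when a message was published misses the push but still sees the message the next time it polls `latest()`, and the hub keeps no per-client state beyond the list of live connections.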

Ryanne



Re: Query related to Kafka Consumer Limit

Posted by "Pal, Satarupa" <Sa...@intuit.com>.
Hi Ryanne,

Thank you so much for the detailed explanation.

Here are a couple more questions -

1) The consumers here are not short-lived, but we want each one to listen for the message and then become idle. Is there a way to notify the Kafka server that the message has reached the consumer, so that after that point the server does not preserve the state?
2) If we turn off auto-commit and a client is offline, then the next time the client comes up, the consumer will not get the message, right?
3) What about performance when 1 million consumers are listening to the same topic?

Is this a good design to suggest? To give some background, we want to push a message to all 1 million consumers (each installed application will act as a consumer here) when a hot fix/critical fix is released, so that all our customers are notified to take the update.

Thank you,
Satarupa



Re: Query related to Kafka Consumer Limit

Posted by Ryanne Dolan <ry...@gmail.com>.
Satarupa, it sounds like you are conflating some concepts here. Some
clarifying points:

- Only one consumer in a consumer group receives any given record from a
topic. So in your scenario of 1 million consumers, they could not be
members of the same group. You'd need 1 million consumer "groups" to
achieve this behavior.

- You don't need 1 million partitions, unless you have a consumer group
with 1 million consumers. You could do this with a single partition, since
each consumer group is essentially a group of one.

As hinted at above, one way to sorta achieve this is: have each consumer
use a distinct consumer group, i.e. use a UUID as group id, s.t. each
consumer is in a group of one. Then each consumer will receive every record
in the topic.
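
The group semantics described above can be illustrated with a toy model. This is not Kafka client code: the `deliver` function is a hypothetical simplification for a single-partition topic, in which one member per group (here simply the first name in sorted order) stands in for whichever consumer owns the partition.

```python
from collections import defaultdict
import uuid


def deliver(record, consumers):
    """Toy model of delivery for a single-partition topic (hypothetical helper).

    `consumers` maps a consumer name to its group id. Within each group,
    exactly one member receives the record. Returns the set of consumer
    names that receive it.
    """
    members_by_group = defaultdict(list)
    for name, group_id in consumers.items():
        members_by_group[group_id].append(name)
    return {sorted(members)[0] for members in members_by_group.values()}


# All consumers share one group: only one of them sees the record.
shared = {f"app-{i}": "updates" for i in range(5)}
shared_receivers = deliver("hotfix-1", shared)

# Each consumer has its own UUID group: every one of them sees the record.
distinct = {f"app-{i}": str(uuid.uuid4()) for i in range(5)}
distinct_receivers = deliver("hotfix-1", distinct)

print(len(shared_receivers), len(distinct_receivers))  # prints "1 5"
```

The model deliberately ignores rebalancing and multiple partitions; it only shows why a shared group id cannot broadcast and a per-consumer group id can.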

Kafka stores client state for each consumer -- an architecture which really
isn't designed for millions of consumers. But it sounds like your clients
are ephemeral, so perhaps they don't actually need to preserve state in
Kafka. Maybe turn off auto-commit.
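
As a rough sketch of the settings this implies, the function below builds a consumer configuration with a fresh UUID-based group id and auto-commit disabled. The property names are those used by the Java client's consumer configuration; other client libraries may spell them differently. The function name, the `notify-` prefix, the broker address, and the choice of `auto.offset.reset=latest` are assumptions made for illustration, not something stated in this thread.

```python
import uuid


def broadcast_consumer_config(bootstrap_servers):
    """Build settings for a 'group of one' consumer that stores no offsets in Kafka."""
    return {
        # Assumed broker address, supplied by the caller.
        "bootstrap.servers": bootstrap_servers,
        # A unique group per consumer, so each one receives every record.
        "group.id": f"notify-{uuid.uuid4()}",
        # Do not commit offsets automatically.
        "enable.auto.commit": "false",
        # Assumption: with no committed offset, start from new records only.
        "auto.offset.reset": "latest",
    }


config = broadcast_consumer_config("broker.example.com:9092")  # hypothetical host
print(config["group.id"])
```

With these settings a consumer that is offline when a record is published would not see that record on restart, since nothing tells it where it left off.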

Ryanne
