Posted to users@kafka.apache.org by Harold Nguyen <ha...@nexgate.com> on 2014/12/15 19:29:22 UTC

Kafka design pattern question - multiple user ids

Hello Kafka Experts!

Sorry if this has been answered before - I was hoping for a quick response
to a naive question from a newbie like myself!

If I have multiple users, how do I split the streams so that they
correspond to different user ids?

Suppose I have tens of thousands of user ids that I want to keep track of.
Is there a way to write to Kafka and associate a "key" with each message?
(The key being the user id.) Or is there a better way to do this?

Thanks so much for your time!

Harold

Re: Kafka design pattern question - multiple user ids

Posted by Jayesh Thakrar <j_...@yahoo.com.INVALID>.
Some more things to think about:

- What is the data volume you are dealing with?
- Do you need multiple partitions to support the data/throughput?
- Is each partition to be dedicated to a single user or to a group of users?
- Is the data balanced across all your users, or is it skewed?
- How do you foresee things changing in the future?

It might be worthwhile to consider a composite key - user-id and some other data element (say time, as a naive choice) - and then use hash partitioning.
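
To make the composite-key idea a bit more concrete, here is a minimal sketch in plain Java with no Kafka dependency. The class name, the helper names, the ":" separator, the one-hour time bucket and the partition count of 8 are all assumptions chosen for illustration, and the hash used here is only a stand-in for whatever partitioner you actually configure.

```java
import java.util.concurrent.TimeUnit;

final class CompositeKeySketch {

    // Hypothetical helper: builds a key from the user id and the hour
    // (since the epoch) that the event timestamp falls into.
    public static String compositeKey(String userId, long timestampMillis) {
        long hourBucket = TimeUnit.MILLISECONDS.toHours(timestampMillis);
        return userId + ":" + hourBucket;
    }

    // Hypothetical stand-in for a hash partitioner: masks the sign bit so
    // the result is always in the range [0, numPartitions).
    public static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 8; // assumed partition count
        long now = System.currentTimeMillis();
        String key = compositeKey("user-42", now);
        System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
    }
}
```

The trade-off with a key like this is that one user's messages from different hours may land in different partitions, so per-user ordering only holds within a single time bucket. That helps with skewed users but may not suit consumers that need a user's full history in order.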

Re: Kafka design pattern question - multiple user ids

Posted by Gwen Shapira <gs...@cloudera.com>.
AFAIK, you can have as many keys as you want - but if you are looking
to have a separate partition for each key, you are more limited. I
can't give an exact limit since it depends on multiple factors, but
probably not over 10,000 (and even 1000 for a single topic can be
"pushing it" in some cases).

I recommend hash partitioning on the key: it places multiple user_ids in
one partition while making sure that all messages for a given user go to
the same partition.
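
As a rough illustration of what hash partitioning does, here is a minimal sketch in plain Java with no Kafka dependency. The class and method names, the partition count of 8 and the "user-N" ids are assumptions made up for the example, and Kafka's own partitioner may use a different hash function; the point is only that many keys share a small, fixed set of partitions while each key always maps to the same one.

```java
import java.util.Map;
import java.util.TreeMap;

final class HashPartitionSketch {

    // Hypothetical helper: maps a user id to one of numPartitions partitions.
    // Masking the sign bit keeps the result in the range [0, numPartitions).
    public static int partitionFor(String userId, int numPartitions) {
        return (userId.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 8; // assumed partition count
        int numUsers = 20000;  // "tens of thousands" of user ids

        // Count how many distinct user ids end up in each partition.
        Map<Integer, Integer> usersPerPartition = new TreeMap<>();
        for (int i = 0; i < numUsers; i++) {
            int partition = partitionFor("user-" + i, numPartitions);
            usersPerPartition.merge(partition, 1, Integer::sum);
        }
        System.out.println(usersPerPartition);
    }
}
```

Note that with a scheme like this the mapping depends on the number of partitions, so changing the partition count later would move existing user ids to different partitions.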


Re: Kafka design pattern question - multiple user ids

Posted by Harold Nguyen <ha...@nexgate.com>.
Hi Gwen,

Thanks for the great and fast reply! How many different keys can Kafka
support?

Harold


Re: Kafka design pattern question - multiple user ids

Posted by Gwen Shapira <gs...@cloudera.com>.
When you send messages to Kafka you send a <key,value> pair. The key
can include the user id.

Here's how:

// KeyedMessage takes (topic, key, message); "topic" is the name of the topic to write to
KeyedMessage<String, byte[]> data = new KeyedMessage<String, byte[]>
          (topic, user_id, event);

producer.send(data);

Hope this helps,
Gwen
