Posted to users@kafka.apache.org by Harold Nguyen <ha...@nexgate.com> on 2014/12/15 19:29:22 UTC
Kafka design pattern question - multiple user ids
Hello Kafka Experts!
Sorry if this has been answered before - I was hoping for a quick response
to a naive question from a newbie like myself!
If I have multiple users, how do I split the streams so that they
correspond with different user ids?
Suppose I have tens of thousands of user ids that I want to keep track of.
Is there a way to write to Kafka and associate a "key" with each message? (The key
being the user id?) Or is there a better way to do this?
Thanks so much for your time!
Harold
Re: Kafka design pattern question - multiple user ids
Posted by Jayesh Thakrar <j_...@yahoo.com.INVALID>.
Some more things to think about:
- What is the data volume you are dealing with?
- Do you need multiple partitions to support the data/throughput?
- Do you want each partition to be dedicated to a single user or to a group of users?
- Is the data balanced across all your users, or is it skewed?
- How do you foresee things changing in the future?
It might be worthwhile to consider a composite key - the user id plus some other data element (say time, as a naive choice) - and then use hash partitioning.
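As a rough illustration of the composite-key idea, here is a minimal sketch in plain Java. It assumes an hourly time bucket as the second key element; the class `CompositeKeySketch`, its methods, and the `userId:bucket` key format are hypothetical names invented for this example and are not part of Kafka.

```java
/** Hypothetical helper, not part of Kafka: builds "userId:timeBucket" keys. */
final class CompositeKeySketch {
    private CompositeKeySketch() {}

    /** Combines the user id with the index of the time bucket the timestamp falls in. */
    static String compositeKey(String userId, long timestampMillis, long bucketMillis) {
        if (bucketMillis <= 0) {
            throw new IllegalArgumentException("bucketMillis must be positive");
        }
        long bucket = Math.floorDiv(timestampMillis, bucketMillis);
        return userId + ":" + bucket;
    }

    /** Hashes a key into the range [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        long hourMillis = 60L * 60L * 1000L; // assumed bucket size: one hour
        long now = System.currentTimeMillis();
        String key = compositeKey("user-7", now, hourMillis);
        System.out.println(key + " -> partition " + partitionFor(key, 8));
    }
}
```

The trade-off is that a single user's messages can now land in different partitions from one bucket to the next, which spreads a heavy user's load but means ordering for that user is only preserved within a bucket, since Kafka orders messages per partition.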
Re: Kafka design pattern question - multiple user ids
Posted by Gwen Shapira <gs...@cloudera.com>.
AFAIK, you can have as many keys as you want - but if you are looking
to have a separate partition for each key, you are more limited. I
can't give an exact limit since it depends on multiple factors, but
probably not over 10,000 partitions (and even 1000 for a single topic can be
"pushing it" in some cases).
I recommend using hash partitioning on the key, which places multiple
user_ids in one partition while making sure that all messages for a
given user go to the same partition.
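To make the hash-partitioning point concrete, here is a minimal sketch in plain Java of how a key can be mapped to a fixed number of partitions. The class `UserIdPartitionSketch` and its method are hypothetical names for this illustration; Kafka's own partitioner may use a different hash function, so treat this as the general technique rather than Kafka's exact behaviour.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper, not part of Kafka: maps a user id to one of N partitions. */
final class UserIdPartitionSketch {
    private UserIdPartitionSketch() {}

    static int partitionFor(String userId, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Hash the key bytes; any stable hash works for the purpose of this sketch.
        int hash = Arrays.hashCode(userId.getBytes(StandardCharsets.UTF_8));
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 8; // assumed partition count for the example
        for (String userId : new String[] {"user-1", "user-2", "user-1"}) {
            System.out.println(userId + " -> partition " + partitionFor(userId, numPartitions));
        }
    }
}
```

Because the mapping depends only on the key and the partition count, tens of thousands of user ids can share a handful of partitions while each individual user stays on one partition. Note that changing the partition count changes the mapping.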
Re: Kafka design pattern question - multiple user ids
Posted by Harold Nguyen <ha...@nexgate.com>.
Hi Gwen,
Thanks for the great and fast reply! How many different keys can Kafka
support?
Harold
Re: Kafka design pattern question - multiple user ids
Posted by Gwen Shapira <gs...@cloudera.com>.
When you send messages to Kafka you send a <key,value> pair. The key
can include the user id.
Here's how:
// KeyedMessage takes (topic, key, message); "topic" here is your topic name.
KeyedMessage<String, byte[]> data =
    new KeyedMessage<String, byte[]>(topic, user_id, event);
producer.send(data);
Hope this helps,
Gwen