Posted to users@kafka.apache.org by Joseph Pachod <jo...@gmail.com> on 2015/01/07 00:14:05 UTC

Is it possible to enforce an "unique constraint" through Kafka?

Hi

Having read a lot about Kafka and its use at LinkedIn, I'm still unsure
whether Kafka can be used, with some change of mindset for sure, as a
general-purpose data store.

For example, would someone use Kafka to enforce a "unique constraint"?

A simple use case, taking LinkedIn as an example, is the uniqueness of
users' logins.

What would be your recommended implementation for such a need?

Thanks in advance

Best,
Joseph

Re: Is it possible to enforce an "unique constraint" through Kafka?

Posted by Joseph Pachod <jo...@gmail.com>.
Thanks for your answers.

@Mark
Well, basically we agree. My question was more about finding the limits of
Kafka, which is why I picked uniqueness as a test case. Uniqueness doesn't
imply ACID, yet it's already way more than a stream provides. I was
wondering if some clever trick could make it achievable.

Actually, one point I'm still unsure about is whether LinkedIn stopped
using their Oracle cluster. I guess they kept it, albeit in a different
way, a CQRS one I would say, for the Command part, and properly separated
by domain. But I would love a firm answer on this question...
(hence the uniqueness question, which seems like a must-have feature for
LinkedIn's use cases).

@Todd
Well, compaction doesn't help in reaching uniqueness AFAIK. One can make
sure to have the latest email for a user, but not that a new email wasn't
already used before. Am I wrong somehow?
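
To make the point concrete, here is a minimal sketch in Python. It is a toy
model, not Kafka client code: the `compact` helper, the record layout
(user id as key, email as value) and the sample addresses are all made up
for illustration. It only mimics the "keep the latest entry per key" rule.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (key, value) = (user_id, email).
Record = Tuple[str, str]


def compact(log: List[Record]) -> Dict[str, str]:
    """Toy stand-in for compaction: keep only the latest value per key."""
    latest: Dict[str, str] = {}
    for key, value in log:
        latest[key] = value  # a later record for the same key overwrites
    return latest


log: List[Record] = [
    ("user-1", "a@example.com"),
    ("user-2", "b@example.com"),
    ("user-1", "c@example.com"),  # user-1 changes email
    ("user-2", "c@example.com"),  # user-2 takes the same email; nothing rejects it
]

state = compact(log)
emails = list(state.values())

# Uniqueness is per key only: two keys can end up with the same value.
has_duplicates = len(emails) != len(set(emails))
print(state)
print(has_duplicates)

# The older addresses are gone too, so "was this email ever used before?"
# cannot be answered from the compacted view.
print("a@example.com" in emails)
```

In this sketch both users end up with the same email, and the earlier
addresses are no longer visible, which are exactly the two gaps I mean.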

On Wed, Jan 7, 2015 at 3:20 AM, Todd Hughes <ju...@hotmail.com> wrote:

> Log compaction, though, allows it to work as a data store quite well for
> some use cases. It's exactly why I started looking hard at Kafka lately.
>
> "The general idea is quite simple. Rather than maintaining only recent
> log entries in the log and throwing away old log segments we maintain
> the most recent entry for each unique key. This ensures that the log
> contains a complete dataset and can be used for reloading key-based
> state."
> https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction
>

RE: Is it possible to enforce an "unique constraint" through Kafka?

Posted by Todd Hughes <ju...@hotmail.com>.
Log compaction, though, allows it to work as a data store quite well for some use cases. It's exactly why I started looking hard at Kafka lately.

"The general idea is quite simple. Rather than maintaining only recent 
log entries in the log and throwing away old log segments we maintain 
the most recent entry for each unique key. This ensures that the log 
contains a complete dataset and can be used for reloading key-based 
state."
https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction
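
As a rough illustration of "reloading key-based state", here is a small
Python sketch. It is a toy model under stated assumptions, not the Kafka
API: `replay` and `compact` are hypothetical helpers, records are plain
(key, value) tuples, and a value of None stands in for a delete marker.
Real compaction runs in the background on older log segments, so treat
this only as a picture of the end result.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: (key, value); None marks a delete.
Record = Tuple[str, Optional[str]]


def replay(log: List[Record]) -> Dict[str, str]:
    """Rebuild key-based state by applying records in log order."""
    state: Dict[str, str] = {}
    for key, value in log:
        if value is None:
            state.pop(key, None)  # delete marker removes the key
        else:
            state[key] = value
    return state


def compact(log: List[Record]) -> List[Record]:
    """Keep only the last record for each key, preserving log order."""
    last_index = {key: i for i, (key, _) in enumerate(log)}
    return [rec for i, rec in enumerate(log) if last_index[rec[0]] == i]


full_log: List[Record] = [
    ("k1", "v1"),
    ("k2", "v1"),
    ("k1", "v2"),  # newer value for k1
    ("k2", None),  # k2 deleted
    ("k3", "v1"),
]

compacted = compact(full_log)
print(compacted)

# Replaying the shorter, compacted log yields the same final state.
print(replay(full_log) == replay(compacted))
```

The compacted log is shorter than the full one, yet a consumer replaying it
from the start ends up with the same state, which is what makes it usable
for restoring a key-value view.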

> Date: Tue, 6 Jan 2015 16:34:06 -0800
> Subject: Re: Is it possible to enforce an "unique constraint" through Kafka?
> From: wizzat@gmail.com
> To: users@kafka.apache.org
> 
> Kafka is more of a message queue than a data store. You can use it to store
> history of the queue (certainly a powerful use case for disaster recovery),
> but it's still not really a data store.
> 
> From the Kafka website (kafka.apache.org):
> Apache Kafka is a publish-subscribe messaging [queue] rethought as a
> distributed commit log.
> 
> -Mark
> 

Re: Is it possible to enforce an "unique constraint" through Kafka?

Posted by Mark Roberts <wi...@gmail.com>.
Kafka is more of a message queue than a data store. You can use it to store
history of the queue (certainly a powerful use case for disaster recovery),
but it's still not really a data store.

From the Kafka website (kafka.apache.org):
Apache Kafka is a publish-subscribe messaging [queue] rethought as a
distributed commit log.

-Mark
