Posted to users@kafka.apache.org by howard chen <ho...@gmail.com> on 2012/10/08 18:08:57 UTC

What are the most significant differences between kestrel and kafka?

Hi,

Someone asked in Quora
(http://www.quora.com/Apache-Kafka/What-are-the-most-significant-differences-between-kestrel-and-kafka)
and I found this question particularly interesting, since not many
people have experience dealing with these systems.

My understanding is as follows, so please correct me if I am wrong:

Same:
- Both are durable, and queue size is limited by disk storage instead of memory
- They are both very fast

Difference:

- No strict ordering in Kestrel
- Kestrel is not transactional
- No unique delivery guarantee in Kestrel
- No ZooKeeper coordination in Kestrel

So it seems to me that Kafka is a more complete solution for message
streaming, while Kestrel is a simple implementation of a message queue.
Am I right?

Anything to add?
Thanks.

Re: What are the most significant differences between kestrel and kafka?

Posted by Jay Kreps <ja...@gmail.com>.
I am not very knowledgeable about Kestrel, but here are the differences I
am aware of:
1. Kafka keeps a single log for any number of subscribers. I think in
Kestrel you need to store each message once per subscriber. This makes
Kafka consumers very cheap (basically just the bandwidth usage). We
take advantage of this even for "queue" processing to add monitoring
consumers, do an occasional ad hoc "tail -f" on the queue, and other things
like that.
2. Kestrel does not depend on ZooKeeper, which means it is operationally
less complex if you don't already have a ZooKeeper installation.
3. As Evan says, Kafka has significantly better throughput.
4. Kestrel doesn't support ordered consumption.
5. As Evan says, all messages in Kafka use consumer acknowledgement.
Kestrel supports this but it is a more expensive option, and the default is
"at most once" delivery.
6. I think the consumer model for Kestrel requires either manually mapping
consumers to specific servers, or randomly load balancing consumers over
servers. Manual configuration is problematic because of the lack of
elasticity: if a server fails some consumers will be underutilized, and if
a consumer fails some server may be unconsumed. This may lead to having
to over-provision your consumers so you can still keep up in the case where
a consumer fails. Random load balancing has the problem that, due to bad
luck, some server may have no consumers for a period of time, leading to
poor end-to-end latency and potential problems if the server doesn't handle
a large backlog well.
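
To put a rough number on the "bad luck" in point 6, here is a small
Python sketch. It assumes a simplified model that is not taken from
Kestrel itself: each consumer connects to exactly one server, chosen
uniformly at random and independently of the others. The function names
are made up for illustration.

```python
import random


def prob_server_unconsumed(num_servers: int, num_consumers: int) -> float:
    """Chance that one particular server is picked by no consumer.

    Assumes each consumer independently picks one server uniformly at random.
    """
    return (1 - 1 / num_servers) ** num_consumers


def simulate(num_servers: int, num_consumers: int,
             trials: int = 20000, seed: int = 0) -> float:
    """Estimate the same probability by repeated random assignment."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # The set of servers that received at least one consumer this round.
        picked = {rng.randrange(num_servers) for _ in range(num_consumers)}
        if 0 not in picked:  # server 0 stands in for "one particular server"
            misses += 1
    return misses / trials


if __name__ == "__main__":
    for consumers in (4, 8, 16):
        exact = prob_server_unconsumed(4, consumers)
        estimate = simulate(4, consumers)
        print(f"4 servers, {consumers} consumers: "
              f"exact={exact:.4f} simulated={estimate:.4f}")
```

Under this model the chance of an idle server only drops as you add many
more consumers than servers, which is the over-provisioning cost
mentioned above.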

Cheers,

-Jay


Re: What are the most significant differences between kestrel and kafka?

Posted by Evan Chan <ev...@ooyala.com>.
Howard,

The Kafka model is very different from the Kestrel model, at least as far
as consumers are concerned.

Kestrel is like AMQP / RabbitMQ in that it keeps track of consumer state:
you can individually check out messages, and Kestrel will only remove a
message from the queue when you "commit" it. Keeping track of this state
slows Kestrel down significantly, though.
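
As a rough illustration of the checkout/commit model, here is a toy
in-memory queue in Python. This is only a sketch of the idea, not
Kestrel's actual API or implementation; the class and method names
(CheckoutQueue, checkout, commit, abort) are made up. The point is the
extra per-message state the server has to hold while a message is
checked out.

```python
from collections import deque


class CheckoutQueue:
    """Toy queue where the server tracks which messages are checked out."""

    def __init__(self):
        self._ready = deque()   # messages waiting to be handed out
        self._pending = {}      # ticket -> message checked out, not yet committed
        self._next_ticket = 0

    def put(self, message):
        self._ready.append(message)

    def checkout(self):
        """Hand out the next message but keep it until it is committed."""
        if not self._ready:
            return None
        message = self._ready.popleft()
        ticket = self._next_ticket
        self._next_ticket += 1
        self._pending[ticket] = message
        return ticket, message

    def commit(self, ticket):
        """The consumer is done: only now is the message really removed."""
        del self._pending[ticket]

    def abort(self, ticket):
        """The consumer gave up: put the message back at the head of the queue."""
        self._ready.appendleft(self._pending.pop(ticket))

    def pending_count(self):
        return len(self._pending)


if __name__ == "__main__":
    q = CheckoutQueue()
    q.put("a")
    q.put("b")
    ticket, message = q.checkout()
    q.abort(ticket)                 # "a" goes back to the front
    ticket, message = q.checkout()
    q.commit(ticket)                # "a" is now gone for good
    print(message, q.pending_count())
```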

Kafka does not keep track of consumer state.  Each consumer keeps track of
an "offset" on its own, and load balancing and offset rollbacks are done
completely in the consumer. Kafka doesn't support RabbitMQ-style message
checkout and commit in the server, although you could in theory implement
such a consumer yourself. The benefit, though, is that Kafka has very
little state to track and is wicked fast.
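
A minimal Python sketch of the offset idea, for comparison. This is not
the Kafka client API; Log, OffsetConsumer, poll and rewind are
hypothetical names, and the log is just an in-memory list. The server
side only appends and serves reads by position, while the consumer owns
its offset and can roll it back to reprocess messages.

```python
class Log:
    """Append-only log; it knows nothing about who has read what."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)
        return len(self._messages) - 1   # offset of the appended message

    def read(self, offset, max_messages):
        return self._messages[offset:offset + max_messages]


class OffsetConsumer:
    """Consumer that tracks its own position in the log."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self, max_messages=10):
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)        # advance only the local offset
        return batch

    def rewind(self, offset):
        """Roll back to an earlier offset, e.g. after a failed batch."""
        self.offset = offset


if __name__ == "__main__":
    log = Log()
    for m in ("a", "b", "c"):
        log.append(m)
    consumer = OffsetConsumer(log)
    start = consumer.offset
    print(consumer.poll(2))
    consumer.rewind(start)               # pretend processing failed
    print(consumer.poll(2))              # the same two messages again
```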

In production, Kafka throughput easily trumps Kestrel throughput when you
use the checkout/commit features.

-Evan






Re: What are the most significant differences between kestrel and kafka?

Posted by Felix GV <fe...@mate1inc.com>.
Rumor has it that Twitter runs a Kafka cluster of just four nodes that can
ingest all of the fire hose, all of their clicks and other smaller things
;) ...

--
Felix



On Wed, Oct 10, 2012 at 1:07 AM, howard chen <ho...@gmail.com> wrote:

> Hi,
>
> On Wed, Oct 10, 2012 at 1:33 AM, Jun Rao <ju...@gmail.com> wrote:
> > One of the key differences is that Kafka supports multi-subscription
> > while Kestrel does not. This is the primary reason why Storm is
> > integrated with Kafka.
> >
>
> Good point!
>
> It is interesting that Kestrel and Storm are both from Twitter, so now
> Twitter also uses Kafka?
>
> :)
>

Re: What are the most significant differences between kestrel and kafka?

Posted by howard chen <ho...@gmail.com>.
Hi,

On Wed, Oct 10, 2012 at 1:33 AM, Jun Rao <ju...@gmail.com> wrote:
> One of the key differences is that Kafka supports multi-subscription while
> Kestrel does not. This is the primary reason why Storm is integrated with
> Kafka.
>

Good point!

It is interesting that Kestrel and Storm are both from Twitter, so now
Twitter also uses Kafka?

:)

Re: What are the most significant differences between kestrel and kafka?

Posted by Jun Rao <ju...@gmail.com>.
One of the key differences is that Kafka supports multi-subscription while
Kestrel does not. This is the primary reason why Storm is integrated with
Kafka.
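
To illustrate what multi-subscription buys you, here is a toy comparison
in Python. Both classes are simplified models with hypothetical names,
not the real systems: FanoutQueues stores one copy of each message per
subscriber, while SharedLog stores each message once and keeps only an
integer offset per subscriber.

```python
class FanoutQueues:
    """One queue per subscriber: each message is stored once per subscriber."""

    def __init__(self, subscribers):
        self.queues = {name: [] for name in subscribers}

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)

    def stored_copies(self):
        return sum(len(queue) for queue in self.queues.values())


class SharedLog:
    """One log for all subscribers: each message is stored exactly once."""

    def __init__(self, subscribers):
        self.messages = []
        self.offsets = {name: 0 for name in subscribers}

    def publish(self, message):
        self.messages.append(message)

    def fetch(self, name):
        """Return everything this subscriber has not seen yet."""
        new = self.messages[self.offsets[name]:]
        self.offsets[name] = len(self.messages)
        return new

    def stored_copies(self):
        return len(self.messages)


if __name__ == "__main__":
    subscribers = ["storm", "monitoring"]
    fanout, log = FanoutQueues(subscribers), SharedLog(subscribers)
    for m in ("a", "b", "c"):
        fanout.publish(m)
        log.publish(m)
    print(fanout.stored_copies(), log.stored_copies())
    print(log.fetch("storm"), log.fetch("monitoring"))
```

In the shared-log model, adding another subscriber costs one more offset
rather than another copy of every message.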

Thanks,

Jun
