Posted to users@kafka.apache.org by Taylor Gautier <tg...@tagged.com> on 2012/01/10 22:49:27 UTC

Second Tagged use case is going live this week

Happy New Year all.

Just to let you know - Tagged Group Chat is 50% live today and should be
100% in a day or two.

Kafka is going strong for us!

Re: Second Tagged use case is going live this week

Posted by Taylor Gautier <tg...@tagged.com>.
We've just implemented a log cleaner feature that we are currently testing.
It cleans up entire topic directories on the fly.  That's a bit unorthodox
for the Kafka philosophy, but we have lots of ephemeral topics that we need
to clear out.  Just for the curious - we've found that with EXT3 you can get
to about 20k topics before throughput takes a nosedive (another possible
feature here is a tree structure to hold the topics so we don't run into
files-per-directory limits).
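
To illustrate the tree-structure idea, here is a minimal sketch (not
Tagged's or Kafka's actual layout). It assumes topic directories are
bucketed by a hash of the topic name; the function name, the use of MD5,
and the two-level layout are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sharded_topic_dir(log_root, topic, levels=2, width=2):
    """Return a nested directory path for `topic` under `log_root`.

    Hypothetical layout: the first `levels` groups of `width` hex characters
    from the MD5 digest of the topic name become intermediate directories,
    so no single directory has to hold every topic.
    """
    digest = hashlib.md5(topic.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(log_root).joinpath(*parts, topic)
```

With the defaults, each intermediate level holds at most 256 entries, and
topics are spread over 65,536 leaf buckets, so 20k topics would average
well under one topic per bucket.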

I had previously worked on a feature that Jun incorporated which doesn't
create topics on read.  This is useful, but the on-the-fly cleaner is
better, because we have lots of topics with 1 or 10 msgs total that have to
disappear.  So while there is lots of read traffic for topics that never
receive messages, there's also a lot of traffic for topics that get some
messages, and those have to be written to disk.  But later those topics
need to disappear - so the cleaner feature is actually more useful than the
no-create-on-read feature.
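
As a rough illustration of what cleaning up entire topic directories
could look like, here is a standalone sketch. It is not the cleaner
described above. It assumes one directory per topic directly under a log
root and uses "no file modified within N seconds" as the idle test; the
function name and that idle rule are assumptions. It also ignores
coordination with a running broker that may still hold the files open.

```python
import shutil
import time
from pathlib import Path


def clean_idle_topics(log_root, max_idle_seconds, now=None):
    """Delete topic directories whose newest file is older than the cutoff.

    Returns the sorted names of the directories that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for topic_dir in Path(log_root).iterdir():
        if not topic_dir.is_dir():
            continue
        # Newest modification time among the directory itself and its files.
        mtimes = [topic_dir.stat().st_mtime]
        mtimes += [f.stat().st_mtime for f in topic_dir.iterdir() if f.is_file()]
        if now - max(mtimes) > max_idle_seconds:
            shutil.rmtree(topic_dir)
            removed.append(topic_dir.name)
    return sorted(removed)
```

Calling clean_idle_topics(root, 600) on a timer would drop every topic
directory that has been untouched for ten minutes.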

Keep in mind these issues are more pertinent to our other use case where
every user on the site has one or more "topics" that we write change events
to.

The next item on the list that will help us quite a bit is long-polling.
Right now we're generating loads of unnecessary traffic because we are
polling large numbers of topics that have no traffic.  On the flip side, as
we start getting more messages on those topics, the overall work done will
probably only increase marginally.
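
For readers unfamiliar with the idea, long-polling can be sketched as a
fetch that blocks until data arrives or a timeout expires, instead of
returning an empty result immediately. The class below is a hypothetical
in-memory stand-in for one topic, not Kafka's API; the names TopicLog,
append, and fetch are assumptions, and offsets are simple list indexes
here for brevity.

```python
import threading


class TopicLog:
    """In-memory stand-in for one topic's log (hypothetical, for illustration)."""

    def __init__(self):
        self._messages = []
        self._cond = threading.Condition()

    def append(self, message):
        with self._cond:
            self._messages.append(message)
            self._cond.notify_all()  # wake any fetches waiting on this topic

    def fetch(self, offset, max_wait_seconds):
        """Return messages from `offset`, waiting up to the timeout if none exist yet."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._messages) > offset,
                                timeout=max_wait_seconds)
            return list(self._messages[offset:])
```

A consumer of an idle topic then makes one request per timeout period
rather than one per polling interval, and a consumer of a busy topic gets
its messages as soon as they are appended.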

Anyway, those are the burning issues.  Eventually we are going to be
archiving (Hadoop), supporting more sophisticated partitioning, and tackling
replication.

And hopefully we'll start pumping logs and other internal data through
Kafka.  Sounds silly but we started with the non-critical use cases first -
site-wide event bus and chat messages - before adopting Kafka more fully.


On Tue, Jan 10, 2012 at 2:02 PM, Jay Kreps <ja...@gmail.com> wrote:

> Great news! Thanks for the update. Let us know how we can help you guys out
> as things progress.
>
> -Jay
>
> On Tue, Jan 10, 2012 at 1:49 PM, Taylor Gautier <tg...@tagged.com>
> wrote:
>
> > Happy New Year all.
> >
> > Just to let you know - Tagged Group Chat is 50% live today and should be
> > 100% in a day or two.
> >
> > Kafka is going strong for us!
> >
>

Re: Second Tagged use case is going live this week

Posted by Jay Kreps <ja...@gmail.com>.
Great news! Thanks for the update. Let us know how we can help you guys out
as things progress.

-Jay

On Tue, Jan 10, 2012 at 1:49 PM, Taylor Gautier <tg...@tagged.com> wrote:

> Happy New Year all.
>
> Just to let you know - Tagged Group Chat is 50% live today and should be
> 100% in a day or two.
>
> Kafka is going strong for us!
>