Posted to users@kafka.apache.org by Parviz deyhim <de...@gmail.com> on 2012/10/05 05:43:51 UTC

Re: S3 Consumer for Super Duper Blog Post!

Wanted to see if I can resurrect this thread. I'm looking for anyone
who's running Kafka on AWS. An S3 consumer for Kafka is particularly
interesting.

Any help would be truly appreciated.

>>it does not require your Hadoop cluster to be permanent. Like any MR job
>>that outputs to S3, once the data is in S3 it is there for good (unless you
>>explicitly delete it).

On Sun, Aug 19, 2012 at 10:10 AM, Russell Jurney
<ru...@gmail.com> wrote:

> Thanks, I'll check this out.
>
> On Sun, Aug 19, 2012 at 7:09 AM, Russell Jurney <russell.jurney@gmail.com> wrote:
>
> > Thanks for your response, and glad to hear you need this as well and are
> > working on it.
> >
> > Does using an s3n:// file path require that you have a Hadoop cluster
> > running? I use S3 and EMR, so my Hadoop clusters are temporary. I do use
> > Hadoop with S3 to consume the data Kafka produces, so I am fine with Hadoop
> > as a dependency at the library level, but not if a cluster must persist
> > for the Kafka S3 consumer to work.
> >
> >
> > On Sat, Aug 18, 2012 at 9:20 AM, Matthew Rathbone <matthew@foursquare.com> wrote:
> >
> >> Hey Russell,
> >>
> >> We're actually about to start work on this exact thing here at Foursquare
> >> as we're about to start prototyping Kafka to replace our aging log
> >> infrastructure.
> >>
> >> We'd planned on just using the hadoop-consumer, but setting the output
> >> directory to an s3n:// file path.
> >>
> >> I'm assuming that you want to build a consumer that operates outside of
> >> hadoop?
> >>
> >>
> >>
> >> On Sat, Aug 18, 2012 at 12:49 AM, Russell Jurney
> >> <ru...@gmail.com> wrote:
> >>
> >> > Ok, this is the last time I'm gonna beg for an S3 sink for Kafka. I'm
> >> > not trolling, and this is Your Big Chance to help!
> >> >
> >> > I'm gonna blog about using Whirr to boot Zookeeper and then to boot
> >> > Kafka in the cloud and then create events in an application that get
> >> > sunk to Amazon S3, where they will be processed by
> >> > Pig/Hadoop/ElasticMapReduce, mined into gems and republished in some
> >> > esoteric NoSQL DB and then served in the very app that generated the
> >> > events in the first place.
> >> >
> >> > So, if someone else doesn't contribute an S3 consumer for Kafka in the
> >> > next month or so... so help me Bob, I'm gonna write it myself. Now,
> >> > some of you may not know me, but I am the 3rd best software engineer
> >> > in the world:
> >> >
> >> > http://www.quora.com/Who-are-some-of-the-best-software-engineers-alive
> >> >
> >> > Those of you that have seen my code, however, are aware that as a
> >> > programmer, I am substandard. There's a gene that imparts exception
> >> > handling and algorithms, and it's missing from my genome.
> >> >
> >> > So let me be clear: you don't want me to write the S3 sink. A Kafka
> >> > committer or someone with a real job should write the S3 sink. As soon
> >> > as that thing is written and my blog post goes out, Kafka use will
> >> > spike and you'll all be famous.
> >> >
> >> > So this is a direct threat: I am writing an S3 consumer for Kafka
> >> > unless one of you steps up. And you will rue the day that piece of
> >> > crap ships.
> >> >
> >> > In return for your contribution, you will be named in my blog post as
> >> > open source citizen of the month, to be accompanied by a commemorative
> >> > plaque with a pixelated photo of me.
> >> >
> >> > Yours truly,
> >> >
> >> > Russell Jurney http://datasyndrome.com
> >> >
> >>
> >>
> >>
> >> --
> >> Matthew Rathbone
> >> Foursquare | Software Engineer | Server Engineering Team
> >> matthew@foursquare.com | @rathboma <http://twitter.com/rathboma> |
> >> 4sq<http://foursquare.com/rathboma>
> >>
> >
> >
> >
> > --
> > Russell Jurney twitter.com/rjurney russell.jurney@gmail.com
> > datasyndrome.com
> >
>
>
>
> --
> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com
> datasyndrome.com
>



-- 
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
matthew@foursquare.com | @rathboma <http://twitter.com/rathboma> |
4sq<http://foursquare.com/rathboma>