Posted to users@kafka.apache.org by Jibran Saithi <ji...@qubitdigital.com> on 2013/01/31 10:29:02 UTC

S3 Archiving for Kafka topics (with Zookeeper resume)

Hey,

I know this has come up a few times, so thought I'd share a bit of code
we've been using to archive topics to S3.

Particularly unimaginatively named, but is available here:
https://github.com/jibs/kafka-s3-consumer

We needed something with Zookeeper support for storing the offsets, but
didn't come across anything so I quickly put this together. For the moment
I've removed graphite stats reporting because it has a few internal
dependencies, but plan to sort that out soon.
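
A minimal sketch of the resume idea described above, not the code in the
linked repo: the archiver loads the last committed offset, uploads the next
chunk, and commits the new offset only after the upload succeeds. Every name
here (OffsetStore, archive_once, the /archiver/... key layout, the "events"
topic) is hypothetical. An in-memory dict stands in for Zookeeper, a list
stands in for S3, and offsets are treated as plain list indices.

```python
from typing import Dict, List, Tuple


class OffsetStore:
    """In-memory stand-in for Zookeeper; keys mimic a per-partition node path."""

    def __init__(self) -> None:
        self._nodes: Dict[str, int] = {}

    def load(self, topic: str, partition: int) -> int:
        # No stored offset yet means we start from the beginning of the log.
        return self._nodes.get(f"/archiver/{topic}/{partition}", 0)

    def commit(self, topic: str, partition: int, offset: int) -> None:
        self._nodes[f"/archiver/{topic}/{partition}"] = offset


def archive_once(
    log: List[bytes],
    store: OffsetStore,
    uploaded: List[Tuple[str, bytes]],
    topic: str = "events",
    partition: int = 0,
    chunk_size: int = 2,
) -> int:
    """Upload one chunk starting at the stored offset; return messages archived."""
    start = store.load(topic, partition)
    chunk = log[start:start + chunk_size]
    if not chunk:
        return 0
    # The object key records the starting offset of the chunk.
    key = f"{topic}/{partition}/{start}"
    uploaded.append((key, b"\n".join(chunk)))  # stands in for an S3 upload
    # Commit only after the upload succeeded, so a crash in between
    # re-uploads the chunk on restart instead of skipping it.
    store.commit(topic, partition, start + len(chunk))
    return len(chunk)
```

Calling archive_once repeatedly with the same store behaves like a process
that is restarted between chunks: each call picks up at the last committed
offset. Committing after the upload gives at-least-once archiving, so a
duplicate chunk is possible but a gap is not.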

Hope this helps,
Jibran

Re: S3 Archiving for Kafka topics (with Zookeeper resume)

Posted by Jibran Saithi <ji...@qubitdigital.com>.

I should probably call it something else to avoid confusion. It started as
a fork but has since become a rewrite (the linked one was using SimpleConsumer
and we needed some Zookeeper goodness).



On 31 January 2013 17:04, Jay Kreps <ja...@gmail.com> wrote:

> This is super cool. Are we linking the right repo on the ecosystem page?
>
> https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem
>
> -Jay
>
>
> On Thu, Jan 31, 2013 at 1:29 AM, Jibran Saithi <jibran@qubitdigital.com> wrote:
>
> > Hey,
> >
> > I know this has come up a few times, so thought I'd share a bit of code
> > we've been using to archive topics to S3.
> >
> > Particularly unimaginatively named, but is available here:
> > https://github.com/jibs/kafka-s3-consumer
> >
> > We needed something with Zookeeper support for storing the offsets, but
> > didn't come across anything so I quickly put this together. For the moment
> > I've removed graphite stats reporting because it has a few internal
> > dependencies, but plan to sort that out soon.
> >
> > Hope this helps,
> > Jibran
> >
>



-- 
Jibran Saithi
Product Engineer


QuBit Digital Ltd
www.qubitdigital.com
*
*
Office: +44 (0)203 411 9130

QuBit Digital Ltd
20 Broadwick Street
London
W1F 8HT



This email may be confidential or privileged. If you received this
communication by mistake, please don't forward it to anyone else, please
erase all copies and attachments, and please let me know that it has gone to
the wrong person. Thanks

Re: S3 Archiving for Kafka topics (with Zookeeper resume)

Posted by Jay Kreps <ja...@gmail.com>.

This is super cool. Are we linking the right repo on the ecosystem page?

https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem

-Jay


On Thu, Jan 31, 2013 at 1:29 AM, Jibran Saithi <ji...@qubitdigital.com> wrote:

> Hey,
>
> I know this has come up a few times, so thought I'd share a bit of code
> we've been using to archive topics to S3.
>
> Particularly unimaginatively named, but is available here:
> https://github.com/jibs/kafka-s3-consumer
>
> We needed something with Zookeeper support for storing the offsets, but
> didn't come across anything so I quickly put this together. For the moment
> I've removed graphite stats reporting because it has a few internal
> dependencies, but plan to sort that out soon.
>
> Hope this helps,
> Jibran
>

Re: S3 Archiving for Kafka topics (with Zookeeper resume)

Posted by San <sa...@gmail.com>.

I see the main method. What I mean is: once you run this, will it stop execution, or will it keep waiting and processing?

Supervisord, from what I understand, is used to ensure the process starts up after reboots and restarts it if it crashes, etc.
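
For illustration, a supervisord program section for a long-running consumer
might look roughly like the fragment below. The program name, jar path,
properties file and log paths are all assumptions made up for this sketch and
are not taken from the repo; only the option names (command, autostart,
autorestart, startsecs, stdout_logfile, stderr_logfile) are standard
supervisord settings.

```ini
[program:kafka-s3-consumer]
; Hypothetical command and paths: adjust to however the consumer is built and configured.
command=java -jar /opt/kafka-s3-consumer/kafka-s3-consumer.jar /etc/kafka-s3-consumer/app.properties
; Start the program whenever supervisord itself starts.
autostart=true
; Restart the program if it exits.
autorestart=true
; The process must stay up this many seconds to count as successfully started.
startsecs=10
stdout_logfile=/var/log/kafka-s3-consumer.out.log
stderr_logfile=/var/log/kafka-s3-consumer.err.log
```

Note that autostart only covers reboots if supervisord itself is started at
boot by the system's init mechanism.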

On 2013-01-31, at 12:38 PM, Jibran Saithi <ji...@qubitdigital.com> wrote:

> Yeah, we use supervisor which seems to work fine for us.
> 
> On 31 January 2013 14:57, S Ahmed <sa...@gmail.com> wrote:
> 
>> Great thanks.
>> 
>> BTW, that's not a daemon server is it?  Is it something you have to wrap to
>> run as a service/daemon?
>> 
>> 
>> On Thu, Jan 31, 2013 at 4:29 AM, Jibran Saithi <jibran@qubitdigital.com> wrote:
>> 
>>> Hey,
>>> 
>>> I know this has come up a few times, so thought I'd share a bit of code
>>> we've been using to archive topics to S3.
>>> 
>>> Particularly unimaginatively named, but is available here:
>>> https://github.com/jibs/kafka-s3-consumer
>>> 
>>> We needed something with Zookeeper support for storing the offsets, but
>>> didn't come across anything so I quickly put this together. For the moment
>>> I've removed graphite stats reporting because it has a few internal
>>> dependencies, but plan to sort that out soon.
>>> 
>>> Hope this helps,
>>> Jibran

Re: S3 Archiving for Kafka topics (with Zookeeper resume)

Posted by Jay Kreps <ja...@gmail.com>.

Cool, well definitely add it to that ecosystem page to help people find it.

-Jay

On Thu, Jan 31, 2013 at 9:38 AM, Jibran Saithi <ji...@qubitdigital.com> wrote:

> Yeah, we use supervisor which seems to work fine for us.
>
> On 31 January 2013 14:57, S Ahmed <sa...@gmail.com> wrote:
>
> > Great thanks.
> >
> > BTW, that's not a daemon server is it?  Is it something you have to wrap to
> > run as a service/daemon?
> >
> >
> > On Thu, Jan 31, 2013 at 4:29 AM, Jibran Saithi <jibran@qubitdigital.com> wrote:
> >
> > > Hey,
> > >
> > > I know this has come up a few times, so thought I'd share a bit of code
> > > we've been using to archive topics to S3.
> > >
> > > Particularly unimaginatively named, but is available here:
> > > https://github.com/jibs/kafka-s3-consumer
> > >
> > > We needed something with Zookeeper support for storing the offsets, but
> > > didn't come across anything so I quickly put this together. For the moment
> > > I've removed graphite stats reporting because it has a few internal
> > > dependencies, but plan to sort that out soon.
> > >
> > > Hope this helps,
> > > Jibran
> > >
> >
>

Re: S3 Archiving for Kafka topics (with Zookeeper resume)

Posted by Jibran Saithi <ji...@qubitdigital.com>.

Yeah, we use supervisor which seems to work fine for us.

On 31 January 2013 14:57, S Ahmed <sa...@gmail.com> wrote:

> Great thanks.
>
> BTW, that's not a daemon server is it?  Is it something you have to wrap to
> run as a service/daemon?
>
>
> On Thu, Jan 31, 2013 at 4:29 AM, Jibran Saithi <jibran@qubitdigital.com> wrote:
>
> > Hey,
> >
> > I know this has come up a few times, so thought I'd share a bit of code
> > we've been using to archive topics to S3.
> >
> > Particularly unimaginatively named, but is available here:
> > https://github.com/jibs/kafka-s3-consumer
> >
> > We needed something with Zookeeper support for storing the offsets, but
> > didn't come across anything so I quickly put this together. For the moment
> > I've removed graphite stats reporting because it has a few internal
> > dependencies, but plan to sort that out soon.
> >
> > Hope this helps,
> > Jibran
> >
>




Re: S3 Archiving for Kafka topics (with Zookeeper resume)

Posted by S Ahmed <sa...@gmail.com>.

Great thanks.

BTW, that's not a daemon server is it?  Is it something you have to wrap to
run as a service/daemon?


On Thu, Jan 31, 2013 at 4:29 AM, Jibran Saithi <ji...@qubitdigital.com> wrote:

> Hey,
>
> I know this has come up a few times, so thought I'd share a bit of code
> we've been using to archive topics to S3.
>
> Particularly unimaginatively named, but is available here:
> https://github.com/jibs/kafka-s3-consumer
>
> We needed something with Zookeeper support for storing the offsets, but
> didn't come across anything so I quickly put this together. For the moment
> I've removed graphite stats reporting because it has a few internal
> dependencies, but plan to sort that out soon.
>
> Hope this helps,
> Jibran
>