Posted to users@kafka.apache.org by Anatoly Deyneka <ad...@gmail.com> on 2014/06/16 12:02:28 UTC

data archiving

Hi all,

I'm looking for a way to archive data.
The data is hot for a few days in our system.
After that it is rarely used, so access speed is not important for the archive.

Let's say we have a Kafka cluster and a storage system.
It would be great if Kafka supported moving data to the storage system instead
of evicting it, and if the end user could specify which storage system is used
(Dynamo, S3, Hadoop, etc.).
Is this possible to implement?

What other solutions would you recommend?

Regards,
Anatoly

Re: data archiving

Posted by Robert Hodges <be...@gmail.com>.
Have you looked at Pinterest Secor?
(http://engineering.pinterest.com/post/84276775924/introducing-pinterest-secor)

Cheers, Robert


On Mon, Jun 16, 2014 at 5:17 AM, Mark Godfrey <ms...@gmail.com> wrote:

> There is Bifrost, which archives Kafka data to S3:
> https://github.com/uswitch/bifrost
>
> Obviously that's a fairly specific archive solution, but it might work for
> you.
>
>
> Mark.

Re: data archiving

Posted by Mark Godfrey <ms...@gmail.com>.
There is Bifrost, which archives Kafka data to S3:
https://github.com/uswitch/bifrost

Obviously that's a fairly specific archive solution, but it might work for
you.


Mark.


Re: data archiving

Posted by Joe Stein <jo...@stealth.ly>.
You should do this as a consumer (e.g. an "archiveDataConsumer").

Take a look at the AWS section of the ecosystem page:
https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem (e.g.
https://github.com/pinterest/secor ).

The system tools page is also a good place to check:
https://cwiki.apache.org/confluence/display/KAFKA/System+Tools (e.g.
https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-MirrorMaker
).

If the consumer you need doesn't exist, you could write one (which is what
most folks do), or search for one and let the community know if you find it.
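
To make the "write your own consumer" idea concrete, here is a minimal sketch
of the archiving half of such a consumer. Everything in it is an assumption
made for illustration: the Record type stands in for whatever your Kafka
client hands you, archive() and _flush() are hypothetical names, and a local
directory stands in for S3, HDFS or another store. A real consumer would feed
archive() from the client's message stream and commit offsets only after a
batch has been flushed.

```python
import base64
import gzip
import json
import os
from typing import Iterable, List, NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for one consumed Kafka message."""
    topic: str
    partition: int
    offset: int
    value: bytes


def _flush(root: str, batch: List[Record]) -> str:
    """Write one batch to <root>/<topic>/<partition>/<first offset>.jsonl.gz."""
    first = batch[0]
    directory = os.path.join(root, first.topic, str(first.partition))
    os.makedirs(directory, exist_ok=True)
    # Zero-padded first offset keeps files sorted in log order.
    path = os.path.join(directory, f"{first.offset:020d}.jsonl.gz")
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for record in batch:
            line = {
                "offset": record.offset,
                # Base64 keeps arbitrary binary payloads JSON-safe.
                "value": base64.b64encode(record.value).decode("ascii"),
            }
            out.write(json.dumps(line) + "\n")
    return path


def archive(records: Iterable[Record], root: str, batch_size: int = 1000) -> List[str]:
    """Buffer records per (topic, partition) and flush full batches to disk.

    Returns the paths of the files written, in the order they were flushed.
    """
    buffers = {}
    written = []
    for record in records:
        key = (record.topic, record.partition)
        buffer = buffers.setdefault(key, [])
        buffer.append(record)
        if len(buffer) >= batch_size:
            written.append(_flush(root, buffer))
            buffers[key] = []
    # Flush whatever is left once the input is exhausted.
    for buffer in buffers.values():
        if buffer:
            written.append(_flush(root, buffer))
    return written
```

Naming each file after the first offset it contains means a restarted
archiver can tell from the file listing where it left off, which is roughly
the approach tools like Secor and Bifrost take against S3.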

Thanks!

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/

