Posted to users@kafka.apache.org by Mark <st...@gmail.com> on 2013/08/30 20:18:41 UTC

Kafka -> HDFS

What is the quickest and easiest way to write messages from Kafka into HDFS? I've come across Camus, but before we go the whole route of writing Avro messages we want to test plain old vanilla messages.

Thanks

Re: Kafka -> HDFS

Posted by Andrew Otto <ot...@wikimedia.org>.
Mark,

I had the same question! Camus is super awesome, but it doesn't have out-of-the-box support for just writing Strings into HDFS. I submitted this pull request to support that:

https://github.com/linkedin/camus/pull/28

You can clone this directly from the wikimedia branch of Camus:

https://github.com/wikimedia/camus/tree/wikimedia



Re: Kafka -> HDFS

Posted by Jun Rao <ju...@gmail.com>.
You can take a look at the Hadoop consumer under contrib in the Kafka source tree.

Thanks,

Jun


Re: Kafka -> HDFS

Posted by Joe Stein <cr...@gmail.com>.
Can you elaborate on your use case a bit?

At what point would your business logic decide that a file is complete
(by time, or by some other criterion for cutting a file)? And when do you
batch process what the stream has piled up for you?

Writing to HDFS (http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample) is
pretty straightforward, and doing so in a consumer is not a lot of fuss.

Whether you need more layers and overhead all comes back to what you are
trying to accomplish :)  You might need to use ZooKeeper or something
similar to coordinate when to run the batch process (depending on how you
kick it off), so that the other system knows when the consumers have
finished writing the files it is about to read.
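
To make the "when is a file complete" question concrete, here is a minimal sketch of one possible approach, not the contrib consumer's or Camus's actual behaviour. Everything in it is an assumption made for illustration: the names RollingFileWriter and drain are hypothetical, a local directory stands in for HDFS, and a plain Python iterable stands in for a Kafka consumer. A file is cut after a message count or an age limit, and the in-progress file keeps a ".tmp" suffix until it is renamed, so a batch job that skips "*.tmp" only sees finished files.

```python
import os
import time


class RollingFileWriter:
    """Append messages to a file and cut it once a count or age limit is hit.

    Assumption: a local directory stands in for HDFS. In-progress data lives
    under a ".tmp" name and is renamed when the file is cut.
    """

    def __init__(self, directory, max_messages=1000, max_age_seconds=60.0,
                 clock=time.monotonic):
        self.directory = directory
        self.max_messages = max_messages
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self.completed = []      # final paths of files that have been cut
        self._file = None
        self._count = 0
        self._opened_at = None
        self._sequence = 0
        os.makedirs(directory, exist_ok=True)

    def _tmp_path(self):
        name = "part-%05d.txt.tmp" % self._sequence
        return os.path.join(self.directory, name)

    def write(self, message):
        # Lazily open a new in-progress file on the first message after a cut.
        if self._file is None:
            self._file = open(self._tmp_path(), "w", encoding="utf-8")
            self._count = 0
            self._opened_at = self.clock()
        self._file.write(message + "\n")
        self._count += 1
        too_big = self._count >= self.max_messages
        too_old = self.clock() - self._opened_at >= self.max_age_seconds
        if too_big or too_old:
            self.cut()

    def cut(self):
        """Close the current file and publish it under its final name."""
        if self._file is None:
            return
        self._file.close()
        tmp = self._tmp_path()
        final = tmp[:-len(".tmp")]
        os.replace(tmp, final)   # the rename marks the file as complete
        self.completed.append(final)
        self._file = None
        self._sequence += 1


def drain(messages, writer):
    """Write every message from an iterable (a stand-in for a consumer)."""
    for message in messages:
        writer.write(message)
    writer.cut()  # publish whatever is left when the stream ends
```

Note that in this sketch the age limit is only checked when a message arrives, so an idle stream will not cut a file on its own; a real consumer would probably also want a timer, and would need to decide when to commit offsets relative to the rename.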
