Posted to users@kafka.apache.org by Marcelo Valle <mv...@redoop.org> on 2014/02/13 17:18:16 UTC

Linkedin Camus vs kafka-hadoop-loader vs hadoop-consumer

Hello,

I've been studying different options for consuming messages from Kafka into
Hadoop (HDFS) and found three options.

Linkedin Camus - https://github.com/linkedin/camus
kafka-hadoop-loader - https://github.com/michal-harish/kafka-hadoop-loader
hadoop-consumer -
https://github.com/apache/kafka/tree/0.8/contrib/hadoop-consumer

I suppose Camus is the most robust tool, and the best from a performance
point of view too. But it is more complex to use and develop with than the
other options, and it does not support raw text messages: only
Avro-serialized messages can be used.

kafka-hadoop-loader has had no support for a year and doesn't work with
Hadoop 2, so it is discarded.

hadoop-consumer is native in Kafka trunk, is simple and easy to use, and
supports Avro and raw text, but I have doubts about its performance and
fault tolerance.

Am I right in my conclusions?
Do you know of any alternative?
Can you help me choose the best one?

Thanks!

Re: Linkedin Camus vs kafka-hadoop-loader vs hadoop-consumer

Posted by Maxime Nay <ma...@gmail.com>.
Hi,

Camus does support raw text messages. If I remember correctly, you just
need to provide your own record decoder and record writer.
We are using Camus to consume messages from Kafka and store them in S3, and
it works quite well.


Maxime



Re: Linkedin Camus vs kafka-hadoop-loader vs hadoop-consumer

Posted by Cliff Resnick <cr...@conductor.com>.
We’ve been using Miniway’s hadoop-consumer in production for over a year without any problems. It stores offsets in ZooKeeper rather than HDFS, and it uses the more recent MapReduce API.

https://github.com/miniway/kafka-hadoop-consumer

