Posted to dev@kafka.apache.org by Abhi Basu <90...@gmail.com> on 2013/11/22 18:33:22 UTC

Re: Kafka/Hadoop consumers and producers

I agree with you. We are looking for a simple way to get data from Kafka 
to Hadoop. I tried Camus earlier (non-Avro), but the documentation is 
too thin to get it working correctly, and we do not want to introduce another 
component to the solution. In the meantime, could the Kafka Hadoop 
Consumer/Producer be documented well so we can try it out ASAP? :)  Thanks.

On Friday, August 9, 2013 12:27:12 PM UTC-7, Ken Goodhope wrote:
>
> I just checked, and that patch is in the 0.8 branch.  Thanks for working on 
> backporting it, Andrew.  We'd be happy to commit that work to master.
>
> As for the Kafka contrib project vs Camus, they are similar but not quite 
> identical.  Camus is intended to be a high-throughput ETL for bulk 
> ingestion of Kafka data into HDFS, whereas what we have in contrib is 
> more of a simple KafkaInputFormat.  Neither can really replace the other.  
> If you had a complex Hadoop workflow and wanted to introduce some Kafka 
> data into that workflow, using Camus would be gigantic overkill and a 
> pain to set up.  On the flip side, if what you want is frequent, reliable 
> ingest of Kafka data into HDFS, a simple InputFormat doesn't provide you 
> with that.
>
> I think it would be preferable to simplify the existing contrib 
> Input/OutputFormats by refactoring them to use the more stable, higher-level 
> Kafka APIs.  Currently they use the lower-level APIs.  This should make 
> them easier to maintain, and user-friendly enough to avoid the need for 
> extensive documentation.
>
> Ken
>
>
> On Fri, Aug 9, 2013 at 8:52 AM, Andrew Psaltis <psaltis...@gmail.com> wrote:
>
>> Dibyendu,
>> According to the pull request (https://github.com/linkedin/camus/pull/15), 
>> it was merged into the camus-kafka-0.8 branch. I have not checked whether 
>> the code was subsequently removed; however, at least one of the important 
>> files from this patch (camus-api/src/main/java/com/linkedin/camus/etl/RecordWriterProvider.java) 
>> is still present.
>>
>> Thanks,
>> Andrew
>>
>>
>> On Fri, Aug 9, 2013 at 9:39 AM, <dibyendu.b...@pearson.com> wrote:
>>
>>>  Hi Ken,
>>>
>>> I am also working on making Camus fit non-Avro messages for our 
>>> requirements.
>>>
>>> I see you mentioned this patch 
>>> (https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8), 
>>> which supports a custom data writer for Camus. But this patch has not been 
>>> pulled into the camus-kafka-0.8 branch. Is there any plan to do so?
>>>
>>> Regards,
>>> Dibyendu
>>>
>>> --
>>> You received this message because you are subscribed to a topic in the 
>>> Google Groups "Camus - Kafka ETL for Hadoop" group.
>>> To unsubscribe from this topic, visit 
>>> https://groups.google.com/d/topic/camus_etl/KKS6t5-O-Ng/unsubscribe.
>>> To unsubscribe from this group and all its topics, send an email to 
>>> camus_etl+...@googlegroups.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>
>
>