You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Murtaza Doctor <mu...@richrelevance.com> on 2012/08/11 00:24:47 UTC

Kafka Avro Based Consumer

Hello Folks,

We are trying to write an Avro based HDFS Consumer which is capable of reading offsets from zookeeper. Currently we do have a producer which writes avro messages. We are trying to utilize the avro map-reduce API and running into challenges. Additionally there is a big schema management piece here as you can imagine. Ideally there should a schema service which could be utilized by the producer and consumer.

 Any thoughts or ideas on the avro based consumer will be greatly appreciated. If there is sample avro based consumer code or anything else we could look at as well to debug the issue. Finally there were talks around LinkedIn open sourcing the avro schema management piece has this already happened or in progress?

Exception Stacktrace when we run the Map Only Job (Just for context)

Additional Context:
org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.NullPointerException: in com.rr.avro.ViewEvent in union null of union in field viewGuid of com.rr.avro.ViewEvent
    at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:261)
    at org.apache.avro.mapred.AvroOutputFormat$1.write(AvroOutputFormat.java:122)
    at org.apache.avro.mapred.AvroOutputFormat$1.write(AvroOutputFormat.java:119)
    at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.collect(MapTask.java:706)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:499)
    at com.rr.hadoop.consumer.HadoopConsumer$KafkaAvroMapper.map(HadoopConsumer.java:92)
    at com.rr.hadoop.consumer.HadoopConsumer$KafkaAvroMapper.map(HadoopConsumer.java:68)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessC
attempt_201207041152_56118_m_000000_0: KafkaContext

Thanks,
murtaza