Posted to user@hive.apache.org by Scott Carey <sc...@richrelevance.com> on 2010/03/25 00:30:30 UTC

Avro files in Hive

I would like to be able to read an Avro file as a Table in Hive.

I have looked at the documentation, but cannot see anything about how to extend Hive to other file types. 

From what I can tell, Serializer and Deserializer handle individual fields and stuff them into a SequenceFile.

I don't want to control how an individual field or record is serialized. I want to specify how records and fields are read from and written to an Avro file.

Where would I look for information on how to support a custom file format?

Thanks,

-Scott

Re: Avro files in Hive

Posted by Zheng Shao <zs...@gmail.com>.
Yes, take a look at IgnoreKeyInputFormat and HiveSequenceFileOutputFormat.

The Hive query will be something like:

CREATE TABLE xxx (a string) ... STORED AS INPUTFORMAT
'com.my.avroinputformat' OUTPUTFORMAT 'com.my.avrooutputformat';
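
Spelled out a little further, the statement might look like the sketch below. The table name and all three class names are hypothetical placeholders, not classes that ship with Hive. It assumes you write an input format against the old org.apache.hadoop.mapred.InputFormat interface, an output format modeled on HiveSequenceFileOutputFormat (which implements Hive's HiveOutputFormat), and a SerDe that maps each record to the table's columns.

```sql
-- Sketch only: avro_events and every com.my.* class are hypothetical placeholders.
-- com.my.AvroSerDe        : would turn one Avro record into Hive columns and back.
-- com.my.AvroInputFormat  : would read records out of Avro files.
-- com.my.AvroOutputFormat : would write records into Avro files.
CREATE TABLE avro_events (a STRING)
ROW FORMAT SERDE 'com.my.AvroSerDe'
STORED AS
  INPUTFORMAT 'com.my.AvroInputFormat'
  OUTPUTFORMAT 'com.my.AvroOutputFormat';
```

The jar containing these classes would have to be on Hive's classpath (for example via ADD JAR) before the table can be queried.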

Zheng

On Wed, Mar 24, 2010 at 4:30 PM, Scott Carey <sc...@richrelevance.com> wrote:
> I would like to be able to read an Avro file as a Table in Hive.
>
> I have looked at the documentation, but cannot see anything about how to extend Hive to other file types.
>
> From what it looks like, Serializer and Deserializer are for fields, and stuff them into a SequenceFile.
>
> I don't want to control how to serialize a field, or a record individually.  I want to specify how to read from and write records/fields to an Avro file.
>
> Where would I look for information on how to support a custom file format?
>
> Thanks,
>
> -Scott



-- 
Yours,
Zheng