Posted to user@avro.apache.org by Lior Schachter <li...@gmail.com> on 2013/08/01 17:02:50 UTC

Avro schema

Hi all,

When an Avro schema is written to a data file, what is the expected
behavior if the file is used as M/R input? How do the second, third, and
later splits get the schema (the schema is always written to the first split)?

Thanks,
Lior

Re: Avro schema

Posted by Harsh J <ha...@cloudera.com>.
Yes, we seek to 0, read the header, and then seek back to the split offset.
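
The sequence described here (header at offset 0, then back to the split
offset) can be sketched as follows. This is a toy container, not Avro's
actual wire format: the 4-byte length prefixes and the names
write_container, read_header, and read_split are assumptions made for
illustration. What it shares with Avro data files is the sync marker
written after the header and after every block, which lets a reader seek
to an arbitrary offset and scan forward to the next block boundary. In
Avro's Java API the corresponding calls are DataFileReader.sync(long) and
DataFileReader.pastSync(long).

```python
import io
import json
import os
import struct

SYNC_SIZE = 16  # marker length; written after the header and after every block


def write_container(schema, blocks, sync):
    """Build a toy file: [len][schema JSON][sync], then [len][payload][sync] per block."""
    out = io.BytesIO()
    schema_bytes = json.dumps(schema).encode("utf-8")
    out.write(struct.pack(">I", len(schema_bytes)) + schema_bytes + sync)
    for payload in blocks:
        out.write(struct.pack(">I", len(payload)) + payload + sync)
    return out.getvalue()


def read_header(f):
    """Seek to offset 0 and return (schema, sync marker)."""
    f.seek(0)
    (n,) = struct.unpack(">I", f.read(4))
    schema = json.loads(f.read(n).decode("utf-8"))
    sync = f.read(SYNC_SIZE)
    return schema, sync


def read_split(f, start, end):
    """Return (schema, blocks) for the byte range [start, end) of the file."""
    schema, sync = read_header(f)   # every split reads the header first
    f.seek(start)                   # then seeks back to its own offset
    data = f.read()                 # toy shortcut: scan the remainder in memory
    blocks = []
    pos = data.find(sync)           # first marker at or after `start`
    # A block belongs to this split only if the marker before it begins inside it.
    while pos != -1 and start + pos < end:
        body = pos + SYNC_SIZE
        if body + 4 > len(data):    # that marker closed the last block of the file
            break
        (n,) = struct.unpack(">I", data[body:body + 4])
        blocks.append(data[body + 4:body + 4 + n])
        pos = body + 4 + n          # the next marker follows the payload directly
        if not data.startswith(sync, pos):
            raise ValueError("missing sync marker after block")
    return schema, blocks


if __name__ == "__main__":
    sync = os.urandom(SYNC_SIZE)    # random, so a match inside a payload is very unlikely
    data = write_container({"type": "string"}, [b"alpha", b"beta", b"gamma"], sync)
    mid = len(data) // 2
    for start, end in [(0, mid), (mid, len(data))]:
        print(start, end, read_split(io.BytesIO(data), start, end))
```

Because a block is assigned to the split in which its preceding marker
begins, each block is read exactly once no matter where the file is cut,
and a split that starts at offset 0 picks up the first block through the
marker that ends the header.
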
On Aug 1, 2013 11:16 PM, "Lior Schachter" <li...@gmail.com> wrote:

> Hi Harsh,
> So for each split, you first read the header of the file directly from
> HDFS?
>
> Thanks,
> Lior

Re: Avro schema

Posted by Lior Schachter <li...@gmail.com>.
Hi Harsh,
So for each split, you first read the header of the file directly from HDFS?

Thanks,
Lior




On Thu, Aug 1, 2013 at 7:36 PM, Harsh J <ha...@cloudera.com> wrote:

> We read it from the top of the file at start (just the schema bytes)
> and then initialize the reader.

Re: Avro schema

Posted by Harsh J <ha...@cloudera.com>.
We read it from the top of the file at start (just the schema bytes)
and then initialize the reader.
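
For reference, the header bytes that hold the schema can be decoded with a
few lines of Python. The sketch below follows the object container file
layout in the Avro specification (the magic bytes Obj\x01, a metadata map
that contains avro.schema, then a 16-byte sync marker). The helper names
read_long, read_bytes, and read_header are hypothetical, and this is not
the code the MapReduce input format itself runs.

```python
import io
import json

MAGIC = b"Obj\x01"  # 'O', 'b', 'j', followed by the byte 1
SYNC_SIZE = 16      # length of the sync marker that ends the header


def read_long(f):
    """Decode one Avro long (variable-length, zigzag-encoded)."""
    shift = 0
    acc = 0
    while True:
        byte = f.read(1)
        if not byte:
            raise EOFError("truncated long")
        acc |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:      # high bit clear: last byte of this long
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)  # undo the zigzag mapping


def read_bytes(f):
    """Decode a length-prefixed byte sequence (also used for strings)."""
    return f.read(read_long(f))


def read_header(f):
    """Read the file header from offset 0; return (schema, codec, sync marker)."""
    f.seek(0)
    if f.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:                     # the metadata is an Avro map, stored in blocks
        count = read_long(f)
        if count == 0:              # a zero count ends the map
            break
        if count < 0:               # a negative count is followed by the block's byte size
            read_long(f)
            count = -count
        for _ in range(count):
            key = read_bytes(f).decode("utf-8")
            meta[key] = read_bytes(f)
    sync = f.read(SYNC_SIZE)
    schema = json.loads(meta["avro.schema"].decode("utf-8"))
    codec = meta.get("avro.codec", b"null")  # an absent codec means "null"
    return schema, codec, sync
```

Only these leading bytes are needed to initialize a reader; the data
blocks that follow are never touched by read_header.
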

On Thu, Aug 1, 2013 at 8:32 PM, Lior Schachter <li...@gmail.com> wrote:
> Hi all,
>
> When an Avro schema is written to a data file, what is the expected
> behavior if the file is used as M/R input? How do the second, third, and
> later splits get the schema (the schema is always written to the first split)?
>
> Thanks,
> Lior
>
>



-- 
Harsh J