Posted to user@avro.apache.org by Ivan Tsyba <iv...@gmail.com> on 2022/08/05 21:24:11 UTC

Re: GenericDatumReader writer's schema question

Hello Oscar,
Yes, I've looked inside DataFileReader and now it's clear to me.
Thank you
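
For anyone finding this thread later, here is a minimal sketch of the behaviour Oscar describes. It assumes the Apache Avro Java library is on the classpath. The two User schemas, the class name WriterSchemaDemo and the helper methods are hypothetical, made up for illustration. It uses DataFileStream over an in-memory byte array rather than DataFileReader over a file, on the understanding that the header handling is shared between the two.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class WriterSchemaDemo {
    // Hypothetical writer's schema: what ends up in the file header.
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"}]}");

    // Hypothetical reader's schema: adds a field with a default value.
    static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\",\"default\":-1}]}");

    // Writes one record in Avro container format, using the writer's schema.
    static byte[] writeFile() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(WRITER))) {
            writer.create(WRITER, out);
            GenericRecord user = new GenericData.Record(WRITER);
            user.put("name", "Ivan");
            writer.append(user);
        }
        return out.toByteArray();
    }

    // Opens the container with the given datum reader and returns the first record.
    // Opening it reads the header and passes the embedded schema to the
    // datum reader's setSchema(...), replacing the writer's schema.
    static GenericRecord readFirst(byte[] file,
                                   GenericDatumReader<GenericRecord> datumReader)
            throws IOException {
        try (DataFileStream<GenericRecord> stream =
                 new DataFileStream<>(new ByteArrayInputStream(file), datumReader)) {
            return stream.next();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] file = writeFile();

        // Single-argument constructor: writer's and reader's schema are both READER.
        GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>(READER);
        System.out.println("before open, writer == READER: "
            + datumReader.getSchema().equals(READER));

        GenericRecord user = readFirst(file, datumReader);

        // After opening, the writer's schema comes from the file header,
        // while the reader's schema is still the one passed to the constructor.
        System.out.println("after open, writer == WRITER: "
            + datumReader.getSchema().equals(WRITER));
        System.out.println("after open, reader == READER: "
            + datumReader.getExpected().equals(READER));
        System.out.println(user);
    }
}
```

So the constructor Javadoc and the Getting Started text are both right: the constructor sets both schemas to the same value, and opening the file then replaces only the writer's schema.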

On Fri, 22 Jul 2022 at 12:46, Oscar Westra van Holthe - Kind <
oscar@westravanholthe.nl> wrote:

> Hi Ivan,
>
> You're correct about the GenericDatumReader javadoc, but the writer
> schema can be adjusted after creation. This is what the DataFileReader
> does.
>
> So after the DataFileReader is initialised, the underlying
> GenericDatumReader uses the schema in the file as the writer schema (to
> understand the data), and the schema you provided as the reader schema (to
> give data to you via dataFileReader.next(user)).
>
> Does that clarify things for you?
>
>
> Kind regards,
> Oscar
>
>
> On Wed, 20 Jul 2022 at 10:37, Ivan Tsyba <iv...@gmail.com> wrote:
>
>> Hello
>>
>> As stated in Avro Getting Started
>> <https://avro.apache.org/docs/current/gettingstartedjava.html#Deserializing> about
>> deserialization without code generation: "The data will be read using the
>> writer's schema included in the file, and the reader's schema provided to
>> the GenericDatumReader". Here is how GenericDatumReader is created in the
>> example
>>
>> DatumReader<GenericRecord> datumReader = new GenericDatumReader<GenericRecord>(schema);
>>
>> But when you look at this GenericDatumReader constructor Javadoc it
>> states "Construct where the writer's and reader's schemas are the same."
>> (and actual code corresponds to this).
>>
>> So the writer's schema isn’t taken from a serialized file but from a
>> constructor parameter?
>>
>
>
> --
>
> ✉️ Oscar Westra van Holthe - Kind <os...@westravanholthe.nl>
>
>