Posted to user@avro.apache.org by Milind Vaidya <ka...@gmail.com> on 2013/04/05 19:26:12 UTC

avro_value_t or avro_datum_t

To the best of my knowledge, use of avro_datum_t is discouraged, so I was
trying the following code.

// Assume the path variable holds a valid value and that proper error
// handling is in place.

PART A:
avro_value_t data;
avro_file_reader_t fileReader;

result = avro_file_reader(path, &fileReader);
result = avro_file_reader_read_value(fileReader, &data);

The avro_file_reader_read_value call above leads to a segmentation fault.

PART B:
avro_value_t data;
avro_datum_t d_data;
avro_file_reader_t fileReader;

result = avro_file_reader(path, &fileReader);
result = avro_file_reader_read(fileReader, NULL, &d_data);
avro_datum_as_value(&data, d_data);

This works absolutely fine.


Am I missing anything in part A?

Re: avro_value_t or avro_datum_t

Posted by Milind Vaidya <ka...@gmail.com>.

Cool.

Thanks again Doug.

Worked like a charm.

On Mon, Apr 8, 2013 at 8:32 AM, Douglas Creager <do...@creagertino.net> wrote:


Re: avro_value_t or avro_datum_t

Posted by Douglas Creager <do...@creagertino.net>.

> //Assume path variable to be having proper value and proper exception
> handling in place
> 
> PART A:
> avro_value_t data;
> avro_file_reader_t fileReader;
>  
> result = avro_file_reader(path, &fileReader);
> result = avro_file_reader_read_value(fileReader, &data);
> 
> This above call leads to "segmentation fault"

> Am I missing anything in part A?

With the old datum API, avro_file_reader_read would allocate a new datum
instance for each record read from the file.  The new value API doesn't
allocate anything for you, so that if you're reading millions of records
from a file, you don't incur malloc/free overhead for each one of those
records.  That means that you have to allocate a value instance that
avro_file_reader_read_value can read into:

    avro_file_reader_t  reader;
    avro_schema_t  file_schema;
    avro_value_iface_t  *file_iface;
    avro_value_t  data;
    int  result;

    // Open the file and create an avro_value_t to read into.
    result = avro_file_reader(path, &reader);
    file_schema = avro_file_reader_get_writer_schema(reader);
    file_iface = avro_generic_class_from_schema(file_schema);
    result = avro_generic_value_new(file_iface, &data);

    // Read two records from the file.
    result = avro_file_reader_read_value(reader, &data);
    result = avro_file_reader_read_value(reader, &data);

    // Release everything when done (checks on result omitted for brevity).
    avro_value_decref(&data);
    avro_value_iface_decref(file_iface);
    avro_schema_decref(file_schema);
    avro_file_reader_close(reader);

Note that we're grabbing the writer schema from the file that we just
opened, so that we know that "data" is always an instance of the right
schema type.  Also note that when we read multiple records from the
file, we can reuse the "data" value instance.  Its contents will be
overwritten with each successive record from the file.
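
To make the ownership rule concrete without depending on the Avro library,
here is a minimal, library-free sketch of the same "allocate once, read into
it repeatedly" pattern. Every name in it (toy_value_t, toy_reader_t,
toy_value_new, toy_value_free, toy_reader_read_value) is hypothetical and
invented for illustration; none of them are part of Avro C, and the real
value API is considerably richer. The read function refuses a destination
that has no storage, which is roughly the situation in Part A, except that a
truly uninitialized struct contains garbage rather than NULL and cannot be
checked this way.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not Avro C types. */
typedef struct {
    char   *buf;   /* storage that must be allocated before any read */
    size_t  cap;   /* size of that storage in bytes */
} toy_value_t;

typedef struct {
    const char **records;  /* pretend file contents */
    size_t       count;    /* number of records */
    size_t       next;     /* index of the next record to read */
} toy_reader_t;

/* Allocate the storage once, up front. Returns 0 on success, -1 on failure. */
int toy_value_new(toy_value_t *v, size_t cap)
{
    v->buf = malloc(cap);
    v->cap = (v->buf != NULL) ? cap : 0;
    return (v->buf != NULL) ? 0 : -1;
}

/* Release the storage and leave the value in a checkable empty state. */
void toy_value_free(toy_value_t *v)
{
    free(v->buf);
    v->buf = NULL;
    v->cap = 0;
}

/* Read the next record into dest, overwriting its previous contents.
 * Returns 0 on success, 1 at end of input, and -1 if dest has no
 * storage or its storage is too small. Nothing is allocated here. */
int toy_reader_read_value(toy_reader_t *r, toy_value_t *dest)
{
    if (r->next >= r->count)
        return 1;

    const char *src = r->records[r->next];
    size_t need = strlen(src) + 1;

    if (dest->buf == NULL || need > dest->cap)
        return -1;

    memcpy(dest->buf, src, need);
    r->next++;
    return 0;
}
```

In this sketch a caller would call toy_value_new once, call
toy_reader_read_value in a loop until it returns nonzero, and call
toy_value_free once at the end. The buffer pointer never changes between
reads, so the per-record cost is a copy rather than a malloc/free pair.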