Posted to user@avro.apache.org by Titouan Christophe <ti...@railnova.eu> on 2020/01/16 17:33:16 UTC
avro-c error: "Datum too large for file block size"
Hello everyone,
I am new to the avro-c library and trying to get some experience with
it. So far, I have written a very simple program that is supposed to
write a single record of a simple schema to a file.
The program source is here:
https://gist.github.com/titouanc/0df61b807d06ca7611cc6708f12fc938.
When I run my program, I obtain the following output:
Unable to write Avro record to file: Datum too large for file block size
which is produced by
https://gist.github.com/titouanc/0df61b807d06ca7611cc6708f12fc938#file-test-avro-c-L34
. The only reference to a similar error message is
http://apache-avro.679487.n3.nabble.com/Value-too-large-for-file-block-size-td4028424.html
but I really doubt I have the same issue, as the size of my record
should be much smaller than 16 KiB.
To write this small piece of code, I drew inspiration from the quickstop
example
(https://github.com/apache/avro/blob/release-1.9.1/lang/c/examples/quickstop.c).
If I compile and run this example, it runs just fine.
Finally, I made a small modification to my program to use the same
schema as the quickstop example. Here is the modified version:
https://gist.github.com/titouanc/b18c0c54657db4e1f0361e0be9f710f3.
This one actually works perfectly!
Could anyone help me understand this issue?
Best regards,
Titouan Christophe
Re: avro-c error: "Datum too large for file block size"
Posted by Titouan Christophe <ti...@railnova.eu>.
Hello Dan and all,
On 1/16/20 7:00 PM, Dan Schmitt wrote:
> The only difference I can see is the null default/union possibly not
> being handled well by avro_record_set or avro_record(schema).
>
> Without reading the source, I'd expect avro_record(schema) to default
> the union values to null,
Yes, I was assuming that a nullable record field would be NULL if no
value was provided.
> leading to some sort of "let's keep reading this memory" issue because
> we don't know where the end is.
>
> You can test whether that's happening by setting hours in your original
> program, and/or explicitly setting the fields to null types.
Thank you for this insight! I wrote a few more variations [1] of my
program:
- with or without nullable fields
- filling them explicitly with avro_null() or a value
- also with another union type: long/double
In the end, I did not manage to get a working program that constructs a
record with a union type and writes it to a file, but it works with
non-union types (see [2]).
Maybe there is an additional step to perform when constructing or
encoding union types in avro-c?
>
> Probably room for improvement on the C side (it should validate and
> report an error if the avro_datum_t isn't valid or doesn't match the
> writer schema, instead of doing whatever it's doing, in addition to
> having the null/union default work for avro_record(schema)).
>
I created a repository with all the variations of my test program:
[1]
https://github.com/titouanc/test-avro-record/tree/bd4f63824489d0b5802cb05bcbd6f9e1b3251a7c
The test results are visible there:
[2]
https://github.com/titouanc/test-avro-record/commit/bd4f63824489d0b5802cb05bcbd6f9e1b3251a7c/checks?check_suite_id=405071388#step:4:1
Best regards,
Titouan
Re: avro-c error: "Datum too large for file block size"
Posted by Dan Schmitt <da...@gmail.com>.
The only difference I can see is the null default/union possibly not
being handled well by avro_record_set or avro_record(schema).
Without reading the source, I'd expect avro_record(schema) to default
the union values to null, leading to some sort of "let's keep reading
this memory" issue because we don't know where the end is.
You can test whether that's happening by setting hours in your original
program, and/or explicitly setting the fields to null types.
Probably room for improvement on the C side (it should validate and
report an error if the avro_datum_t isn't valid or doesn't match the
writer schema, instead of doing whatever it's doing, in addition to
having the null/union default work for avro_record(schema)).
On Thu, Jan 16, 2020 at 11:33 AM Titouan Christophe
<ti...@railnova.eu> wrote: