Posted to dev@avro.apache.org by "Vivek Nadkarni (Commented) (JIRA)" <ji...@apache.org> on 2011/12/20 18:22:30 UTC

[jira] [Commented] (AVRO-984) Avro C - Resolved reader fails to read nested arrays and reads uninitialized memory

    [ https://issues.apache.org/jira/browse/AVRO-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173322#comment-13173322 ] 

Vivek Nadkarni commented on AVRO-984:
-------------------------------------

I did a few more tests, and found the following: 

1. When I nest an array inside an array, i.e.
       {"type":"array", "items": {"type": "array", "items": "long"}}
   the function avro_resolved_array_writer_init() is called only once
   for the top level array. (This is the schema in the attached test program.)

2. When I place an array inside a record, i.e.
       {"type" : "record", 
        "name" : "myrecord", 
        "fields" : [ { "name" : "myfield", "type":"long"}, 
                     { "name" : "myarray", "type": { "type":"array", "items" : "long"} }
                   ] }
   the function avro_resolved_record_writer_init() initializes all
   of its children, and in that process calls avro_resolved_array_writer_init().

3. When I place a record inside an array, i.e. 
       {"type":"array", "items": {"type" : "record", 
                                  "name" : "myrecord", 
                                  "fields" : [ { "name" : "myfield", "type":"long"}
                                             ] } }
   the function avro_resolved_array_writer_init() is called, but the function
   avro_resolved_record_writer_init() is not called.

I think that the function avro_resolved_array_writer_init() should
recursively call the init function on its items, but I am not sure how
to go about doing this yet. This would be similar to how the function
avro_resolved_record_writer_init() recursively calls the init function
on its children.
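
To make the proposed behavior concrete, here is a minimal, self-contained
sketch of the pattern. It does not use the Avro C internals: the node type
and the functions node_init(), record_init() and array_init() are
hypothetical stand-ins, and the real resolver structures and init
signatures may differ. It only shows an array init that recurses into its
item type in the same way that a record init recurses into its fields.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a resolver node. NOT the real Avro C structures. */
typedef enum { NODE_LONG, NODE_ARRAY, NODE_RECORD } node_kind;

typedef struct node {
    node_kind kind;
    int initialized;       /* set to 1 by the init functions below */
    struct node *item;     /* NODE_ARRAY: resolver for the element type */
    struct node **fields;  /* NODE_RECORD: resolvers for the fields */
    size_t field_count;
} node;

static void node_init(node *n);

/* Record case: initialize the record itself, then every field. */
static void record_init(node *n)
{
    n->initialized = 1;
    for (size_t i = 0; i < n->field_count; i++) {
        node_init(n->fields[i]);
    }
}

/* Array case: initialize the array itself, then recurse into its item
 * type. The recursive call is the step proposed in the text above. */
static void array_init(node *n)
{
    n->initialized = 1;
    if (n->item != NULL) {
        node_init(n->item);
    }
}

/* Dispatch on the node kind. */
static void node_init(node *n)
{
    assert(n != NULL);
    switch (n->kind) {
    case NODE_RECORD: record_init(n); break;
    case NODE_ARRAY:  array_init(n);  break;
    case NODE_LONG:   n->initialized = 1; break;
    }
}

/* Case 1 above: array of array of long. Returns how many of the three
 * nodes were initialized after a single call on the top-level array. */
int init_nested_array_example(void)
{
    node leaf  = { NODE_LONG,  0, NULL,   NULL, 0 };
    node inner = { NODE_ARRAY, 0, &leaf,  NULL, 0 };
    node outer = { NODE_ARRAY, 0, &inner, NULL, 0 };

    node_init(&outer);
    return outer.initialized + inner.initialized + leaf.initialized;
}

/* Case 3 above: array of record with one long field. */
int init_array_of_record_example(void)
{
    node field    = { NODE_LONG,   0, NULL, NULL, 0 };
    node *flds[1] = { &field };
    node rec      = { NODE_RECORD, 0, NULL, flds, 1 };
    node arr      = { NODE_ARRAY,  0, &rec, NULL, 0 };

    node_init(&arr);
    return arr.initialized + rec.initialized + field.initialized;
}
```

Both example functions return 3 in this sketch. If the recursive call in
array_init() is removed, only the top-level node is marked initialized,
which is the toy analogue of the behavior seen in cases 1 and 3.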

For comparison, when resolving schemas, the function try_array() calls
avro_resolved_writer_new_memoized() recursively on the schemas of its
items.

I would appreciate any ideas on how to go about recursively calling the 
init functions on the array items, and also any comments on whether that 
indeed is the way to fix the behavior I am seeing. 

Thanks,
Vivek

                
> Avro C - Resolved reader fails to read nested arrays and reads uninitialized memory
> -----------------------------------------------------------------------------------
>
>                 Key: AVRO-984
>                 URL: https://issues.apache.org/jira/browse/AVRO-984
>             Project: Avro
>          Issue Type: Bug
>          Components: c
>    Affects Versions: 1.6.1, 1.6.2, 1.7.0
>         Environment: GNU/Linux Ubuntu 11.10 64-bit 
>            Reporter: Vivek Nadkarni
>             Fix For: 1.6.2, 1.7.0
>
>         Attachments: avro-984-output.txt, avro-984-test.c, avro-984-valgrind-output.txt
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Summary: 
> I created a test program that creates an avro value corresponding to
> the following schema: 
>   {"type":"array", "items": {"type": "array", "items": "long"}}
> and tries to read it back using a matched and resolved reader. The
> matched reader is able to reconstruct the avro value, but the resolved
> reader fails to read the value. 
> Additionally valgrind indicates that conditional jumps are being
> performed based on uninitialized memory, when trying to reconstruct
> the value using the resolved reader.
> More Details:
> I created a test program that creates an avro value corresponding to
> the following writer schema: 
>   {"type":"array", "items": {"type": "array", "items": "long"}}
> The avro value is serialized to memory. 
> Then this memory is read back into two readers. In both cases the
> reader schema is set to be identical to the writer schema.
> The first reader is a matched reader -- i.e. the reader knows that the
> writer and reader schema are identical. 
> The second reader is a resolved reader -- i.e. the reader tries to
> resolve differences between the writer and reader schema. The schemas
> should resolve perfectly, since the writer and reader schema are
> identical.
> When we try to deserialize the binary buffer with the matched reader,
> the value is reconstructed perfectly.
> When we try to deserialize the binary buffer with the resolved reader,
> the code fails to populate the avro value. This failure indicates that
> the resolved reading of the nested array is not working
> properly. Additionally valgrind indicates that conditional jumps are
> being performed based on uninitialized values, in this second case.
> I will attach a test program that shows the issues. 
