Posted to dev@avro.apache.org by "Vivek Nadkarni (JIRA)" <ji...@apache.org> on 2012/09/15 02:41:08 UTC

[jira] [Updated] (AVRO-766) C: Memory leak from reference count cycles

     [ https://issues.apache.org/jira/browse/AVRO-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-766:
--------------------------------

    Attachment: AVRO-766.patch

The current Avro specification requires AVRO_LINKs to be internal to a schema. See the comments in AVRO-530, which indicate that external links are not backwards compatible and may only be introduced in Avro 2.0.

Currently, when an AVRO_LINK is created in a schema, the reference count of the target schema is incremented. If the schema is recursive, the link increments the reference count of the top-level schema that contains it, so decrementing the caller's reference to the top-level schema no longer brings the count to zero and the schema is never deallocated. The result is a memory leak.
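The cycle can be reproduced in a few lines with a toy reference-counted type. This is only a sketch of the general problem: toy_schema, toy_new, toy_decref, toy_link_owning and toy_demo_cycle are hypothetical names, not part of the Avro C API, and the real avro_schema_t is more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy type standing in for a schema; NOT avro_schema_t. */
struct toy_schema {
    int refcount;
    struct toy_schema *link_target;  /* owning reference, may be NULL */
};

static int toy_live = 0;  /* number of toy_schema objects currently allocated */

static struct toy_schema *toy_new(void)
{
    struct toy_schema *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->refcount = 1;       /* the caller holds the first reference */
    s->link_target = NULL;
    toy_live++;
    return s;
}

static void toy_decref(struct toy_schema *s)
{
    if (s == NULL || --s->refcount > 0)
        return;
    toy_decref(s->link_target);  /* release the reference held by the link */
    free(s);
    toy_live--;
}

/* Owning link: takes a reference on the target. */
static void toy_link_owning(struct toy_schema *from, struct toy_schema *to)
{
    to->refcount++;
    from->link_target = to;
}

/* Returns how many objects are still allocated after the caller
 * drops its only reference to a self-referential schema. */
static int toy_demo_cycle(void)
{
    struct toy_schema *top = toy_new();
    if (top == NULL)
        return -1;
    toy_link_owning(top, top);   /* recursive schema: link back to the top level */
    assert(top->refcount == 2);  /* one from the caller, one from the link */
    toy_decref(top);             /* count drops to 1, never to 0: object leaks */
    return toy_live;
}
```

Calling toy_demo_cycle() from a test program returns 1, meaning one object is still allocated and unreachable, which is the kind of block valgrind reports as lost.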

Since all AVRO_LINKs have targets that are internal to the same top-level schema, we could decide not to increment the reference count of any link target. The targets remain available as long as the top-level schema is available, and when the top-level schema is destroyed, all of its internal links are destroyed with it. Therefore, as long as a link itself is valid, its target is also valid. Using this structural knowledge of the schema gives us an implicit guarantee of link target validity, while breaking the reference count cycles for recursive schemas.
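The proposed mechanism can be sketched with a toy type in which links are non-owning. This illustrates the idea only and is not the code in the attached patch: weak_schema, weak_new, weak_decref and weak_demo are hypothetical names, and the sketch assumes every link target lives inside the same top-level object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy type standing in for a schema; NOT avro_schema_t. */
struct weak_schema {
    int refcount;
    struct weak_schema *child;  /* owning reference to a nested schema */
    struct weak_schema *link;   /* non-owning: target must be internal to
                                   the same top-level schema */
};

static int weak_live = 0;  /* number of weak_schema objects currently allocated */

static struct weak_schema *weak_new(void)
{
    struct weak_schema *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->refcount = 1;
    s->child = NULL;
    s->link = NULL;
    weak_live++;
    return s;
}

static void weak_decref(struct weak_schema *s)
{
    if (s == NULL || --s->refcount > 0)
        return;
    weak_decref(s->child);  /* release the owned child */
    /* s->link is deliberately NOT released: the link never took a reference */
    free(s);
    weak_live--;
}

/* Builds top -> field -> (link back to top) and releases the top level.
 * Returns how many objects are still allocated afterwards. */
static int weak_demo(void)
{
    struct weak_schema *top = weak_new();
    struct weak_schema *field = weak_new();
    if (top == NULL || field == NULL) {
        weak_decref(top);
        weak_decref(field);
        return -1;
    }
    top->child = field;          /* top takes over the reference from weak_new */
    field->link = top;           /* link to the top level, no refcount increment */
    assert(top->refcount == 1);  /* only the caller's reference exists */
    weak_decref(top);            /* count reaches 0: field and top are freed */
    return weak_live;
}
```

Here weak_demo() returns 0 because the link never contributes to the count. The cost of this design is that a link whose target sits outside the top-level schema is not kept alive by anything, which is why the internal-target rule matters.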

To implement this mechanism, we would need to ensure that no AVRO_LINKs are created with targets outside the top-level schema. While we cannot enforce this rule, we can document that external link targets would violate the spec, and could result in memory leaks. 

Unfortunately, avro_schema_copy() currently creates a link to an external target, as described in AVRO-1167. Therefore, AVRO-766 should not be fixed using the described mechanism until AVRO-1167 is also fixed.

This patch removes the increment and decrement of reference counts for link targets, as described above. It also adds a test case, test_avro_766.c (derived from ref-cycle.c), which shows the memory leak. The test contains a macro, TEST_AVRO_1167, which is currently enabled. With the macro disabled, you can see that this patch works.

With TEST_AVRO_1167 set to (0):

==21796== HEAP SUMMARY:
==21796==     in use at exit: 0 bytes in 0 blocks
==21796==   total heap usage: 129 allocs, 129 frees, 6,090 bytes allocated
==21796== 
==21796== All heap blocks were freed -- no leaks are possible
==21796== 
==21796== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

With TEST_AVRO_1167 set to (1):

==21417== 4,240 (32 direct, 4,208 indirect) bytes in 1 blocks are definitely lost in loss record 30 of 30
==21417==    at 0x4C28F9F: malloc (vg_replace_malloc.c:236)
==21417==    by 0x4C29019: realloc (vg_replace_malloc.c:525)
==21417==    by 0x40D34C: avro_default_allocator (allocation.c:36)
==21417==    by 0x404647: avro_schema_union (schema.c:310)
==21417==    by 0x406886: avro_schema_copy (schema.c:1250)
==21417==    by 0x40670A: avro_schema_copy (schema.c:1183)
==21417==    by 0x403EEA: main (test_avro_766.c:64)
==21417== 
==21417== LEAK SUMMARY:
==21417==    definitely lost: 56 bytes in 2 blocks
==21417==    indirectly lost: 4,232 bytes in 44 blocks
==21417==      possibly lost: 0 bytes in 0 blocks
==21417==    still reachable: 0 bytes in 0 blocks
==21417==         suppressed: 0 bytes in 0 blocks
==21417== 
==21417== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 4 from 4)


I am posting this test with TEST_AVRO_1167 set to (1) because I don't know the implications of applying this patch before fixing AVRO-1167.


                
> C: Memory leak from reference count cycles
> ------------------------------------------
>
>                 Key: AVRO-766
>                 URL: https://issues.apache.org/jira/browse/AVRO-766
>             Project: Avro
>          Issue Type: Bug
>          Components: c
>    Affects Versions: 1.5.0
>            Reporter: Douglas Creager
>         Attachments: AVRO-766.patch, ref-cycle.c
>
>
> If you parse a recursive Avro schema, you end up with a cycle in the reference graph for the avro_schema_t objects that are created.  The reference counting mechanism that we're using can't detect this, and so you get a memory leak.
