Posted to dev@avro.apache.org by "Ryan Skraba (Jira)" <ji...@apache.org> on 2021/01/14 14:15:00 UTC

[jira] [Updated] (AVRO-2999) Optimize Ruby union serialization

     [ https://issues.apache.org/jira/browse/AVRO-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Skraba updated AVRO-2999:
------------------------------
    Fix Version/s: 1.11.0

> Optimize Ruby union serialization
> ---------------------------------
>
>                 Key: AVRO-2999
>                 URL: https://issues.apache.org/jira/browse/AVRO-2999
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: ruby
>    Affects Versions: 1.10.0
>            Reporter: Joel Turkel
>            Assignee: Joel Turkel
>            Priority: Major
>             Fix For: 1.11.0
>
>
> Profiling Avro serialization with our union-heavy schema reveals several memory and throughput bottlenecks:
>  * Validation calls repeatedly allocate constant hashes
>  * Validation calls repeatedly allocate constant strings
>  * Validation calls are expensive and can be avoided when determining if a datum matches a null union member type (a common pattern for "optional" fields)
> Optimizing these code paths reduces memory allocations by 78% and improves throughput by 1.9x in our encoding benchmarks. A GitHub PR is coming shortly.
> Note: Encoding unions is still expensive because the code must determine which member of the union a datum is targeting. Allowing clients to specify the member explicitly would speed up serialization even further, but that requires a larger API change.
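
The three bottlenecks listed in the description can be illustrated with a minimal Ruby sketch. This is not the code from the Avro gem or from the forthcoming PR; the module `UnionSketch`, the constants, and the `resolve` and `full_validate` methods are all hypothetical names chosen for illustration. It assumes a union is given as a list of type names and shows two ideas: allocating constant hashes and strings once, and skipping full validation when testing a datum against a null member.

```ruby
# Hypothetical stand-in for a schema validator; not the Avro gem's actual code.
module UnionSketch
  # Allocated once at load time instead of on every validation call.
  DEFAULT_OPTIONS = { recursive: true, encoded: false }.freeze
  NULL_TYPE = 'null'.freeze

  # Simplified per-type checks standing in for full (expensive) validation.
  MATCHERS = {
    'null'   => ->(datum) { datum.nil? },
    'string' => ->(datum) { datum.is_a?(String) },
    'long'   => ->(datum) { datum.is_a?(Integer) }
  }.freeze

  # Returns the index of the first union member the datum matches, or nil.
  def self.resolve(member_types, datum, options = DEFAULT_OPTIONS)
    member_types.each_with_index do |type, index|
      # Fast path: a null member matches only nil, so a plain nil check
      # replaces the full validation call.
      if type == NULL_TYPE
        return index if datum.nil?
        next
      end
      return index if full_validate(type, datum, options)
    end
    nil
  end

  # Placeholder for the costly validation path (options unused in this sketch).
  def self.full_validate(type, datum, _options)
    matcher = MATCHERS[type]
    !matcher.nil? && matcher.call(datum)
  end
end

# An "optional string" field modeled as the union ["null", "string"].
optional_string = %w[null string]
puts UnionSketch.resolve(optional_string, nil).inspect
puts UnionSketch.resolve(optional_string, 'abc').inspect
```

The loop in `resolve` also shows why union encoding remains costly, as the note in the description says: without a hint from the caller, each member has to be tried in turn until one matches.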


