Posted to dev@avro.apache.org by "Ryan Skraba (Jira)" <ji...@apache.org> on 2021/09/22 15:51:00 UTC

[jira] [Resolved] (AVRO-3167) Simplify Codec Buffer Allocation

     [ https://issues.apache.org/jira/browse/AVRO-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Skraba resolved AVRO-3167.
-------------------------------
    Fix Version/s: 1.11.0
       Resolution: Fixed

> Simplify Codec Buffer Allocation
> --------------------------------
>
>                 Key: AVRO-3167
>                 URL: https://issues.apache.org/jira/browse/AVRO-3167
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Some performance testing of another product highlighted some weirdness to me in the Avro library, in particular the way that blocks are compressed/decompressed in {{DeflateCodec}}.
> Each block of raw data is compressed/decompressed into a new buffer.  That new buffer is then immediately written out to disk.  The caller asks for the buffer with a requested size, but that is a bit odd because the buffer is cached, so only the first call has any effect.  Also, the buffer is expanded as needed, but is then maintained at that size for the life of the application; it is never resized smaller, so it could hold that large (underutilized) buffer for a while.
> Finally, even if the requested size were working as expected, the "requested" size is quite dubious.  Right now, the requested size is equal to the size of the raw block, which means that the buffer requested for a decompress will always be too small and the buffer created for a compression will always be too big.  Instead, I propose that we just fix a sensible default value for all buffers.
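
A minimal sketch of the proposal in the description, not the actual AVRO-3167 patch: each block gets a fresh output buffer created with one fixed default capacity, rather than a cached buffer sized from the raw block. The class name, the method names, and the 8192-byte default are assumptions made for illustration, and the sketch uses the default zlib settings of java.util.zip rather than whatever {{DeflateCodec}} configures.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterOutputStream;

// Hypothetical illustration class; not part of Avro.
class FixedBufferDeflateSketch {
    // Assumed default capacity; the issue text does not say which value was chosen.
    static final int DEFAULT_BUFFER_SIZE = 8192;

    // Compress one block into a fresh buffer that starts at the fixed default size.
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(DEFAULT_BUFFER_SIZE);
        // Closing the stream finishes the deflate stream and releases the default Deflater.
        try (OutputStream deflating = new DeflaterOutputStream(out)) {
            deflating.write(raw);
        }
        return out.toByteArray();
    }

    // Decompress one block; the buffer grows on demand and is not cached afterwards.
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(DEFAULT_BUFFER_SIZE);
        try (OutputStream inflating = new InflaterOutputStream(out)) {
            inflating.write(compressed);
        }
        return out.toByteArray();
    }
}
```

Because the buffer is local to each call, a single unusually large block does not pin a large array for the life of the application. The trade-off is one allocation per block.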



--
This message was sent by Atlassian Jira
(v8.3.4#803005)