Posted to dev@parquet.apache.org by "Andrew Baranec (Jira)" <ji...@apache.org> on 2022/09/04 04:26:00 UTC

[jira] [Created] (PARQUET-2184) Improve SnappyCompressor buffer expansion performance

Andrew Baranec created PARQUET-2184:
---------------------------------------

             Summary: Improve SnappyCompressor buffer expansion performance
                 Key: PARQUET-2184
                 URL: https://issues.apache.org/jira/browse/PARQUET-2184
             Project: Parquet
          Issue Type: Improvement
          Components: parquet-mr
    Affects Versions: 1.13.0
            Reporter: Andrew Baranec


The existing implementation of SnappyCompressor allocates only enough bytes to hold the data passed into setInput(). This leads to suboptimal performance when a pattern of writes causes repeated buffer expansions. In the worst case, it must copy the entire buffer on every single invocation of setInput().

Instead of allocating a buffer of exactly current size + write length, there should be an expansion strategy that reduces the amount of copying required.
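
One possible expansion strategy is geometric growth: when the buffer is too small, at least double its capacity rather than sizing it to fit exactly. The following is a minimal sketch of that idea, not the actual SnappyCompressor code. The class GrowableInputBuffer and its methods are hypothetical names used for illustration, and a heap ByteBuffer is assumed for simplicity.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical illustration of geometric buffer growth.
 * This is NOT the parquet-mr SnappyCompressor implementation.
 */
final class GrowableInputBuffer {
    // Assumption: a heap buffer that starts empty, to keep the sketch simple.
    private ByteBuffer buffer = ByteBuffer.allocate(0);
    private int reallocations = 0;

    /** Appends len bytes from src, growing the buffer geometrically if needed. */
    void append(byte[] src, int off, int len) {
        // addExact throws ArithmeticException instead of silently overflowing.
        int required = Math.addExact(buffer.position(), len);
        if (required > buffer.capacity()) {
            // At least double the capacity, but never less than this write needs.
            long doubled = Math.min(2L * buffer.capacity(), Integer.MAX_VALUE);
            int newCapacity = (int) Math.max(required, doubled);

            ByteBuffer grown = ByteBuffer.allocate(newCapacity);
            buffer.flip();     // switch to reading the bytes written so far
            grown.put(buffer); // the only full copy, made once per expansion
            buffer = grown;
            reallocations++;
        }
        buffer.put(src, off, len);
    }

    int size() { return buffer.position(); }

    int capacity() { return buffer.capacity(); }

    int reallocations() { return reallocations; }
}
```

With this approach, 1,024 consecutive one-byte appends trigger 11 reallocations, whereas an exact-fit strategy would reallocate and copy on every one of the 1,024 calls. The growth factor of 2 is an arbitrary choice for the sketch; a smaller factor trades more copying for less unused capacity.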



--
This message was sent by Atlassian Jira
(v8.20.10#820010)