Posted to issues-all@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2023/04/28 19:31:00 UTC

[jira] [Created] (IMPALA-12108) Add support for writing data with LZ4's high compression mode

Joe McDonnell created IMPALA-12108:
--------------------------------------

             Summary: Add support for writing data with LZ4's high compression mode
                 Key: IMPALA-12108
                 URL: https://issues.apache.org/jira/browse/IMPALA-12108
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 4.3.0
            Reporter: Joe McDonnell


LZ4 has a high compression mode (LZ4_HC) that achieves higher compression ratios than Snappy while keeping decompression fast. The tradeoff is that compression is much slower. We should add support for writing data with LZ4's high compression mode so that we can measure both write and read performance.

See this benchmark on the LZ4 page:

https://github.com/lz4/lz4#benchmarks

In my manual tests, Parquet/LZ4 is about 13% smaller than Parquet/Snappy while retaining the fast decompression.
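
As a rough illustration of what the writer side might need, here is a minimal sketch of parsing a codec setting that distinguishes regular LZ4 from the high compression mode. Everything in it is hypothetical: the "LZ4_HC" and "LZ4_HC:<level>" spellings, the Lz4WriteConfig struct, and ParseLz4Codec are made up for this example and are not existing Impala names. The level bounds are assumptions modeled on the LZ4HC_CLEVEL_DEFAULT and LZ4HC_CLEVEL_MAX constants in lz4hc.h and should be checked against the bundled LZ4 version.

```cpp
#include <string>

// Hypothetical description of which LZ4 path a writer should take.
struct Lz4WriteConfig {
  bool high_compression;  // false: regular LZ4, true: LZ4 high compression
  int level;              // only meaningful when high_compression is true
};

// Assumed level bounds for the high compression mode (see lz4hc.h).
constexpr int kLz4HcMinLevel = 1;
constexpr int kLz4HcDefaultLevel = 9;
constexpr int kLz4HcMaxLevel = 12;

// Parses "LZ4", "LZ4_HC", or "LZ4_HC:<level>" into 'out'.
// Returns false (leaving 'out' untouched) for any other input.
bool ParseLz4Codec(const std::string& spec, Lz4WriteConfig* out) {
  if (spec == "LZ4") {
    *out = Lz4WriteConfig{false, 0};
    return true;
  }
  const std::string prefix = "LZ4_HC";
  if (spec.compare(0, prefix.size(), prefix) != 0) return false;
  if (spec.size() == prefix.size()) {
    // No explicit level given: fall back to the assumed default.
    *out = Lz4WriteConfig{true, kLz4HcDefaultLevel};
    return true;
  }
  if (spec[prefix.size()] != ':') return false;
  const std::string digits = spec.substr(prefix.size() + 1);
  if (digits.empty() || digits.size() > 2) return false;
  int level = 0;
  for (char c : digits) {
    if (c < '0' || c > '9') return false;
    level = level * 10 + (c - '0');
  }
  if (level < kLz4HcMinLevel || level > kLz4HcMaxLevel) return false;
  *out = Lz4WriteConfig{true, level};
  return true;
}
```

With a config like this, a writer could call LZ4_compress_HC() with the chosen level when high_compression is set and LZ4_compress_default() otherwise. The decompression path would not need to change, since both modes produce the same LZ4 block format.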



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
