Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/04/08 20:48:00 UTC

[jira] [Commented] (IMPALA-9998) Investigate updating zstd version

    [ https://issues.apache.org/jira/browse/IMPALA-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317482#comment-17317482 ] 

ASF subversion and git services commented on IMPALA-9998:
---------------------------------------------------------

Commit d7cc510c95c4850190ca02ae1397aef95cde3d98 in impala's branch refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d7cc510 ]

IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions

This updates several compression libraries to their latest versions:
 - Bzip2 1.0.8
 - LZ4 1.9.3
 - Snappy 1.1.8
 - Zlib 1.2.11
 - ZStd 1.4.9
Several of these claim minor performance improvements.

Testing:
 - Ran the release exhaustive job and the debug core job.
 - Ran TPC-H at scale factor 42 with Parquet/Snappy and Parquet/ZSTD
   (ZSTD tests used the default compression level).
   Parquet/Snappy performance was unchanged; Parquet/ZSTD improved:

+----------+------------------------+---------+------------+------------+----------------+
| Workload | File Format            | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+------------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / zstd / block | 8.50    | -2.10%     | 5.46       | -2.63%         |
+----------+------------------------+---------+------------+------------+----------------+
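
For readers unfamiliar with the columns in the table above, here is a minimal sketch of how an average, a geometric mean, and their percentage deltas can be computed from per-query runtimes. The function names and the sample timings are hypothetical; they are not taken from Impala's benchmark tooling or from this test run.

```python
import math


def avg(times):
    """Arithmetic mean of per-query runtimes in seconds."""
    return sum(times) / len(times)


def geomean(times):
    """Geometric mean, computed in log space so long queries do not dominate."""
    return math.exp(sum(math.log(t) for t in times) / len(times))


def delta_pct(new, base):
    """Relative change in percent; a negative value means the new run is faster."""
    return (new - base) / base * 100.0


# Hypothetical per-query runtimes (seconds) before and after a library upgrade.
baseline = [2.0, 8.0, 16.0]
upgraded = [1.9, 7.8, 15.9]

print(f"Delta(Avg):     {delta_pct(avg(upgraded), avg(baseline)):+.2f}%")
print(f"Delta(GeoMean): {delta_pct(geomean(upgraded), geomean(baseline)):+.2f}%")
```

The geometric mean weights each query's relative change equally, which is presumably why it is reported next to the plain average.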

Change-Id: I858f82f773023bd0aea14543f18bd74071758468
Reviewed-on: http://gerrit.cloudera.org:8080/17254
Reviewed-by: Joe McDonnell <jo...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Investigate updating zstd version
> ---------------------------------
>
>                 Key: IMPALA-9998
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9998
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.0
>            Reporter: Joe McDonnell
>            Priority: Major
>              Labels: native-toolchain
>
> Impala currently uses zstd version 1.4.0. It looks like there are some performance improvements in more recent versions:
> {noformat}
> v1.4.5
> perf: Improved decompression speed: x64 : +10% (clang) / +5% (gcc); ARM : from +15% to +50%, depending on SoC, by @terrelln
> perf: Automatically downsizes ZSTD_DCtx when too large for too long (#2069, by @bimbashreshta)
> perf: Improved fast compression speed on aarch64 (#2040, ~+3%, by @caoyzh)
> perf: Small level 1 compression speed gains (depending on compiler)
> v1.4.4
> perf: Improved decompression speed, by > 10%, by @terrelln
> perf: Better compression speed when re-using a context, by @felixhandte
> perf: Fix compression ratio when compressing large files with small dictionary, by @senhuang42
> perf: zstd reference encoder can generate RLE blocks, by @bimbashrestha
> perf: minor generic speed optimization, by @davidbolvansky
> v1.4.1
> perf: Improve decode speed by ~7% @mgrice (#1668)
> perf: Slightly improved compression ratio of level 3 and 4 (ZSTD_dfast) by @cyan4973 (#1681)
> perf: Slightly faster compression speed when re-using a context by @cyan4973 (#1658)
> perf: Improve compression ratio for small windowLog by @cyan4973 (#1624)
> perf: Faster compression speed in high compression mode for repetitive data by @terrelln (#1635)
> {noformat}
> [https://github.com/facebook/zstd/blob/dev/CHANGELOG]
>  


