Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/04/08 20:48:00 UTC
[jira] [Commented] (IMPALA-9998) Investigate updating zstd version
[ https://issues.apache.org/jira/browse/IMPALA-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317482#comment-17317482 ]
ASF subversion and git services commented on IMPALA-9998:
---------------------------------------------------------
Commit d7cc510c95c4850190ca02ae1397aef95cde3d98 in impala's branch refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d7cc510 ]
IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
This updates several compression libraries to their latest versions:
- Bzip2 1.0.8
- LZ4 1.9.3
- Snappy 1.1.8
- Zlib 1.2.11
- ZStd 1.4.9
Several of these claim minor performance improvements.
Testing:
- Ran release exhaustive job and debug core job
- Ran TPC-H scale 42 with Parquet/Snappy and Parquet/ZSTD.
(ZSTD tests ran with default compression level.)
Parquet/Snappy was unchanged. Parquet/ZSTD improved:
+----------+------------------------+---------+------------+------------+----------------+
| Workload | File Format            | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+------------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / zstd / block | 8.50    | -2.10%     | 5.46       | -2.63%         |
+----------+------------------------+---------+------------+------------+----------------+
Change-Id: I858f82f773023bd0aea14543f18bd74071758468
Reviewed-on: http://gerrit.cloudera.org:8080/17254
Reviewed-by: Joe McDonnell <jo...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
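For readers unfamiliar with the Delta(Avg) and Delta(GeoMean) columns in the results table, the following is a minimal sketch of how such figures could be derived from per-query runtimes. It assumes the deltas are relative changes against a baseline run, which the commit message does not state explicitly. The function names and the timings are hypothetical and are not taken from the Impala perf tooling or from this benchmark.

```python
import math


def mean(times):
    """Arithmetic mean of per-query runtimes in seconds."""
    return sum(times) / len(times)


def geomean(times):
    """Geometric mean of per-query runtimes in seconds (all must be > 0)."""
    return math.exp(sum(math.log(t) for t in times) / len(times))


def relative_delta(new, base):
    """Relative change of `new` against `base`; negative means faster."""
    return (new - base) / base


# Hypothetical per-query runtimes, for illustration only.
baseline_times = [4.0, 8.0, 16.0]
candidate_times = [3.9, 7.8, 15.6]

delta_avg = relative_delta(mean(candidate_times), mean(baseline_times))
delta_geomean = relative_delta(geomean(candidate_times), geomean(baseline_times))

print(f"Delta(Avg):     {delta_avg:+.2%}")
print(f"Delta(GeoMean): {delta_geomean:+.2%}")
```

The geometric mean weights each query's relative change equally, so a few long-running queries dominate it less than they dominate the arithmetic mean. That is one reason a report might show both.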
> Investigate updating zstd version
> ---------------------------------
>
> Key: IMPALA-9998
> URL: https://issues.apache.org/jira/browse/IMPALA-9998
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Joe McDonnell
> Priority: Major
> Labels: native-toolchain
>
> Impala currently uses zstd version 1.4.0. It looks like there are some performance improvements in more recent versions:
> {noformat}
> v1.4.5
> perf: Improved decompression speed: x64 : +10% (clang) / +5% (gcc); ARM : from +15% to +50%, depending on SoC, by @terrelln
> perf: Automatically downsizes ZSTD_DCtx when too large for too long (#2069, by @bimbashreshta)
> perf: Improved fast compression speed on aarch64 (#2040, ~+3%, by @caoyzh)
> perf: Small level 1 compression speed gains (depending on compiler)
> v1.4.4
> perf: Improved decompression speed, by > 10%, by @terrelln
> perf: Better compression speed when re-using a context, by @felixhandte
> perf: Fix compression ratio when compressing large files with small dictionary, by @senhuang42
> perf: zstd reference encoder can generate RLE blocks, by @bimbashrestha
> perf: minor generic speed optimization, by @davidbolvansky
> v1.4.1
> perf: Improve decode speed by ~7% @mgrice (#1668)
> perf: Slightly improved compression ratio of level 3 and 4 (ZSTD_dfast) by @cyan4973 (#1681)
> perf: Slightly faster compression speed when re-using a context by @cyan4973 (#1658)
> perf: Improve compression ratio for small windowLog by @cyan4973 (#1624)
> perf: Faster compression speed in high compression mode for repetitive data by @terrelln (#1635){noformat}
> [https://github.com/facebook/zstd/blob/dev/CHANGELOG]
>