Posted to issues@impala.apache.org by "Michael Smith (Jira)" <ji...@apache.org> on 2022/08/09 23:27:00 UTC

[jira] [Resolved] (IMPALA-11458) Update to newer zlib/zstd

     [ https://issues.apache.org/jira/browse/IMPALA-11458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Smith resolved IMPALA-11458.
------------------------------------
    Resolution: Fixed

> Update to newer zlib/zstd
> -------------------------
>
>                 Key: IMPALA-11458
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11458
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.1.0
>            Reporter: Michael Smith
>            Assignee: Michael Smith
>            Priority: Major
>             Fix For: Impala 4.2.0
>
>
> Update:
>  * zlib 1.2.11 to 1.2.12
>  * zstd 1.4.9 to 1.5.2
>  * -snappy 1.1.8 to 1.1.9- Update: removed due to performance regressions in TPC-H
> h3. zlib
> Changes in 1.2.12 (27 Mar 2022)
> - Cygwin does not have _wopen(), so do not create gzopen_w() there
> - Permit a deflateParams() parameter change as soon as possible
> - Limit hash table inserts after switch from stored deflate
> - Fix bug when window full in deflate_stored()
> - Fix CLEAR_HASH macro to be usable as a single statement
> - Avoid a conversion error in gzseek when off_t type too small
> - Have Makefile return non-zero error code on test failure
> - Avoid some conversion warnings in gzread.c and gzwrite.c
> - Update use of errno for newer Windows CE versions
> - Small speedup to inflate [psumbera]
> - Return an error if the gzputs string length can't fit in an int
> - Add address checking in clang to -w option of configure
> - Don't compute check value for raw inflate if asked to validate
> - Handle case where inflateSync used when header never processed
> - Avoid the use of ptrdiff_t
> - Avoid an undefined behavior of memcpy() in gzappend()
> - Avoid undefined behaviors of memcpy() in gz*printf()
> - Avoid an undefined behavior of memcpy() in _tr_stored_block()
> - Make the names in functions declarations identical to definitions
> - Remove old assembler code in which bugs have manifested
> - Fix deflateEnd() to not report an error at start of raw deflate
> - Add legal disclaimer to README
> - Emphasize the need to continue decompressing gzip members
> - Correct the initialization requirements for deflateInit2()
> - Fix a bug that can crash deflate on some input when using Z_FIXED
> - Assure that the number of bits for deflatePrime() is valid
> - Use a structure to make globals in enough.c evident
> - Use a macro for the printf format of big_t in enough.c
> - Clean up code style in enough.c, update version
> - Use inline function instead of macro for index in enough.c
> - Clarify that prefix codes are counted in enough.c
> - Show all the codes for the maximum tables size in enough.c
> - Add gznorm.c example, which normalizes gzip files
> - Fix the zran.c example to work on a multiple-member gzip file
> - Add tables for crc32_combine(), to speed it up by a factor of 200
> - Add crc32_combine_gen() and crc32_combine_op() for fast combines
> - Speed up software CRC-32 computation by a factor of 1.5 to 3
> - Use atomic test and set, if available, for dynamic CRC tables
> - Don't bother computing check value after successful inflateSync()
> - Correct comment in crc32.c
> - Add use of the ARMv8 crc32 instructions when requested
> - Use ARM crc32 instructions if the ARM architecture has them
> - Explicitly note that the 32-bit check values are 32 bits
> - Avoid adding empty gzip member after gzflush with Z_FINISH
> - Fix memory leak on error in gzlog.c
> - Fix error in comment on the polynomial representation of a byte
> - Clarify gz* function interfaces, referring to parameter names
> - Change macro name in inflate.c to avoid collision in VxWorks
> - Correct typo in blast.c
> - Improve portability of contrib/minizip
> - Fix indentation in minizip's zip.c
> - Replace black/white with allow/block. (theresa-m)
> - minizip warning fix if MAXU32 already defined. (gvollant)
> - Fix unztell64() in minizip to work past 4GB. (Daniël Hörchner)
> - Clean up minizip to reduce warnings for testing
> - Add fallthrough comments for gcc
> - Eliminate use of ULL constants
> - Separate out address sanitizing from warnings in configure
> - Remove destructive aspects of make distclean
> - Check for cc masquerading as gcc or clang in configure
> - Fix crc32.c to compile local functions only if used
> h3. zstd
> v1.5.2 (Jan, 2022)
> perf: Regain Minimal memset()-ing During Reuse of Compression Contexts (@Cyan4973, #2969)
> build: Build Zstd with `noexecstack` on All Architectures (@felixhandte, #2964)
> doc: Clarify Licensing (@terrelln, #2981)
> v1.5.1 (Dec, 2021)
> perf: rebalanced compression levels, to better match the intended speed/level curve, by @senhuang42
> perf: faster huffman decoder, using x64 assembly, by @terrelln
> perf: slightly faster high speed modes (strategies fast & dfast), by @felixhandte
> perf: improved binary size and faster compilation times, by @terrelln
> perf: new row64 mode, used notably in level 12, by @senhuang42
> perf: faster mid-level compression speed in presence of highly repetitive patterns, by @senhuang42
> perf: minor compression ratio improvements for small data at high levels, by @cyan4973
> perf: reduced stack usage (mostly useful for Linux Kernel), by @terrelln
> perf: faster compression speed on incompressible data, by @bindhvo
> perf: on-demand reduced ZSTD_DCtx state size, using build macro ZSTD_DECODER_INTERNAL_BUFFER, at a small cost of performance, by @bindhvo
> build: allows hiding static symbols in the dynamic library, using build macro, by @skitt
> build: support for m68k (Motorola 68000's), by @cyan4973
> build: improved AIX support, by @Helflym
> build: improved meson unofficial build, by @eli-schwartz
> cli : custom memory limit when training dictionary (#2925), by @embg
> cli : report advanced parameters information when compressing in very verbose mode (`-vv`), by @Svetlitski-FB
> v1.5.0 (May 11, 2021)
> api: Various functions promoted from experimental to stable API: (#2579-2581, @senhuang42)
>   `ZSTD_defaultCLevel()`
>   `ZSTD_getDictID_fromCDict()`
> api: Several experimental functions have been deprecated and will emit a compiler warning (#2582, @senhuang42)
>   `ZSTD_compress_advanced()`
>   `ZSTD_compress_usingCDict_advanced()`
>   `ZSTD_compressBegin_advanced()`
>   `ZSTD_compressBegin_usingCDict_advanced()`
>   `ZSTD_initCStream_srcSize()`
>   `ZSTD_initCStream_usingDict()`
>   `ZSTD_initCStream_usingCDict()`
>   `ZSTD_initCStream_advanced()`
>   `ZSTD_initCStream_usingCDict_advanced()`
>   `ZSTD_resetCStream()`
> api: ZSTDMT_NBWORKERS_MAX reduced to 64 for 32-bit environments (@Cyan4973)
> perf: Significant speed improvements for middle compression levels (#2494, @senhuang42 @terrelln)
> perf: Block splitter to improve compression ratio, enabled by default for high compression levels (#2447, @senhuang42)
> perf: Decompression loop refactor, speed improvements on `clang` and for `--long` modes (#2614 #2630, @Cyan4973)
> perf: Reduced stack usage during compression and decompression entropy stage (#2522 #2524, @terrelln)
> bug: Improve setting permissions of created files (#2525, @felixhandte)
> bug: Fix large dictionary non-determinism (#2607, @terrelln)
> bug: Fix non-determinism test failures on Linux i686 (#2606, @terrelln)
> bug: Fix various dedicated dictionary search bugs (#2540 #2586, @senhuang42 @felixhandte)
> bug: Ensure `ZSTD_estimateCCtxSize*()` monotonically increases with compression level (#2538, @senhuang42)
> bug: Fix --patch-from mode parameter bound bug with small files (#2637, @occivink)
> bug: Fix UBSAN error in decompression (#2625, @terrelln)
> bug: Fix superblock compression divide by zero bug (#2592, @senhuang42)
> bug: Make the number of physical CPU cores detection more robust (#2517, @PaulBone)
> doc: Improve `zdict.h` dictionary training API documentation (#2622, @terrelln)
> doc: Note that public `ZSTD_free*()` functions accept NULL pointers (#2521, @animalize)
> doc: Add style guide docs for open source contributors (#2626, @Cyan4973)
> tests: Better regression test coverage for different dictionary modes (#2559, @senhuang42)
> tests: Better test coverage of index reduction (#2603, @terrelln)
> tests: OSS-Fuzz coverage for seekable format (#2617, @senhuang42)
> tests: Test coverage for ZSTD threadpool API (#2604, @senhuang42)
> build: Dynamic library built multithreaded by default (#2584, @senhuang42)
> build: Move `zstd_errors.h` and `zdict.h` to `lib/` root (#2597, @terrelln)
> build: Allow `ZSTDMT_JOBSIZE_MIN` to be configured at compile-time, reduce default to 512KB (#2611, @Cyan4973)
> build: Single file library build script moved to `build/` directory (#2618, @felixhandte)
> build: `ZBUFF_*()` is no longer built by default (#2583, @senhuang42)
> build: Fixed Meson build (#2548, @SupervisedThinking @kloczek)
> build: Fix excessive compiler warnings with clang-cl and CMake (#2600, @nickhutchinson)
> build: Detect presence of `md5` on Darwin (#2609, @felixhandte)
> build: Avoid SIGBUS on armv6 (#2633, @bmwiedmann)
> cli: `--progress` flag added to always display progress bar (#2595, @senhuang42)
> cli: Allow reading from block devices with `--force` (#2613, @felixhandte)
> cli: Fix CLI filesize display bug (#2550, @Cyan4973)
> cli: Fix windows CLI `--filelist` end-of-line bug (#2620, @Cyan4973)
> contrib: Various fixes for linux kernel patch (#2539, @terrelln)
> contrib: Seekable format - Decompression hanging edge case fix (#2516, @senhuang42)
> contrib: Seekable format - New seek table-only API (#2113 #2518, @mdittmer @Cyan4973)
> contrib: Seekable format - Fix seek table descriptor check when loading (#2534, @foxeng)
> contrib: Seekable format - Decompression fix for large offsets (#2594, @azat)
> misc: Automatically published release tarballs available on Github (#2535, @felixhandte)
> h3. snappy
> * Performance improvements.
> * Google Test and Google Benchmark are now bundled in third_party/.
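
The zlib 1.2.12 notes above mention faster crc32_combine() and the new crc32_combine_gen()/crc32_combine_op() pair. As a rough illustration of what a CRC-32 combine computes, here is a minimal Python sketch. It is not zlib's table-driven method and not anything from the Impala change: crc32_combine_reference is a hypothetical helper that feeds len(B) zero bytes through Python's zlib.crc32(), so it runs in time linear in len(B) and only demonstrates the identity.

```python
import zlib


def crc32_combine_reference(crc_a: int, crc_b: int, len_b: int) -> int:
    """Return crc32(A + B) given crc32(A), crc32(B), and len(B).

    Hypothetical reference helper: O(len_b) time, unlike zlib's
    table-driven crc32_combine().
    """
    zeros = bytes(len_b)
    # CRC of A followed by len_b zero bytes, continued from crc32(A).
    shifted_a = zlib.crc32(zeros, crc_a)
    # CRC-32 is affine over GF(2): XOR in crc32(B), then cancel the
    # contribution of the all-zero block with crc32(zeros).
    return shifted_a ^ crc_b ^ zlib.crc32(zeros)


part_a = b"first chunk, "
part_b = b"second chunk"
combined = crc32_combine_reference(
    zlib.crc32(part_a), zlib.crc32(part_b), len(part_b)
)
# Compare against the CRC of the concatenated input.
print(combined == zlib.crc32(part_a + part_b))
```

The helper only needs the two checksums and the length of the second input, which is why a combine operation is useful when chunks are checksummed independently.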



--
This message was sent by Atlassian Jira
(v8.20.10#820010)