Posted to dev@kafka.apache.org by "Xavier Léauté (JIRA)" <ji...@apache.org> on 2017/05/01 23:45:04 UTC
[jira] [Created] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages
Xavier Léauté created KAFKA-5150:
------------------------------------
Summary: LZ4 decompression is 4-5x slower than Snappy on small batches / messages
Key: KAFKA-5150
URL: https://issues.apache.org/jira/browse/KAFKA-5150
Project: Kafka
Issue Type: Bug
Components: consumer
Affects Versions: 0.10.2.0
Reporter: Xavier Léauté
Assignee: Xavier Léauté
I benchmarked RecordsIterator.DeepRecordsIterator instantiation on small batch sizes with small messages, after observing some performance bottlenecks in the consumer.
For batch sizes of 1 with messages of 100 bytes, LZ4 heavily underperforms compared to Snappy (see benchmark below). Most of our time is currently spent allocating memory blocks in KafkaLZ4BlockInputStream, due to the fact that we default to larger 64kB block sizes. Some quick testing shows we could improve performance by almost an order of magnitude for small batches and messages if we re-used buffers between instantiations of the input stream.
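The buffer-reuse idea above could be sketched roughly as follows. This is only a minimal illustration, not the actual patch: the {{BufferPool}} class and its methods are hypothetical names, assuming a simple pool of fixed-size blocks handed back and forth between instantiations of the input stream instead of allocating a fresh 64kB block each time.

```java
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical pool of fixed-size byte[] blocks. An input stream would
// acquire() a block on construction and release() it on close(), so the
// allocation cost is paid once rather than per batch.
public class BufferPool {
    private final int blockSize;
    private final ConcurrentLinkedDeque<byte[]> free = new ConcurrentLinkedDeque<>();

    public BufferPool(int blockSize) {
        this.blockSize = blockSize;
    }

    public byte[] acquire() {
        byte[] block = free.pollFirst();
        // Allocate only when the pool is empty.
        return (block != null) ? block : new byte[blockSize];
    }

    public void release(byte[] block) {
        // Only take back blocks of the expected size.
        if (block.length == blockSize) {
            free.offerFirst(block);
        }
    }
}
```

With a pool like this, repeated single-message batches would hit the recycled block on every iteration after the first, which is where the order-of-magnitude estimate for small batches comes from.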
[Benchmark Code|https://github.com/xvrl/kafka/blob/small-batch-lz4-benchmark/clients/src/test/java/org/apache/kafka/common/record/DeepRecordsIteratorBenchmark.java#L86]
{code}
Benchmark                                          (compressionType)  (messageSize)   Mode  Cnt       Score       Error  Units
DeepRecordsIteratorBenchmark.measureSingleMessage                LZ4            100  thrpt   20   84802.279 ±  1983.847  ops/s
DeepRecordsIteratorBenchmark.measureSingleMessage             SNAPPY            100  thrpt   20  407585.747 ±  9877.073  ops/s
DeepRecordsIteratorBenchmark.measureSingleMessage               NONE            100  thrpt   20  579141.634 ± 18482.093  ops/s
{code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)