Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/24 01:46:40 UTC

[GitHub] [incubator-doris] xiaokang commented on pull request #9747: add zstd compression codec

xiaokang commented on PR #9747:
URL: https://github.com/apache/incubator-doris/pull/9747#issuecomment-1135310442

   Compared to the lz4f codec, the zstd codec produces a compressed size about 35% smaller, takes about 30% less time on the first read (without OS page cache), and takes about 40% more time on the second read (with OS page cache) in the following comparison test.
   
   test data: 25GB text log, 110 million rows
   test table: test_table(ts varchar(30), log string)
   test SQL: set enable_vectorized_engine=1; select sum(length(log)) from test_table
   be.conf: disable_storage_page_cache = true
   This config disables the Doris storage page cache, so that the data is not all cached in memory and the test measures real decompression speed.
   test result:
   
   master branch with lz4f codec result:
   - compressed size: 4.3G
   - SQL first exec time (read data from disk + decompress + little computation): 18.3s
   - SQL second exec time (read data from OS page cache + decompress + little computation): 2.4s
   
   this branch with zstd codec (hardcoded to be enabled) result:
   - compressed size: 2.8G
   - SQL first exec time: 12.8s
   - SQL second exec time: 3.4s
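
   For reference, the percentages in the summary can be rechecked from the figures listed above. This is only a small arithmetic sketch; the names pct_change, lz4f and zstd are made up for illustration and are not part of Doris.

```python
def pct_change(baseline, candidate):
    """Relative change of candidate vs. baseline, in percent (negative = smaller)."""
    return (candidate - baseline) / baseline * 100.0

# Figures copied from the test result above (size in GB, times in seconds).
lz4f = {"size_gb": 4.3, "first_s": 18.3, "second_s": 2.4}
zstd = {"size_gb": 2.8, "first_s": 12.8, "second_s": 3.4}

# Change of zstd relative to lz4f for each metric.
changes = {key: pct_change(lz4f[key], zstd[key]) for key in lz4f}

for key, value in changes.items():
    print(f"{key}: {value:+.1f}%")
# size_gb: -34.9%
# first_s: -30.1%
# second_s: +41.7%
```

   Here "30% faster" and "40% slower" are read as changes in elapsed time relative to the lz4f baseline, not as throughput ratios.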
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

