Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2021/05/10 16:14:32 UTC

[GitHub] [hbase] apurtell commented on pull request #3244: HBASE-25869 WAL value compression

apurtell commented on pull request #3244:
URL: https://github.com/apache/hbase/pull/3244#issuecomment-836903862


   > So we will only compress value?
   
   This is an enhancement to the existing WAL compression. As you know, the existing WAL compression already compresses every other aspect of a WAL entry _except_ for the value. This patch adds support for compressing values too.
   
   > As we will do batching when writing WAL entries out, is it possible to compress when flushing? The data will be larger and compress may perform better. The structure of a WAL file will be multiple compressed blocks.
   
   This is not possible for two reasons:
   
   1. WALCellCodec does not compress the WAL file in blocks. The design is cell by cell. I want to introduce value compression without re-engineering the whole WAL format. Perhaps our WAL file format is due for a redesign, but I would like to see that be a different issue. 
   
   2. It is not necessary to do that. By using the same Deflater instance for the whole WAL we already get the benefit you are thinking of: it builds its dictionary across the whole file. Even though the compressed output is flushed at every cell, we are not resetting the compressor's state. (That would be FULL_FLUSH. We are using SYNC_FLUSH.)
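
To illustrate the second point, here is a minimal sketch using only `java.util.zip`. It is not the WALCellCodec code from this patch. The class `SyncFlushDemo`, its method names, and the sample value are hypothetical and exist only for the demonstration. It compresses each value on its own through one shared `Deflater`, once with `SYNC_FLUSH` and once with `FULL_FLUSH`, so the per-value output sizes can be compared.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class; not part of HBase.
final class SyncFlushDemo {

  // Compress each value with ONE shared Deflater, flushing after every value.
  static byte[][] compressEach(byte[][] values, int flushMode) {
    Deflater deflater = new Deflater();
    byte[][] chunks = new byte[values.length][];
    byte[] buf = new byte[4096];
    try {
      for (int i = 0; i < values.length; i++) {
        deflater.setInput(values[i]);
        ByteArrayOutputStream chunk = new ByteArrayOutputStream();
        int n;
        do {
          // A completely filled buffer means more output may be pending.
          n = deflater.deflate(buf, 0, buf.length, flushMode);
          chunk.write(buf, 0, n);
        } while (n == buf.length);
        chunks[i] = chunk.toByteArray();
      }
    } finally {
      deflater.end();
    }
    return chunks;
  }

  // Decompress the chunks in order with ONE shared Inflater.
  static byte[][] decompressEach(byte[][] chunks) {
    Inflater inflater = new Inflater();
    byte[][] values = new byte[chunks.length][];
    byte[] buf = new byte[4096];
    try {
      for (int i = 0; i < chunks.length; i++) {
        inflater.setInput(chunks[i]);
        ByteArrayOutputStream value = new ByteArrayOutputStream();
        int n;
        do {
          // inflate() returns 0 once this chunk's input is used up.
          n = inflater.inflate(buf);
          value.write(buf, 0, n);
        } while (n > 0);
        values[i] = value.toByteArray();
      }
    } catch (DataFormatException e) {
      throw new IllegalStateException("corrupt compressed data", e);
    } finally {
      inflater.end();
    }
    return values;
  }

  public static void main(String[] args) {
    byte[] v = "row-0001|user=alice|email=alice@example.org|city=Lisbon|plan=premium"
        .getBytes(StandardCharsets.UTF_8);
    byte[][] values = { v, v, v };
    byte[][] sync = compressEach(values, Deflater.SYNC_FLUSH);
    byte[][] full = compressEach(values, Deflater.FULL_FLUSH);
    for (int i = 0; i < values.length; i++) {
      System.out.printf("value %d: raw=%d sync=%d full=%d%n",
          i, v.length, sync[i].length, full[i].length);
    }
  }
}
```

With `SYNC_FLUSH`, the second and third values should come out much smaller than the first, because they can reference data the compressor has already seen. With `FULL_FLUSH`, each value is compressed as if from scratch. In both modes every chunk ends on a byte boundary, so a reader that feeds the chunks in order to a single `Inflater` can recover each value individually.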

