Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2012/05/05 20:25:50 UTC

[jira] [Commented] (HBASE-5891) Change Compression Based on Type of Compaction

    [ https://issues.apache.org/jira/browse/HBASE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269029#comment-13269029 ] 

Andrew Purtell commented on HBASE-5891:
---------------------------------------

It used to be possible (circa 0.90) to use one compression algorithm for flushes and minor compactions and a different one for major compactions. I added this because we had a case under consideration where data would grow colder in proportion to its age, that is, the delta between current time and write time. It was simple and low impact to set flush and minor compaction to LZO and major compaction to BZIP2 (we flirted with LZMA, but its throughput is simply too low), and a script would trigger region-by-region major compaction daily. I don't know if this is maintained in the current code base. Compaction was significantly reworked from 0.90 to 0.92, and we didn't pick up the majority of those changes in our internal version.
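
To make the setup described above concrete, here is a minimal sketch of a per-write-kind codec table: a fast codec for flushes and minor compactions, and a denser one for major compactions. All class, enum, and method names are hypothetical and chosen for illustration only; this is not HBase's API, and it only selects a codec rather than compressing anything.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from HBase.
class CompactionCompressionSketch {
    // The three points at which a store file gets written.
    enum WriteKind { FLUSH, MINOR_COMPACTION, MAJOR_COMPACTION }

    // Codec labels only; no actual compression happens in this sketch.
    enum Codec { NONE, LZO, GZ, BZIP2 }

    private final Map<WriteKind, Codec> codecByKind = new EnumMap<>(WriteKind.class);
    private final Codec fallback;

    // The fallback codec applies to any write kind without an explicit override.
    CompactionCompressionSketch(Codec fallback) {
        this.fallback = fallback;
    }

    // Override the codec for one write kind; returns this for chaining.
    CompactionCompressionSketch set(WriteKind kind, Codec codec) {
        codecByKind.put(kind, codec);
        return this;
    }

    // Look up the codec to use when writing a file of the given kind.
    Codec codecFor(WriteKind kind) {
        return codecByKind.getOrDefault(kind, fallback);
    }

    public static void main(String[] args) {
        // LZO everywhere except major compactions, which use BZIP2.
        CompactionCompressionSketch cfg = new CompactionCompressionSketch(Codec.LZO)
            .set(WriteKind.MAJOR_COMPACTION, Codec.BZIP2);
        for (WriteKind kind : WriteKind.values()) {
            System.out.println(kind + " -> " + cfg.codecFor(kind));
        }
    }
}
```

With a table like this, a daily script that forces a major compaction region by region would gradually rewrite older data with the denser codec, while fresh writes stay on the fast one.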
                
> Change Compression Based on Type of Compaction
> ----------------------------------------------
>
>                 Key: HBASE-5891
>                 URL: https://issues.apache.org/jira/browse/HBASE-5891
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Nicolas Spiegelberg
>            Priority: Minor
>
> We currently use LZO on our production systems because the on-demand decompression speed of GZ is too slow.  That said, many of our major-compacted StoreFiles are infrequently read because of lazy seek optimizations, but they occupy the majority of our disk space.  One idea is to change the type of compression depending upon compaction characteristics (input size or major compaction flag).  This would allow us to have our largest and least-read files be GZ compressed and save space.
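
The selection rule proposed in the description (input size or major compaction flag) could be sketched roughly as follows. The class name, the threshold parameter, and the choice to combine the two conditions with a logical OR are assumptions made for illustration; the issue does not specify them, and nothing here is HBase's API.

```java
// Hypothetical illustration only; names and the threshold are assumptions.
class SizeAwareCodecChooser {
    // Codec labels only; no actual compression happens in this sketch.
    enum Codec { LZO, GZ }

    private final long largeInputThresholdBytes;

    // Compactions whose total input size reaches this threshold are treated as "large".
    SizeAwareCodecChooser(long largeInputThresholdBytes) {
        if (largeInputThresholdBytes < 0) {
            throw new IllegalArgumentException("threshold must be non-negative");
        }
        this.largeInputThresholdBytes = largeInputThresholdBytes;
    }

    // Use GZ for major compactions or large inputs, LZO otherwise.
    Codec choose(boolean isMajorCompaction, long totalInputBytes) {
        if (isMajorCompaction || totalInputBytes >= largeInputThresholdBytes) {
            return Codec.GZ;
        }
        return Codec.LZO;
    }
}
```

For example, `new SizeAwareCodecChooser(1L << 30).choose(false, 64L << 20)` selects LZO for a small minor compaction, while any major compaction selects GZ regardless of size.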
