Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2012/12/10 01:47:21 UTC

[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

    [ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527669#comment-13527669 ] 

stack commented on HBASE-5778:
------------------------------

reopen may not be too bad.  You have to explain the difference between a reopen and a getReader somewhere... as it stands there is none.  I don't think it would take much to explain why you'd reopen.  Would 'reset' be a better name, as in 'resetting the reader'?  What resetting actually does is implementation specific: if it is a compressed WAL, then we'd reopen the file; if it is not compressed, the reset is a noop -- right?
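
A rough sketch of the reset idea, to make the question concrete.  Everything named here (WalReader, PlainWalReader, CompressedWalReader, WalResetDemo) is made up for illustration and is not the HLog.Reader API or anything in the patch.  The "file" is an in-memory list and the "compression" is a toy dictionary scheme, assumed only so the reopen has some state to rebuild.

```java
import java.util.ArrayList;
import java.util.List;

/** Small driver so the sketch can be run on its own. */
class WalResetDemo {
    public static void main(String[] args) {
        // Toy "compressed WAL": a literal adds to the dictionary, "#n" refers back to entry n.
        List<String> file = new ArrayList<>(List.of("rowA", "#0", "rowB"));
        CompressedWalReader reader = new CompressedWalReader(file);
        while (reader.next() != null) { /* drain what has been written so far */ }

        file.add("#1");   // the writer appends while we are tailing the file
        reader.reset();   // reopen and rebuild the dictionary up to the old position
        System.out.println(reader.next());
    }
}

/** Hypothetical interface, for illustration only. */
interface WalReader {
    /** Returns the next decoded entry, or null if nothing more is available yet. */
    String next();

    /** Prepares the reader to continue after the underlying file has grown. */
    void reset();
}

/** Uncompressed case: entries are self-contained, so reset has nothing to do. */
class PlainWalReader implements WalReader {
    private final List<String> file;
    private int position = 0;

    PlainWalReader(List<String> file) { this.file = file; }

    public String next() {
        return position < file.size() ? file.get(position++) : null;
    }

    public void reset() {
        // Intentionally a no-op in this sketch.
    }
}

/** Compressed case: decoding depends on a dictionary, so reset reopens and replays. */
class CompressedWalReader implements WalReader {
    private final List<String> file;
    private final List<String> dictionary = new ArrayList<>();
    private int position = 0;
    int opens = 1; // how many times the "file" has been opened

    CompressedWalReader(List<String> file) { this.file = file; }

    public String next() {
        if (position >= file.size()) {
            return null;
        }
        String raw = file.get(position++);
        if (raw.startsWith("#")) {
            // Back-reference into the dictionary built from earlier entries.
            return dictionary.get(Integer.parseInt(raw.substring(1)));
        }
        dictionary.add(raw);
        return raw;
    }

    public void reset() {
        int target = position;  // remember where we were
        opens++;                // simulate reopening the file
        position = 0;
        dictionary.clear();
        while (position < target) {
            next();             // replay to rebuild the dictionary
        }
    }
}
```

With a shape like this the caller only ever calls reset() and does not need to know whether the WAL is compressed, which is the distinction from getReader that would need documenting.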

ReplicationHLogReader does not implement the WAL HLog.Reader interface.  Should it?

This javadoc is on the wrong method:

+   * if a positionToSkipTo was specified, this method will take care of seeking there

I think this patch is almost there.



                
> Turn on WAL compression by default
> ----------------------------------
>
>                 Key: HBASE-5778
>                 URL: https://issues.apache.org/jira/browse/HBASE-5778
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.96.0
>
>         Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude bigger than the keys), the insert time wasn't different and the CPU usage was 15% higher (150% CPU usage vs. 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement in the insert run time, and CPU usage was 33% higher (600% CPU usage vs. 450%). I'm not sure WAL compression accounts for all the additional CPU usage; it might just be that we're able to insert faster and so spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values).
> Those are two extremes, but they show that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs.
