Posted to issues@hbase.apache.org by "Eiichi Sato (JIRA)" <ji...@apache.org> on 2019/03/28 16:21:00 UTC

[jira] [Commented] (HBASE-15545) org.apache.hadoop.io.compress.DecompressorStream allocates too much memory

    [ https://issues.apache.org/jira/browse/HBASE-15545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804066#comment-16804066 ] 

Eiichi Sato commented on HBASE-15545:
-------------------------------------

We hit the same issue and found that it gets more serious when Lz4Codec or SnappyCodec is used (both allocate 256 KiB by default for every decompress(), far more than GzipCodec's default of 4 KiB), or when the compressed BlockCache is enabled (where every read needs to decompress() cached blocks). In our case, DecompressorStream accounts for about 48% of memory allocations in a compaction thread, and more than half of the RegionServer's memory allocations in total. The allocation rate was too high, and we suffered occasional allocation stalls with ZGC.

https://github.com/eiiches/hbase/commit/ad1ec4081b0ec9af5e20befaa1d09d0852e60d02
https://github.com/eiiches/hadoop/commit/e3337840b6e34236342c039b8a0b9fb9fcccfa40

We applied these patches to our cluster and saw a 60-70% reduction in allocation rate. My approach is to cache DecompressorStream instances "weakly" in a ThreadLocal and reuse them. A WeakReference is used so that the cache is not retained too long, because I thought many people (especially those who don't use the compressed BlockCache) would prefer to keep heap usage to a minimum at the cost of slightly more frequent re-allocations.
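
To illustrate the idea, here is a minimal sketch of a per-thread weak cache. It is not the code from the patches linked above: the class name WeakThreadLocalCache and its factory parameter are hypothetical, and a plain byte[] stands in for the DecompressorStream and its buffer.

```java
import java.lang.ref.WeakReference;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not from the linked patches): keeps one instance
 * per thread behind a WeakReference and re-creates it on demand.
 */
final class WeakThreadLocalCache<T> {
    // Each thread gets its own slot, so no locking is needed.
    private final ThreadLocal<WeakReference<T>> slot = new ThreadLocal<>();
    private final Supplier<T> factory;

    WeakThreadLocalCache(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the cached instance, or a new one if the GC cleared it. */
    T get() {
        WeakReference<T> ref = slot.get();
        T value = (ref == null) ? null : ref.get();
        if (value == null) {
            // First use on this thread, or the referent was collected.
            value = factory.get();
            slot.set(new WeakReference<>(value));
        }
        return value;
    }
}
```

A caller would create the cache once, for example new WeakThreadLocalCache<>(() -> new byte[256 * 1024]), and call get() before each decompress instead of allocating a new buffer. Because the reference is weak, the cached object can be collected once no caller holds it strongly, which trades some re-allocation for a smaller retained heap.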

What do you think? Since this requires a change to hadoop-common, I plan to propose the Hadoop part of the change to the Hadoop community as a first step, if you like this fix.

> org.apache.hadoop.io.compress.DecompressorStream allocates too much memory
> --------------------------------------------------------------------------
>
>                 Key: HBASE-15545
>                 URL: https://issues.apache.org/jira/browse/HBASE-15545
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Major
>         Attachments: image-2019-03-29-01-20-56-863.png
>
>
> It accounts for ~ 11% of overall memory allocation during compaction when compression (GZ) is enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)