Posted to issues@hbase.apache.org by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org> on 2013/08/06 12:23:49 UTC

[jira] [Commented] (HBASE-7391) Review/improve HLog compression's memory consumption

    [ https://issues.apache.org/jira/browse/HBASE-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730596#comment-13730596 ] 

ramkrishna.s.vasudevan commented on HBASE-7391:
-----------------------------------------------

Going through this again while working on WAL compression for tags.
The OOME described above is most likely to happen when we are creating recovered.edits and when we are reading recovered edits back.
When creating recovered.edits HLog files we instantiate a Writer per region, and each Writer creates a compression context, so all five types of dictionaries are created per region.  In the normal case (where the HLog is instantiated once for the RS) this could be OK.
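To make that multiplication concrete, here is a minimal sketch of the pattern (hypothetical, simplified class names; not the actual HBase source): each writer's context eagerly sets up five dictionaries, sized to match the 32*5*32767 math quoted below.

{code:java}
// Sketch only: hypothetical names, mirroring the per-writer pattern above.
final class LruDictionary {
    // Doubly-linked LRU node; roughly 32 bytes per instance on a 64-bit
    // JVM, per the figure reported in this issue.
    static final class Node {
        byte[] contents;
        Node prev;
        Node next;
    }

    private final Node[] indexToNodes;

    LruDictionary(int capacity) {
        // Only the reference array is allocated up front; as the dictionary
        // fills, the Node cost approaches capacity * 32 bytes.
        this.indexToNodes = new Node[capacity];
    }
}

final class CompressionContextSketch {
    static final int DEFAULT_CAPACITY = Short.MAX_VALUE; // 32767

    // Five dictionaries per context, and one context per region being
    // replayed -- this product is what blows up the footprint.
    final LruDictionary regionDict    = new LruDictionary(DEFAULT_CAPACITY);
    final LruDictionary tableDict     = new LruDictionary(DEFAULT_CAPACITY);
    final LruDictionary rowDict       = new LruDictionary(DEFAULT_CAPACITY);
    final LruDictionary familyDict    = new LruDictionary(DEFAULT_CAPACITY);
    final LruDictionary qualifierDict = new LruDictionary(DEFAULT_CAPACITY);
}
{code}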
So in the case of recovered edits we can easily save on the initialization part. We know that for recovered edits the region name and table name are always constant, because each recovered.edits file knows which region and which table it is for.
Similarly, the family can be moved out of this array, which is 1024 entries in size by default, assuming the number of families is generally not huge.
We still need to think about the qualifier dictionary for wider tables.
The row dictionary can stay as it is implemented now, and tags could go the same way as rows.  For the row and tag parts we could identify a better way to optimize the memory usage.
So having a different implementation of the dictionary for recovered.edits creation, as said above, would directly reduce the memory consumption.
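A rough sketch of what such a recovered.edits-specific context could look like (hypothetical names, reusing the LruDictionary sketch above; the 256-entry family capacity is an illustrative assumption, not a tuned value):

{code:java}
// Hypothetical recovered.edits variant: region and table are constant for
// the whole file, so they need no dictionary at all.
final class RecoveredEditsCompressionContextSketch {
    // Stored once per file instead of occupying a default-sized dictionary.
    private final byte[] regionName;
    private final byte[] tableName;

    // Families are few, so a much smaller dictionary should suffice
    // (256 is an assumption for illustration).
    final LruDictionary familyDict = new LruDictionary(256);

    // Rows and tags keep the existing LRU approach for now.
    final LruDictionary rowDict = new LruDictionary(Short.MAX_VALUE);
    final LruDictionary tagDict = new LruDictionary(Short.MAX_VALUE);

    RecoveredEditsCompressionContextSketch(byte[] regionName, byte[] tableName) {
        this.regionName = regionName.clone();
        this.tableName = tableName.clone();
    }
}
{code}

With region, table and family out of the default-sized arrays, the fixed per-region cost drops from five full dictionaries to roughly two (rows and tags).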
                
> Review/improve HLog compression's memory consumption
> ----------------------------------------------------
>
>                 Key: HBASE-7391
>                 URL: https://issues.apache.org/jira/browse/HBASE-7391
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.95.2
>
>
> From Ram in http://mail-archives.apache.org/mod_mbox/hbase-dev/201205.mbox/%3C00bc01cd31e6$7caf1320$760d3960$%25vasudevan@huawei.com%3E:
> {quote}
> One small observation after giving +1 on the RC.
> The WAL compression feature causes OOMEs and full GCs.
> The problem is: we have 1500 regions and I need to create recovered.edits
> for each of the regions (I don't have much data in the regions, ~300MB).
> Now when I try to build the dictionary, Node objects get created.
> Each node object occupies 32 bytes.
> We have 5 such dictionaries.
> Initially we create the indexToNodes array, and its size is 32767.
> So now we have 32*5*32767 = ~5MB.
> Now I have 1500 regions.
> So 5MB*1500 = ~7GB (excluding actual data).  This seems to be a very high
> initial memory footprint; it never allows me to split the logs, and I
> am not able to bring the cluster up at all.
> Our configured heap size was 8GB, tested on a 3 node cluster with 5000
> regions and very little data (~1GB in the HDFS cluster including
> replication), with some small data spread evenly across all regions.
> The formula is 32 (Node object size) * 5 (number of dictionaries) * 32767
> (number of Node objects) * number of regions.
> {quote}
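For reference, the arithmetic in the quote checks out; a small self-contained snippet (using the sizes reported above, not fresh measurements) reproduces it:

{code:java}
public class WalDictFootprint {
    public static void main(String[] args) {
        long nodeBytes = 32;        // reported size of one Node object
        long dictionaries = 5;      // region, table, row, family, qualifier
        long nodesPerDict = 32767;  // initial indexToNodes size
        long regions = 1500;

        long perContext = nodeBytes * dictionaries * nodesPerDict;
        System.out.printf("per context: ~%.1f MB%n",
                perContext / (1024.0 * 1024));                  // ~5.0 MB
        System.out.printf("for %d regions: ~%.1f GB%n", regions,
                perContext * regions / (1024.0 * 1024 * 1024)); // ~7.3 GB
    }
}
{code}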

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira