Posted to issues@hbase.apache.org by "Bharath Vissapragada (Jira)" <ji...@apache.org> on 2020/08/24 18:24:00 UTC

[jira] [Commented] (HBASE-24859) Remove the empty regions from the hbase mapreduce splits

    [ https://issues.apache.org/jira/browse/HBASE-24859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183523#comment-17183523 ] 

Bharath Vissapragada commented on HBASE-24859:
----------------------------------------------

Can you add some data that gives insight into the memory usage? For example: the Xmx limit on the client JVM, the number of regions, key lengths, and the top 5-10 contributors by percentage from a heap dump analysis. I'm wondering whether a few simple optimizations, such as deduplicating via interning and avoiding unnecessary copies, would give a reasonable improvement in memory usage.
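
To make the interning idea concrete, here is a minimal sketch in plain Java. It is not HBase code: the Pool class and the host names are hypothetical, and it assumes that some per-split values (for example a region server host name) repeat across many splits. Whether that holds here is exactly what the heap dump would show.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InternSketch {
    // Hypothetical pool (not an HBase class): maps each value to one
    // canonical instance so equal values share a single object on the heap.
    public static final class Pool<T> {
        private final Map<T, T> canonical = new HashMap<>();

        public T intern(T value) {
            T existing = canonical.putIfAbsent(value, value);
            return existing != null ? existing : value;
        }

        public int size() {
            return canonical.size();
        }
    }

    public static void main(String[] args) {
        Pool<String> pool = new Pool<>();
        List<String> hostsPerSplit = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            // new String(...) simulates a freshly allocated copy per split.
            String host = new String("rs-" + (i % 10) + ".example.com");
            hostsPerSplit.add(pool.intern(host));
        }
        // 1000 splits now reference only 10 distinct String objects.
        System.out.println(pool.size()); // prints 10
    }
}
```

The pool only helps if the duplicated values are immutable and actually repeated, so it is worth confirming the duplication in the heap dump first.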

> Remove the empty regions from the hbase mapreduce splits
> --------------------------------------------------------
>
>                 Key: HBASE-24859
>                 URL: https://issues.apache.org/jira/browse/HBASE-24859
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: Sandeep Pal
>            Assignee: Sandeep Pal
>            Priority: Major
>
> It has been observed that when a table has too many regions, MR jobs consume more memory in the client. This is because we keep region-level information in memory, and the memory-heavy object is TableSplit because it carries a Scan object.
> We can optimize memory consumption by not loading the region-level information when the region is empty.
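
The proposal in the description can be sketched roughly as follows, in plain Java rather than HBase code. RegionStub and nonEmpty are hypothetical names used only for illustration, and the sketch assumes that a reported size of zero bytes means the region is empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EmptyRegionFilterSketch {
    // Hypothetical stand-in for per-region metadata (not an HBase class).
    public static final class RegionStub {
        public final String name;
        public final long sizeBytes;

        public RegionStub(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    // Keep only regions whose reported size is non-zero, so no split
    // object is built (and held in client memory) for an empty region.
    public static List<RegionStub> nonEmpty(List<RegionStub> regions) {
        List<RegionStub> kept = new ArrayList<>();
        for (RegionStub region : regions) {
            if (region.sizeBytes > 0) {
                kept.add(region);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<RegionStub> regions = Arrays.asList(
            new RegionStub("region-a", 0L),
            new RegionStub("region-b", 4096L),
            new RegionStub("region-c", 0L));
        for (RegionStub region : nonEmpty(regions)) {
            System.out.println(region.name);
        }
    }
}
```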



--
This message was sent by Atlassian Jira
(v8.3.4#803005)