Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/03/10 21:08:59 UTC

[jira] Resolved: (HBASE-3551) Loaded hfile indexes occupy a good chunk of heap; look into shrinking the amount used and/or evicting unused indices

     [ https://issues.apache.org/jira/browse/HBASE-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3551.
--------------------------

    Resolution: Won't Fix

Ok.  Closing.  Will reference your comment, Marc, over in HBASE-25, etc.  I also added a section to schema design on keeping rows and column family names small.  Thanks for digging in, boss.

  <section xml:id="keysize">
      <title>Try to minimize row and column sizes</title>
      <para>In HBase, values are always freighted with their coordinates; as a
          cell value passes through the system, it'll be accompanied by its
          row, column name, and timestamp.  Always.  If your rows and column names
          are large, especially compared to the size of the cell value, then
          you may run up against some interesting scenarios.  One such is
          the case described by Marc Limotte at the tail of
          <link xlink:href="https://issues.apache.org/jira/browse/HBASE-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=13005272#comment-13005272">HBASE-3551</link>
          (recommended!).
          Therein, the indices that are kept on HBase storefiles (<link linkend="hfile">HFile</link>s)
          to facilitate random access may end up occupying large chunks of the RAM
          allotted to HBase because the cell value coordinates are large.
          In the comment cited above, Marc suggests either increasing the block size so
          that entries in the store file index occur at larger intervals, or
          modifying the table schema so that it has smaller row and column
          names.
      </para>
  </section>
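
To put rough numbers on the tradeoff described in the section above, here is a back-of-the-envelope sketch. It assumes a simplified model in which the store file index holds one entry per data block and each entry is about the size of one key (row + column name + timestamp). The function name, the 1000-byte key size, and the per-entry overhead parameter are hypothetical, chosen only for illustration; this is not HBase's actual heap accounting and it does not attempt to reproduce the 2.5G figure from the issue.

```python
def estimate_index_bytes(store_bytes, block_bytes, key_bytes, entry_overhead_bytes=0):
    """Rough index size under the assumed model: one index entry per data block.

    store_bytes          -- total size of the store file data being indexed
    block_bytes          -- HFile block size (64 KiB is the HBase default)
    key_bytes            -- assumed average key size (hypothetical input)
    entry_overhead_bytes -- assumed fixed per-entry overhead (hypothetical input)
    """
    blocks = -(-store_bytes // block_bytes)  # ceiling division: partial blocks still get an entry
    return blocks * (key_bytes + entry_overhead_bytes)


if __name__ == "__main__":
    KIB = 2**10
    GIB = 2**30
    data = 100 * GIB  # e.g. 100 regions of roughly 1G each

    # Compare large vs. small keys, and default vs. doubled block size.
    for block in (64 * KIB, 128 * KIB):
        for key in (1000, 100):
            size = estimate_index_bytes(data, block, key)
            print(f"block={block // KIB}K key={key}B -> ~{size / 1e6:.0f} MB of index")
```

Under this model the index grows linearly with key size and shrinks in proportion to block size, which is why both of Marc's suggestions (bigger blocks, smaller row and column names) attack the same number from different sides.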

> Loaded hfile indexes occupy a good chunk of heap; look into shrinking the amount used and/or evicting unused indices
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3551
>                 URL: https://issues.apache.org/jira/browse/HBASE-3551
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>
> I hung with a user Marc and we were looking over configs and his cluster profile up on ec2.  One thing we noticed was that his 100+ 1G regions of two families had ~2.5G of heap resident.  We did a bit of math and couldn't get to 2.5G so that needs looking into.  Even still, 2.5G is a bunch of heap to give over to indices (He actually OOME'd when he had his RS heap set to just 3G; we shouldn't OOME, we should just run slower).  It sounds like he needs the indices loaded but still, for some cases we should drop indices for unaccessed files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira