Posted to issues@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2010/07/13 02:52:51 UTC

[jira] Commented: (HBASE-2832) Priorities and multi-threading for MemStore flushing

    [ https://issues.apache.org/jira/browse/HBASE-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887606#action_12887606 ] 

Jonathan Gray commented on HBASE-2832:
--------------------------------------

Class comment from the new FlushHandler, which gives a rundown of flushing in HBase:

{noformat}
/**
 * Flushes are currently triggered when:
 *
 * <ol>
 * <li>MemStore size of an HRegion is > hbase.hregion.memstore.flush.size
 *     (checked after every memstore insertion)
 * </li>
 * <li>Sum of MemStores of an HRegionServer is > total.memstore.heap
 *     (checked after every memstore insertion)
 * </li>
 * <li>Number of HLogs of an HRegionServer is > max.hlogs
 *     (checked after every hlog roll?) TODO: Verify
 * </li>
 * <li>HRegion is being closed
 *     (when receiving message from master)
 * </li>
 * <li>HRegionServer is being quiesced
 *     (when receiving message from master)
 * </li>
 * <li>Client manually triggers flush of an HRegion
 *     (when receiving message from master)
 * </li>
 * <li>MemStore size of an HRegion is > memstore.flush.size *
 *     hbase.hregion.memstore.multiplier
 *     (checked before every memstore insertion)
 * </li>
 * </ol>
 *
 * There are 3 different types of flushes that correspond to these 7 events:
 *
 * <ol>
 * <li>
 *    Low Priority Flush
 *
 *    This occurs in response to #1.
 *
 *    This is the lowest priority flush and does not need any tricks.  All other
 *    flush types should be completed before any of this type are done.  Its one
 *    optimization is that if it determines that a compaction would be triggered
 *    once the flush finishes, it should cancel the flush and instead trigger a
 *    CompactAndFlush.
 * </li>
 * <li>
 *    High Priority Flush
 *
 *    This occurs in response to #2, #3, #6, and #7.
 *
 *    High priority flushes occur in response to memory pressure, WAL pressure,
 *    or because a user has asked for the flush.  These flushes should occur
 *    before any low priority flushes are processed.  They are special only
 *    because of their priority; otherwise the implementation of the flush is
 *    identical to a low priority flush.
 *
 *    This flush type explicitly does not contain the CompactAndFlush check
 *    because it wants to flush as fast as possible.
 * </li>
 * <li>
 *    High Priority Double Flush
 *
 *    TODO: Region closing currently does flushing in-band rather than through
 *          the flush queue.  Should move those into using handlers once we have
 *          a blocking call.  Therefore, double-flush priority is not currently
 *          used.
 *
 *    TODO: Do we want a separate priority here once we do use this for closes?
 *          Or are they just high priority flushes?  Could the first one even
 *          be a low priority flush and the second one high priority?
 *
 *    This occurs in response to #4.
 *
 *    When an HRegion is being closed (but not when a cluster is being quiesced)
 *    we want to minimize the amount of time the region is unavailable.  To do
 *    this we do a double flush.  A flush is done, then the region is closed,
 *    then an additional flush is done before the region is available to be
 *    re-opened.  This should happen when the Master asks a region to close
 *    because of reassignment.
 *
 *    This may also occur before splits to reduce the amount of time the parent
 *    is offline before the daughters come back online.
 * </li>
 * </ol>
 */
{noformat}
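
To make the mapping in the class comment concrete, here is a minimal Java sketch of the seven trigger events and the flush type each one is assigned. Every name in it (FlushTriggerSketch, FlushType, FlushTrigger, checkRegionSize) is hypothetical and not taken from the patch. The comment does not assign a flush type to #5 (quiesce), so the sketch marks it UNASSIGNED rather than guessing. The size check covers only the two per-region thresholds (#1 and #7).

```java
/** Hypothetical sketch; none of these names come from the HBASE-2832 patch. */
final class FlushTriggerSketch {

    /** The three flush types from the class comment, plus a placeholder. */
    enum FlushType { LOW_PRIORITY, HIGH_PRIORITY, HIGH_PRIORITY_DOUBLE, UNASSIGNED }

    /** The seven trigger events, numbered as in the class comment. */
    enum FlushTrigger {
        REGION_OVER_FLUSH_SIZE(1, FlushType.LOW_PRIORITY),
        SERVER_OVER_TOTAL_MEMSTORE_HEAP(2, FlushType.HIGH_PRIORITY),
        SERVER_OVER_MAX_HLOGS(3, FlushType.HIGH_PRIORITY),
        REGION_CLOSING(4, FlushType.HIGH_PRIORITY_DOUBLE),
        // The class comment does not say which flush type #5 maps to.
        SERVER_QUIESCING(5, FlushType.UNASSIGNED),
        CLIENT_REQUEST(6, FlushType.HIGH_PRIORITY),
        REGION_OVER_BLOCKING_SIZE(7, FlushType.HIGH_PRIORITY);

        final int number;
        final FlushType type;

        FlushTrigger(int number, FlushType type) {
            this.number = number;
            this.type = type;
        }
    }

    /**
     * Per-region size check for events #1 and #7.
     *
     * @param memstoreBytes  current MemStore size of the region
     * @param flushSizeBytes value of hbase.hregion.memstore.flush.size
     * @param multiplier     value of hbase.hregion.memstore.multiplier
     * @return the trigger that fired, or null if the region is under both limits
     */
    static FlushTrigger checkRegionSize(long memstoreBytes, long flushSizeBytes, int multiplier) {
        // Check the larger (blocking) threshold first so it wins over #1.
        if (memstoreBytes > flushSizeBytes * multiplier) {
            return FlushTrigger.REGION_OVER_BLOCKING_SIZE;
        }
        if (memstoreBytes > flushSizeBytes) {
            return FlushTrigger.REGION_OVER_FLUSH_SIZE;
        }
        return null;
    }
}
```

For example, with a flush size of 64 and a multiplier of 2, a MemStore of size 100 would fire #1 (low priority), while a MemStore of size 200 would fire #7 (high priority).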

> Priorities and multi-threading for MemStore flushing
> ----------------------------------------------------
>
>                 Key: HBASE-2832
>                 URL: https://issues.apache.org/jira/browse/HBASE-2832
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>            Priority: Critical
>             Fix For: 0.90.0
>
>
> Similar to HBASE-1476 and HBASE-2646, which are for compactions, but this does the same for flushes.
> Flushing when we hit the normal flush size is a low priority flush.  Other types of flushes (heap pressure, blocking client requests, etc) are high priority.
> Should have a tunable number of concurrent flushes.
> Will use the {{HBaseExecutorService}} and {{HBaseEventHandler}} introduced by the master/zk changes.
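
One possible shape for "high priority before low priority, with a tunable number of concurrent flushes" is a fixed-size thread pool fed by a priority queue. The sketch below uses plain java.util.concurrent classes as a stand-in. It does not use HBaseExecutorService or HBaseEventHandler, and the names PriorityFlushPoolSketch, FlushTask and newFlushPool are hypothetical.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; a stand-in for the real executor service. */
final class PriorityFlushPoolSketch {

    // Lower value = dequeued first.
    static final int HIGH = 0;
    static final int LOW = 1;

    /** A flush request that orders by priority, then by submission order. */
    static final class FlushTask implements Runnable, Comparable<FlushTask> {
        private static final AtomicLong SEQUENCE = new AtomicLong();

        final int priority;
        final long sequence = SEQUENCE.getAndIncrement();
        private final Runnable work;

        FlushTask(int priority, Runnable work) {
            this.priority = priority;
            this.work = work;
        }

        @Override
        public void run() {
            work.run();
        }

        @Override
        public int compareTo(FlushTask other) {
            int byPriority = Integer.compare(priority, other.priority);
            // Ties keep FIFO order within the same priority.
            return byPriority != 0 ? byPriority : Long.compare(sequence, other.sequence);
        }
    }

    /**
     * Builds a pool that runs at most concurrentFlushes flushes at once.
     * Tasks must be handed over with execute(), not submit(): submit() wraps
     * them in a FutureTask, which is not Comparable and would make the
     * priority queue throw ClassCastException.
     */
    static ThreadPoolExecutor newFlushPool(int concurrentFlushes) {
        return new ThreadPoolExecutor(
                concurrentFlushes, concurrentFlushes,
                0L, TimeUnit.MILLISECONDS,
                new PriorityBlockingQueue<Runnable>());
    }
}
```

Note that the priority ordering only applies to tasks that are waiting in the queue. While the pool still has idle capacity, a task starts immediately regardless of its priority, and a low priority flush that is already running is not preempted by a later high priority one.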
