
Posted to issues@hbase.apache.org by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2016/06/15 01:10:11 UTC

[jira] [Issue Comment Deleted] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

     [ https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-16030:
----------------------------------
    Comment: was deleted

(was: We should already be doing jitter in PeriodicMemstoreFlusher: 
{code}
if (((HRegion)r).shouldFlush(whyFlush)) {
  FlushRequester requester = server.getFlushRequester();
  if (requester != null) {
    long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
    LOG.info(getName() + " requesting flush of " +
      r.getRegionInfo().getRegionNameAsString() + " because " +
      whyFlush.toString() +
      " after random delay " + randomDelay + "ms");
    // Throttle the flushes by putting a delay. If we don't throttle, and there
    // is a balanced write-load on the regions in a table, we might end up
    // overwhelming the filesystem with too many flushes at once.
    requester.requestDelayedFlush(r, randomDelay, false);
  }
}
{code}

Do you mean the delayed flush with jitter is not working? The range of delay is 5 minutes, so is 2.5 minutes of jitter not enough? )
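
For reference, the delay arithmetic in the snippet above can be sketched in isolation. This is only an illustration, not the HBase source: the constant values are assumptions (the 5-minute range comes from the comment above, and the minimum delay of 0 is a guess), the class name is hypothetical, and java.util.Random stands in for RandomUtils.

```java
import java.util.Random;

public class FlushDelaySketch {
    // Assumed values: the comment above mentions a 5-minute range; the minimum is a guess.
    static final int RANGE_OF_DELAY_MS = 5 * 60 * 1000;
    static final int MIN_DELAY_TIME_MS = 0;

    /** Returns a uniformly distributed delay in [MIN_DELAY_TIME_MS, MIN_DELAY_TIME_MS + RANGE_OF_DELAY_MS). */
    static long randomDelay(Random rng) {
        return rng.nextInt(RANGE_OF_DELAY_MS) + MIN_DELAY_TIME_MS;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int samples = 100_000;
        long sum = 0;
        for (int i = 0; i < samples; i++) {
            sum += randomDelay(rng);
        }
        // With a uniform draw over a 5-minute range, the mean is expected to be near 2.5 minutes.
        System.out.printf("average delay: %.1f s%n", sum / (double) samples / 1000.0);
    }
}
```

Under these assumptions, the flushes of regions that become eligible in the same check are spread over at most the 5-minute range.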

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike
> --------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16030
>                 URL: https://issues.apache.org/jira/browse/HBASE-16030
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 1.2.1
>            Reporter: Tianying Chang
>            Assignee: Tianying Chang
>             Fix For: 2.0.0, 1.3.0, 1.2.2
>
>         Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed a memstore flush spike every hour across all regions/RS (we use the default memstore periodic flush time of 1 hour).
> This happens when two conditions are met:
> 1. the memstore does not have enough data to be flushed before the 1-hour limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at the same time when the cluster starts).
> With these two conditions, all the regions are flushed at around the same time, at startTime+1hour-delay, again and again.
> We added a flush jitter time to randomize the flush time of each region, so that they don't get flushed at around the same time. We had this feature running in our 94.7 and 94.26 clusters. Recently we upgraded to 1.2 and found the issue is still present there, so we are porting the change to the 1.2 branch.
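
The effect described above can be illustrated with a small, self-contained sketch. It is not the attached patch: the class, method names, and the way the jitter is drawn are hypothetical, and it only models the next flush time of each region, assuming a 1-hour period and a configurable maximum jitter.

```java
import java.util.Random;

public class FlushJitterSketch {
    static final long PERIOD_MS = 60 * 60 * 1000L; // periodic flush interval of 1 hour

    /** Next flush times when every region flushes exactly one period after it opened. */
    static long[] withoutJitter(long[] openTimes) {
        long[] flushTimes = new long[openTimes.length];
        for (int i = 0; i < openTimes.length; i++) {
            flushTimes[i] = openTimes[i] + PERIOD_MS;
        }
        return flushTimes;
    }

    /** Same as above, plus a per-region random offset in [0, maxJitterMs). */
    static long[] withJitter(long[] openTimes, long maxJitterMs, Random rng) {
        long[] flushTimes = new long[openTimes.length];
        for (int i = 0; i < openTimes.length; i++) {
            long jitter = (long) (rng.nextDouble() * maxJitterMs);
            flushTimes[i] = openTimes[i] + PERIOD_MS + jitter;
        }
        return flushTimes;
    }

    /** Width of the window (max - min) in which all flushes fall. */
    static long spread(long[] times) {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (long t : times) {
            min = Math.min(min, t);
            max = Math.max(max, t);
        }
        return max - min;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        long[] openTimes = new long[1000];
        for (int i = 0; i < openTimes.length; i++) {
            openTimes[i] = rng.nextInt(1000); // all regions opened within about one second
        }
        System.out.println("spread without jitter (ms): " + spread(withoutJitter(openTimes)));
        System.out.println("spread with jitter (ms):    "
            + spread(withJitter(openTimes, 20 * 60 * 1000L, rng)));
    }
}
```

Without jitter, the flush window is exactly as narrow as the window in which the regions were opened, which is the spike described in the issue. With a per-region offset, the flushes are spread over up to the chosen jitter range.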



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)