You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "nihed mbarek (JIRA)" <ji...@apache.org> on 2017/07/27 15:00:05 UTC

[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

    [ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103307#comment-16103307 ] 

nihed mbarek commented on HBASE-18451:
--------------------------------------

After investigation, I discover that the check is already implemented on the call requestDelayedFlush and it's more an issue with logs that arrive before the insert.

{code}
requester.requestDelayedFlush(r, randomDelay, false);
{code}
is calling 
{code}
  @Override
  public void requestDelayedFlush(Region r, long delay, boolean forceFlushAllStores) {
    synchronized (regionsInQueue) {
      if (!regionsInQueue.containsKey(r)) {
        // This entry has some delay
        FlushRegionEntry fqe = new FlushRegionEntry(r, forceFlushAllStores);
        fqe.requeue(delay);
        this.regionsInQueue.put(r, fqe);
        this.flushQueue.add(fqe);
      }
    }
  }
{code}

My solution is to do a review of our interface FlushRequester to change return of requestFlush and requestDelayedFlush from void to boolean, true if our region is added on the queue, false if not. 
And the result will be 
{code}
@Override
    protected void chore() {
      final StringBuffer whyFlush = new StringBuffer();
      for (Region r : this.server.onlineRegions.values()) {
        if (r == null) continue;
        if (((HRegion)r).shouldFlush(whyFlush)) {
          FlushRequester requester = server.getFlushRequester();
          if (requester != null) {
            long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
            //Throttle the flushes by putting a delay. If we don't throttle, and there
            //is a balanced write-load on the regions in a table, we might end up
            //overwhelming the filesystem with too many flushes at once.
            if (requester.requestDelayedFlush(r, randomDelay, false)) {
              LOG.info(getName() + " requesting flush of " +
                      r.getRegionInfo().getRegionNameAsString() + " because " +
                      whyFlush.toString() +
                      " after random delay " + randomDelay + "ms");
            }
          }
        }
      }
    }
{code}

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-18451
>                 URL: https://issues.apache.org/jira/browse/HBASE-18451
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 2.0.0-alpha-1
>            Reporter: Jean-Marc Spaggiari
>
> If you run a big job every 4 hours, impacting many tables (they have 150 regions per server), ad the end all the regions might have some data to be flushed, and we want, after one hour, trigger a periodic flush. That's totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a "randomDelay" to the delayed flush, that way we spread them away.
> RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, which is very good.
> However, because we don't check if there is already a request in the queue, 10 seconds after, we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point, you will end up having a small one, and the flush will be triggered almost immediatly.
> As a result, instead of spreading all the flush within the next 5 minutes, you end-up getting them all way more quickly. Like within the first minute. Which not only feed the queue to to many flush requests, but also defeats the purpose of the randomDelay.
> {code}
>     @Override
>     protected void chore() {
>       final StringBuffer whyFlush = new StringBuffer();
>       for (Region r : this.server.onlineRegions.values()) {
>         if (r == null) continue;
>         if (((HRegion)r).shouldFlush(whyFlush)) {
>           FlushRequester requester = server.getFlushRequester();
>           if (requester != null) {
>             long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>             LOG.info(getName() + " requesting flush of " +
>               r.getRegionInfo().getRegionNameAsString() + " because " +
>               whyFlush.toString() +
>               " after random delay " + randomDelay + "ms");
>             //Throttle the flushes by putting a delay. If we don't throttle, and there
>             //is a balanced write-load on the regions in a table, we might end up
>             //overwhelming the filesystem with too many flushes at once.
>             requester.requestDelayedFlush(r, randomDelay, false);
>           }
>         }
>       }
>     }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 283034ms
> 2017-07-24 18:45:43,933 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 84328ms
> 2017-07-24 18:45:53,866 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 72291ms
> 2017-07-24 18:46:03,329 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 62658ms
> 2017-07-24 18:46:14,084 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 87491ms
> 2017-07-24 18:46:23,538 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 290170ms
> 2017-07-24 18:46:33,353 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 228541ms
> 2017-07-24 18:46:43,359 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 211814ms
> 2017-07-24 18:46:54,150 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 155167ms
> 2017-07-24 18:47:03,546 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 115318ms
> 2017-07-24 18:47:13,350 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 216662ms
> 2017-07-24 18:47:23,329 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 96018ms
> 2017-07-24 18:47:33,891 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 144751ms
> 2017-07-24 18:47:43,329 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 152461ms
> 2017-07-24 18:47:53,832 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 97352ms
> 2017-07-24 18:48:03,348 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 263910ms
> 2017-07-24 18:48:13,353 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 151793ms
> 2017-07-24 18:48:23,340 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 61926ms
> 2017-07-24 18:48:33,724 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 188901ms
> 2017-07-24 18:48:44,038 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 233148ms
> 2017-07-24 18:48:53,624 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 6206ms
> 2017-07-24 18:49:03,378 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 240946ms
> 2017-07-24 18:49:04,125 INFO org.apache.hadoop.hbase.regionserver.HRegion: Flushing 1/1 column families, memstore=160 B
> 2017-07-24 18:49:04,182 INFO org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, sequenceid=6, memsize=160, hasBloomFilter=true, into tmp file hdfs://hbasetest1.domainname.com:8020/hbase/data/default/testflush/578c27d2eb7ef0ad437bf2ff38c053ae/.tmp/3609340d58ae4ca190e62c3b0fe415f4
> 2017-07-24 18:49:04,208 INFO org.apache.hadoop.hbase.regionserver.HStore: Added hdfs://hbasetest1.domainname.com:8020/hbase/data/default/testflush/578c27d2eb7ef0ad437bf2ff38c053ae/f/3609340d58ae4ca190e62c3b0fe415f4, entries=1, sequenceid=6, filesize=1015
> 2017-07-24 18:49:04,212 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~160 B/160, currentsize=0 B/0 for region testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. in 87ms, sequenceid=6, compaction requested=false
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)