You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2018/06/01 20:40:00 UTC

[jira] [Updated] (HBASE-20628) SegmentScanner does over-comparing when one flushing

     [ https://issues.apache.org/jira/browse/HBASE-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20628:
--------------------------
    Attachment: HBASE-20628.branch-2.0.003.patch

> SegmentScanner does over-comparing when one flushing
> ----------------------------------------------------
>
>                 Key: HBASE-20628
>                 URL: https://issues.apache.org/jira/browse/HBASE-20628
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: stack
>            Priority: Critical
>             Fix For: 2.0.1
>
>         Attachments: HBASE-20628.branch-2.0.001 (1).patch, HBASE-20628.branch-2.0.001.patch, HBASE-20628.branch-2.0.001.patch, HBASE-20628.branch-2.0.002.patch, HBASE-20628.branch-2.0.003.patch, Screen Shot 2018-05-25 at 9.38.00 AM.png, hits-20628.png
>
>
> Flushing memstore is taking too long. It looks like we are doing a bunch of comparing out of a new facility in hbase2, the Segment scanner at flush time.
> Below is a patch from [~anoop.hbase]. I had a similar more hacky version. Both undo the extra comparing we were seeing in perf tests.
> [~anastas] and [~eshcar]. Need your help please.
> As I read it, we are trying to flush the memstore snapshot (default, no IMC case). There is only ever going to be one Segment involved (even if IMC is enabled); the snapshot Segment. But the getScanners is returning a list (of one)  Scanners and the scan is via the generic SegmentScanner which is all about a bunch of stuff we don't need when doing a flush so it seems to do more work than is necessary. It also supports scanning backwards which is not needed when trying to flush memstore.
> Do you see a problem doing a version of Anoops patch (whether IMC or not)? It makes a big difference in general throughput when the below patch is in place. Thanks.
> {code}
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> index cbd60e5da3..c3dd972254 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> @@ -40,7 +40,8 @@ public class MemStoreSnapshot implements Closeable {
>      this.cellsCount = snapshot.getCellsCount();
>      this.memStoreSize = snapshot.getMemStoreSize();
>      this.timeRangeTracker = snapshot.getTimeRangeTracker();
> -    this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +    //this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +    this.scanners = snapshot.getScannersForSnapshot();
>      this.tagsPresent = snapshot.isTagsPresent();
>    }
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> index 70074bf3b4..279c4e50c8 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> @@ -33,6 +33,7 @@ import org.apache.hadoop.hbase.KeyValueUtil;
>  import org.apache.hadoop.hbase.io.TimeRange;
>  import org.apache.hadoop.hbase.util.Bytes;
>  import org.apache.hadoop.hbase.util.ClassSize;
> +import org.apache.hadoop.hbase.util.CollectionBackedScanner;
>  import org.apache.yetus.audience.InterfaceAudience;
>  import org.slf4j.Logger;
>  import org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting;
> @@ -130,6 +131,10 @@ public abstract class Segment {
>      return Collections.singletonList(new SegmentScanner(this, readPoint, order));
>    }
> +  public List<KeyValueScanner> getScannersForSnapshot() {
> +    return Collections.singletonList(new CollectionBackedScanner(this.cellSet.get(), comparator));
> +  }
> +
>    /**
>     * @return whether the segment has any cells
>     */
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)