
Posted to dev@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2018/12/12 19:55:00 UTC

[jira] [Resolved] (HBASE-19034) Implement "optimize SEEK to SKIP" in storefile scanner

     [ https://issues.apache.org/jira/browse/HBASE-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-19034.
-----------------------------------
    Resolution: Won't Fix

Closing as "Won't Fix": it turns out that doing this optimization at the StoreFile (or HFile) scanner level misses the most important opportunity for optimization - it's too far down the stack.

> Implement "optimize SEEK to SKIP" in storefile scanner
> ------------------------------------------------------
>
>                 Key: HBASE-19034
>                 URL: https://issues.apache.org/jira/browse/HBASE-19034
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Guanghao Zhang
>            Priority: Major
>
> {code}
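>   protected boolean trySkipToNextRow(Cell cell) throws IOException {
>     Cell nextCell = null;
>     do {
>       // Re-read on every iteration: the top scanner of the heap may have changed.
>       Cell nextIndexedKey = getNextIndexedKey();
>       // Skip only if the next row can still start in the currently loaded block.
>       if (nextIndexedKey != null && nextIndexedKey != KeyValueScanner.NO_NEXT_INDEXED_KEY
>           && matcher.compareKeyForNextRow(nextIndexedKey, cell) >= 0) {
>         this.heap.next();
>         ++kvsScanned;
>       } else {
>         // Give up skipping; the caller falls back to a real seek.
>         return false;
>       }
>     // Keep skipping while the next cell still belongs to the same row.
>     } while ((nextCell = this.heap.peek()) != null && CellUtil.matchingRows(cell, nextCell));
>     return true;
>   }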
> {code}
> When the SQM (ScanQueryMatcher) returns SEEK_NEXT_ROW, the store scanner seeks to the first cell of the next row. HBASE-13109 optimized this SEEK into a SKIP when the target cell can be read from the currently loaded block: the scanner skips to the next row by calling heap.next() repeatedly. The problem is that it compares against nextIndexedKey on every iteration of the while loop. We plan to move the comparison outside the loop to reduce the number of comparisons. One difficulty is that nextIndexedKey may change when heap.peek() is called, because the current storefile scanner may have changed. So my proposal is to move the "optimize SEEK to SKIP" logic into the storefile scanner. When seek is called on a storefile scanner, it may either perform a real seek or implement the seek as several skips.
> Any suggestions are welcome. Thanks.
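
For illustration only, here is a minimal, self-contained sketch of the idea proposed in the description: a single-file scanner that checks the next indexed key once per seek and then either skips within the current block or does a real seek. It is not HBase code. The class SeekToSkipSketch, the FileScanner type, its fields, and the modelling of cells as plain int row ids in fixed-size blocks are all assumptions made for this example.

```java
public class SeekToSkipSketch {

    /** Hypothetical single-file scanner: each cell is just its row id, sorted, in fixed-size blocks. */
    static final class FileScanner {
        final int[] rows;      // row id of every cell, in sorted order
        final int blockSize;   // cells per block (assumption: fixed size)
        int pos = 0;           // index of the current cell
        int indexChecks = 0;   // comparisons against the next indexed key
        int skips = 0;         // cells passed over with next()-style skipping
        int seeks = 0;         // real seeks performed

        FileScanner(int[] rows, int blockSize) {
            this.rows = rows;
            this.blockSize = blockSize;
        }

        /** Row of the current cell, or null when the scanner is exhausted. */
        Integer peek() {
            return pos < rows.length ? rows[pos] : null;
        }

        /** Row of the first cell of the next block, or null if the current block is the last one. */
        Integer nextIndexedRow() {
            int nextBlockStart = (pos / blockSize + 1) * blockSize;
            return nextBlockStart < rows.length ? rows[nextBlockStart] : null;
        }

        /** Move to the first cell of the next row; must only be called while peek() != null. */
        void seekToNextRow() {
            int row = rows[pos];
            Integer indexed = nextIndexedRow();
            indexChecks++; // one check per seek call, not one per skipped cell
            if (indexed != null && indexed > row) {
                // The next block starts in a later row, so the rest of this row
                // lies in the current block: skipping is enough.
                while (pos < rows.length && rows[pos] == row) {
                    pos++;
                    skips++;
                }
            } else {
                // The row may continue into the next block (or this is the last block): real seek.
                pos = lowerBound(row + 1);
                seeks++;
            }
        }

        /** Binary search for the first cell at or after pos whose row is >= targetRow. */
        private int lowerBound(int targetRow) {
            int lo = pos;
            int hi = rows.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (rows[mid] < targetRow) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            return lo;
        }
    }

    public static void main(String[] args) {
        FileScanner s = new FileScanner(new int[] {1, 1, 1, 2, 2, 3, 3, 3, 3, 4}, 4);
        while (s.peek() != null) {
            System.out.println("at row " + s.peek());
            s.seekToNextRow();
        }
        System.out.println("indexChecks=" + s.indexChecks + " skips=" + s.skips + " seeks=" + s.seeks);
    }
}
```

Hoisting the check is only safe here because a single file scanner's next indexed key stays fixed while it remains in one block. In the store scanner's heap, the top scanner can change between calls, which is the difficulty the description mentions.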



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)