Posted to issues@hbase.apache.org by "Jingyun Tian (JIRA)" <ji...@apache.org> on 2018/04/19 02:10:00 UTC

[jira] [Comment Edited] (HBASE-18059) The scanner order for memstore scanners are wrong

    [ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443467#comment-16443467 ] 

Jingyun Tian edited comment on HBASE-18059 at 4/19/18 2:09 AM:
---------------------------------------------------------------

[~stack] The scanner order is used only when the kvComparator cannot decide which of two cells is bigger, which means the two cells have the same key and the same seqId. But in the memstore every cell has its own seqId, so the comparison never reaches the scanner order. [~appy] explained this very clearly:
{quote} - memstore scanners vs storefile(SF) scanner:
 -- if cell in SF has seqId: Its seqId should be less than that of memstore's cell. Memstore scanner will win.
 -- if cell in SF does NOT have seqId: SF's cell [defaults to seqId=0|https://github.com/apache/hbase/blob/e65d8653e566bbbae03578a1f9ad858cabcb48bc/hbase-common/src/main/java/org/apache/hadoop/hbase/Cell.java#L142], memstore scanner will win. (Note that the seqId of cells in hfiles is removed on major compaction if older than a certain time; the default is 5 days)
 - memstore vs bulk loaded file(BLF)
 -- memstore cells will have a higher seqId, so memstore scanner will win.{quote}

P.S. There should never be two cells with the same key and the same seqId, so the scanner order is useless here.
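
To make the tie-break order above concrete, here is a minimal sketch of the comparison. It is a simplified model, not HBase's actual code: SimpleCell, SimpleScanner and compare are hypothetical names, and the key is reduced to a plain String. It assumes a higher seqId sorts first and that scanner order is consulted only when key and seqId are both equal.

```java
/** Hypothetical, simplified model of the heap comparison; not HBase's real classes. */
final class ScannerTieBreakSketch {

  /** A cell reduced to a comparable key and a sequence id (assumption: String key). */
  static final class SimpleCell {
    final String key;
    final long seqId;

    SimpleCell(String key, long seqId) {
      this.key = key;
      this.seqId = seqId;
    }
  }

  /** A scanner reduced to the cell it currently points at and its scanner order. */
  static final class SimpleScanner {
    final SimpleCell current;
    final long scannerOrder;

    SimpleScanner(SimpleCell current, long scannerOrder) {
      this.current = current;
      this.scannerOrder = scannerOrder;
    }
  }

  /** Returns a negative value if scanner a should be served before scanner b. */
  static int compare(SimpleScanner a, SimpleScanner b) {
    // 1. Compare by key.
    int c = a.current.key.compareTo(b.current.key);
    if (c != 0) {
      return c;
    }
    // 2. Same key: the cell with the higher seqId comes first.
    c = Long.compare(b.current.seqId, a.current.seqId);
    if (c != 0) {
      return c;
    }
    // 3. Same key and same seqId: only now does scanner order matter (higher wins).
    return Long.compare(b.scannerOrder, a.scannerOrder);
  }
}
```

With a memstore cell at seqId 7 and a store file cell at seqId 3 for the same key, step 2 already decides in favor of the memstore scanner, whatever the two scanner orders are.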


> The scanner order for memstore scanners are wrong
> -------------------------------------------------
>
>                 Key: HBASE-18059
>                 URL: https://issues.apache.org/jira/browse/HBASE-18059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, scan, Scanners
>    Affects Versions: 2.0.0
>            Reporter: Duo Zhang
>            Assignee: Jingyun Tian
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-18059.master.001.patch
>
>
> This is the Javadoc comment for KeyValueScanner.getScannerOrder:
> {code:title=KeyValueScanner.java}
>   /**
>    * Get the order of this KeyValueScanner. This is only relevant for StoreFileScanners and
>    * MemStoreScanners (other scanners simply return 0). This is required for comparing multiple
>    * files to find out which one has the latest data. StoreFileScanners are ordered from 0
>    * (oldest) to newest in increasing order. MemStoreScanner gets LONG.max since it always
>    * contains freshest data.
>    */
>   long getScannerOrder();
> {code}
> Since we may now have multiple memstore scanners, I think the right way to assign scanner order to memstore scanners is to count down from Long.MAX_VALUE.
> But in CompactingMemStore and DefaultMemStore, the scanner order for memstore scanners also starts from 0, which collides with the order of the StoreFileScanners.
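
The numbering proposed in the description could look roughly like the following sketch. The class and method names are hypothetical and this is not the attached patch. It assumes index 0 of the memstore scanners is the newest segment and receives Long.MAX_VALUE, while store file scanners keep counting up from 0 (oldest).

```java
/** Hypothetical illustration of the proposed scanner order assignment. */
final class ScannerOrderAssignmentSketch {

  /** Store file scanners: 0 for the oldest file, increasing towards the newest. */
  static long[] storeFileOrders(int count) {
    long[] orders = new long[count];
    for (int i = 0; i < count; i++) {
      orders[i] = i;
    }
    return orders;
  }

  /**
   * Memstore scanners: count down from Long.MAX_VALUE.
   * Assumption: index 0 is the newest segment and gets Long.MAX_VALUE.
   */
  static long[] memStoreOrders(int count) {
    long[] orders = new long[count];
    for (int i = 0; i < count; i++) {
      orders[i] = Long.MAX_VALUE - i;
    }
    return orders;
  }
}
```

Under this scheme every memstore scanner order stays above every store file scanner order for any realistic number of scanners, so the two ranges cannot collide the way two ranges that both start at 0 do.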


