You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Zheng Hu (JIRA)" <ji...@apache.org> on 2019/01/21 07:36:00 UTC

[jira] [Comment Edited] (HBASE-21734) Some optimization in FilterListWithOR

    [ https://issues.apache.org/jira/browse/HBASE-21734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747339#comment-16747339 ] 

Zheng Hu edited comment on HBASE-21734 at 1/21/19 7:36 AM:
-----------------------------------------------------------

bq. So the old impl is slow because KeyValueUtil.toNewKeyCell(currentCell)?
Branch-1 after HBASE-21620 was slow because  we need to merge each sub-filter's RC ([~KarthickRam]'s case , there ten sub-filters, the diff time cost was amplified),  while before patch in HBASE-21620,  we will skip many RC merging (but the logic was wrong, as we has discussed before). 

So, here, one thing I can do,  is removing the KeyValueUtil#toNewKeyCell.  [~anoop.hbase] suggested that it can save some gc because if we copy key part of cell into a single byte[], then then block the cell refering won't be refered by the filter list any more,  the upper layer can GC the data block quickly. 

While after HBASE-21620,  we will update the prevCellList for every encountered cell now,  so the lifecycle of cell in prevCellList for FilterList will be quite shorter.  so just use the cell ref  for saving cpu. 


was (Author: openinx):
bq. So the old impl is slow because KeyValueUtil.toNewKeyCell(currentCell)?
Branch-1 after HBASE-21620 was slow because  we need to merge each sub-filter's RC ([~KarthickRam]'s case , there ten sub-filters, the diff time cost was amplified),  while before patch in HBASE-21620,  we will skip many RC merging (but the logic was wrong, as we has discussed before). 

So, here, one thing I can do,  is removing the KeyValueUtil#toNewKeyCell.  [~anoopjohnson] suggested that it can save some gc because if we copy key part of cell into a single byte[], then then block the cell refering won't be refered by the filter list any more,  the upper layer can GC the data block quickly. 

While after HBASE-21620,  we will update the prevCellList for every encountered cell now,  so the lifecycle of cell in prevCellList for FilterList will be quite shorter.  so just use the cell ref  for saving cpu. 

> Some optimization in FilterListWithOR
> -------------------------------------
>
>                 Key: HBASE-21734
>                 URL: https://issues.apache.org/jira/browse/HBASE-21734
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>         Attachments: HBASE-21734.branch-1.v1.patch, HBASE-21734.v1.patch, columnkey.txt, perf-ut.patch
>
>
> In HBASE-21620,   [~KarthickRam] and [~mohamed.meeran]  complaind that their performance of filter list has been degraded after that patch in here [1].  
> I wrote a UT for this, and test under my host.  It's true.   I gussed there may be two reasons: 
> 1.  the comparator.compare(nextKV, cell) > 0 StoreScanner; 
> 2.  the filter list concated by OR will choose the minimal forward step among all sub-filters. in this patch, we have stricter restrictions on all sub filters, include those sub-filter whose has non-null RC returned in calculateReturnCodeByPrevCellAndRC (previously, we will skip to merge this sub-filter's rc, but it's wrong in some case), and merge all of the sub-filter's RC, this is also some time cost.
> The former one seems not the main problem, because the UT still cost ~ 3s even if I comment the compare.  the second one has some impact indeed, because after i skip to merge the sub-filters's RC if calculateReturnCodeByPrevCellAndRC return a non-null rc,  the UT cost ~ 1s,  it's improvement but the logic is not wrong.
> 1. https://issues.apache.org/jira/browse/HBASE-21620?focusedCommentId=16737100&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16737100



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)