You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "Chen Luo (JIRA)" <ji...@apache.org> on 2017/10/06 23:26:00 UTC
[jira] [Updated] (ASTERIXDB-2128) Bloom Filter for LSMBTree is totally useless

     [ https://issues.apache.org/jira/browse/ASTERIXDB-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chen Luo updated ASTERIXDB-2128:
--------------------------------
    Description: 
We use bloom filters on primary index (as LSMBTree) to skip unnecessary pk lookups on some disk components. The idea is that if the bloom filter reports false on some pk, we totally skip searching that disk component.

However, in our current complicated BTree/LSMBTree software stack, the bloom filter is completely useless (in terms of saving I/Os). Take a look at LSMBTreePointSearchCursor. At line 74, we first call btreeAccessors[i].search(rangeCursors[i], predicate), and then call BloomFilterAwareBTreePointSearchCursor.hasNext() which performs bloom filter check. However, the call sequence is wrong. Actually, btreeAccessors[i].search has already searched the primary key against the BTree, from the root down to the leaf, rendering bloom filter completely useless in this case.


  was:
We use bloom filters on primary index (as LSMBTree) to skip unnecessary pk lookups on some disk components. The idea is that if the bloom filter reports false on some pk, we totally skip searching that disk component.

However, in our current complicated BTree/LSMBTree software stack, the bloom filter is completely useless (in terms of saving I/Os). Take a look at LSMBTreePointSearchCursor. At line 74, we first call btreeAccessors[i].search(rangeCursors[i], predicate), and then call BloomFilterAwareBTreePointSearchCursor.hasNext() which performs bloom filter check. However, the call sequence is wrong. Actually, btreeAccessors[i].search has already searched the primary key against the BTree, from the root down to the leaf, rending bloom filter completely useless in this case.



> Bloom Filter for LSMBTree is totally useless
> --------------------------------------------
>
>                 Key: ASTERIXDB-2128
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2128
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: IDX - Indexes, STO - Storage
>            Reporter: Chen Luo
>            Assignee: Chen Luo
>
> We use bloom filters on primary index (as LSMBTree) to skip unnecessary pk lookups on some disk components. The idea is that if the bloom filter reports false on some pk, we totally skip searching that disk component.
> However, in our current complicated BTree/LSMBTree software stack, the bloom filter is completely useless (in terms of saving I/Os). Take a look at LSMBTreePointSearchCursor. At line 74, we first call btreeAccessors[i].search(rangeCursors[i], predicate), and then call BloomFilterAwareBTreePointSearchCursor.hasNext() which performs bloom filter check. However, the call sequence is wrong. Actually, btreeAccessors[i].search has already searched the primary key against the BTree, from the root down to the leaf, rendering bloom filter completely useless in this case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)