Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2010/05/07 22:15:01 UTC

[jira] Updated: (HBASE-1200) Add bloomfilters

     [ https://issues.apache.org/jira/browse/HBASE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-1200:
---------------------------------------

               Status: Patch Available  (was: In Progress)
    Affects Version/s: 0.20.5
        Fix Version/s: 0.20.5
                           (was: 0.21.0)

Static bloom filter implementation for 0.20.5.  See the subsequent document for an overview of config settings, implementation details, lessons learned, and future ideas.  The patch has been through internal peer review, passes unit tests, and passed a preliminary HBaseTest load test and random read test with the expected results:

Setup:  1.8 million rows, 1 col/row, 1 version/row, 51KB/entry
Result: ~2 stores/region, ~2x read speedup, negligible load time difference

PS - I could not submit this to Review Board for some reason.  It said it couldn't find branches/0.20/src/test/org/apache/hadoop/hbase/HBaseTestingUtility.java.  Todd?

> Add bloomfilters
> ----------------
>
>                 Key: HBASE-1200
>                 URL: https://issues.apache.org/jira/browse/HBASE-1200
>             Project: Hadoop HBase
>          Issue Type: Task
>    Affects Versions: 0.20.5
>            Reporter: stack
>            Assignee: Nicolas Spiegelberg
>             Fix For: 0.20.5
>
>         Attachments: ryan_bloomfilter.patch
>
>
> Add bloom filtering to HFile.  It can be enabled on a per-family basis, with the ability to configure a row-level vs. row+col-level bloom.  We size the bloom filter by the number of entries we are about to flush, which seems likely to make the filter too big in most cases, so our implementation needs to take that into account.
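
Editor's illustration of the sizing and row vs. row+col points in the description above. This is a minimal sketch and not the code in the attached patch: the class name BloomSketch, the method names, the FNV-1a double hashing, and the 1% target error rate are all assumptions made for the example. It uses the textbook formulas m = -n ln(p) / (ln 2)^2 for the number of bits and k = (m / n) ln 2 for the number of hash functions, where n is the expected number of keys.

```java
import java.util.BitSet;

/** Illustrative bloom filter sketch; hypothetical names, not the HBase patch. */
class BloomSketch {
    final BitSet bits;
    final int numBits;
    final int numHashes;

    BloomSketch(int expectedEntries, double falsePositiveRate) {
        int n = Math.max(1, expectedEntries);
        this.numBits = optimalBits(n, falsePositiveRate);
        this.numHashes = optimalHashes(numBits, n);
        this.bits = new BitSet(numBits);
    }

    /** m = -n ln(p) / (ln 2)^2, rounded up. */
    static int optimalBits(int n, double p) {
        double ln2 = Math.log(2);
        return (int) Math.max(1, Math.ceil(-n * Math.log(p) / (ln2 * ln2)));
    }

    /** k = (m / n) ln 2, rounded to the nearest integer, at least 1. */
    static int optimalHashes(int m, int n) {
        return (int) Math.max(1, Math.round((double) m / n * Math.log(2)));
    }

    /**
     * Builds the key to hash: the row alone, or row followed by qualifier.
     * Plain concatenation is ambiguous for variable-length rows; a real
     * implementation would need a length prefix or separator.
     */
    static byte[] bloomKey(byte[] row, byte[] qualifier, boolean rowCol) {
        if (!rowCol) {
            return row.clone();
        }
        byte[] key = new byte[row.length + qualifier.length];
        System.arraycopy(row, 0, key, 0, row.length);
        System.arraycopy(qualifier, 0, key, row.length, qualifier.length);
        return key;
    }

    /** 32-bit FNV-1a with the offset basis perturbed by a seed. */
    private static int fnv1a(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    /** Double hashing: the i-th bit index is (h1 + i * h2) mod m. */
    private int index(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(byte[] key) {
        int h1 = fnv1a(key, 0);
        int h2 = fnv1a(key, 0x9E3779B9);
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(h1, h2, i));
        }
    }

    /** False means definitely absent; true means possibly present. */
    boolean mightContain(byte[] key) {
        int h1 = fnv1a(key, 0);
        int h2 = fnv1a(key, 0x9E3779B9);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, sizing for 1,000 expected keys at a 1% error rate yields 9,586 bits and 7 hash functions. One possible reading of the oversizing concern is that a row-level bloom holds only one key per distinct row, so if n is taken from the count of entries being flushed and a row has several entries, the filter is allocated for more keys than it ever stores.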

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.