Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2010/04/28 23:55:51 UTC

[jira] Updated: (HBASE-1200) Add bloomfilters

     [ https://issues.apache.org/jira/browse/HBASE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-1200:
---------------------------------------

        Summary: Add bloomfilters  (was: Add bloomfilters; use dynamicbloomfilter instead of base bloomfilter)
    Description: Add bloomfiltering to hfile.  Can be enabled on a family-level basis.  Ability to configure a row vs row+col level bloom.  We size the bloomfilter with the number of entries we are about to flush, which seems like it would usually make the filter too big, so our implementation needs to take that into account.  (was: Add bloomfiltering to hfile.  Should it be optional or on always?  Currently, we bloom filter rows only, not the column + ts component, which seems a good place to start, but we size the bloomfilter with the number of entries we are about to flush, which seems like it would usually make the filter too big.  How to figure how many rows are in the flush?   We should use the DynamicBloomFilter as Andrzej does up in hadoop BloomFilterMapFile.  Start small and let it resize as entries are added.)

Updating the title and description text.  Note that I took out the DynamicBloomFilter requirement.  I will send out a document to complement the code fix, covering the implementation reasoning and possible future alternatives.

> Add bloomfilters
> ----------------
>
>                 Key: HBASE-1200
>                 URL: https://issues.apache.org/jira/browse/HBASE-1200
>             Project: Hadoop HBase
>          Issue Type: Task
>            Reporter: stack
>            Assignee: Nicolas Spiegelberg
>             Fix For: 0.21.0
>
>         Attachments: ryan_bloomfilter.patch
>
>
> Add bloomfiltering to hfile.  Can be enabled on a family-level basis.  Ability to configure a row vs row+col level bloom.  We size the bloomfilter with the number of entries we are about to flush, which seems like it would usually make the filter too big, so our implementation needs to take that into account.
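
The sizing concern in the description can be illustrated with the textbook Bloom filter formulas. The sketch below is not taken from the HBASE-1200 patch; the class name, method names, and the example counts are all hypothetical. It assumes a filter sized for n keys at a target false-positive rate p uses m = -n ln p / (ln 2)^2 bits and k = (m/n) ln 2 hash functions, and compares sizing by the number of entries in a flush against sizing by the number of distinct rows.

```java
// Illustrative only: standard Bloom filter sizing math, not HBase code.
final class BloomSizing {
    private BloomSizing() {}

    /** Bits needed for the expected key count at the target rate: m = -n ln p / (ln 2)^2. */
    static long optimalBits(long expectedKeys, double falsePositiveRate) {
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-expectedKeys * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    /** Hash function count that minimizes false positives: k = (m / n) ln 2, at least 1. */
    static int optimalHashes(long bits, long expectedKeys) {
        return Math.max(1, (int) Math.round((double) bits / expectedKeys * Math.log(2.0)));
    }

    /** Approximate false-positive rate once actualKeys keys are inserted: (1 - e^(-kn/m))^k. */
    static double expectedFalsePositiveRate(long bits, int hashes, long actualKeys) {
        return Math.pow(1.0 - Math.exp(-(double) hashes * actualKeys / bits), hashes);
    }

    public static void main(String[] args) {
        // Hypothetical flush: many entries, but far fewer distinct rows.
        long entriesInFlush = 1_000_000L;
        long distinctRows = 100_000L;
        double targetRate = 0.01;

        long bitsByEntries = optimalBits(entriesInFlush, targetRate);
        long bitsByRows = optimalBits(distinctRows, targetRate);

        System.out.println("bits when sized by entries: " + bitsByEntries);
        System.out.println("bits when sized by rows:    " + bitsByRows);
        System.out.println("oversize factor:            "
                + (double) bitsByEntries / bitsByRows);
        // A row-level filter sized by entries ends up much sparser than the target requires.
        System.out.println("rate with entry-based size: "
                + expectedFalsePositiveRate(bitsByEntries,
                        optimalHashes(bitsByEntries, entriesInFlush), distinctRows));
    }
}
```

Under these assumptions the required bit count grows linearly with the expected key count, so a row-level filter sized by flush entries is oversized by roughly the ratio of entries to distinct rows.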

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.