Posted to dev@hbase.apache.org by "Jim Kellerman (JIRA)" <ji...@apache.org> on 2008/06/30 22:32:45 UTC

[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609357#action_12609357 ] 

Jim Kellerman commented on HBASE-696:
-------------------------------------

stack wrote:
> Remove bloomfilter options. Only one bloomfilter type makes sense in hbase context.

True.

> Also, make bloomfilter self-sizing; you know size when flushing.

However, you can't easily know the size when doing a compaction.
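
To make the sizing point concrete, here is a minimal sketch of what "self-sizing" could look like, using the standard Bloom filter sizing formulas. It is not HBase code: the class name, the method names and the 1% false-positive target are assumptions chosen for illustration. At flush time the entry count n is known, so the bit count follows directly. At compaction time only an upper bound on n (the sum of the input files' entry counts) is available before the merge runs.

```java
// Illustrative sketch only, not HBase code.
// Class name, method names and the 1% target rate are assumptions.
final class BloomSizing {
    private BloomSizing() {}

    /** Bits needed for n keys at false-positive rate p: m = -n * ln(p) / (ln 2)^2. */
    static long optimalBits(long n, double p) {
        if (n <= 0 || p <= 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("need n > 0 and 0 < p < 1");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    /** Hash functions for m bits and n keys: k = (m / n) * ln 2, at least 1. */
    static int optimalHashes(long bits, long n) {
        return Math.max(1, (int) Math.round((double) bits / n * Math.log(2.0)));
    }

    public static void main(String[] args) {
        long n = 1_000_000L; // entry count, known when flushing
        long bits = optimalBits(n, 0.01);
        System.out.println(bits + " bits, " + optimalHashes(bits, n) + " hashes");
    }
}
```

For a compaction, the same calculation could be fed the summed entry counts of the inputs, which oversizes the filter whenever the inputs share keys.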

Question: Is the bloomfilter based on the row key; row and column; row, column and timestamp; row and timestamp?

It seems as if basing the bloomfilter solely on the row key would be the most useful. If the timestamp is part of the bloomfilter key, a get or scan with LATEST_TIMESTAMP won't match anything in the filter. Similarly, keying on row/family:member doesn't make sense if you are fetching by column wildcard (family:).

Using row/family: might be another option.
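
The key-choice argument above can be sketched with a toy filter. This is an illustration under stated assumptions, not the HBase implementation: TinyBloom, the "/" key separator, the sample timestamp, and treating LATEST_TIMESTAMP as Long.MAX_VALUE are all made up for the example. One cell is stored for row "row1", and a reader then probes with LATEST_TIMESTAMP.

```java
import java.util.BitSet;

// Illustrative sketch only, not HBase code. All names here are hypothetical.
final class BloomKeyDemo {
    // Assumed sentinel meaning "newest version"; the reader does not know the real timestamp.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;
    static final long STORED_TIMESTAMP = 1214865165000L; // arbitrary sample write time

    /** Toy Bloom filter: 2^20 bits, 3 probes derived from String.hashCode(). */
    static final class TinyBloom {
        private static final int BITS = 1 << 20;
        private static final int HASHES = 3;
        private final BitSet bits = new BitSet(BITS);

        private static int position(String key, int i) {
            int h1 = key.hashCode();
            int h2 = (h1 >>> 16) | 1; // odd step for double hashing
            return Math.floorMod(h1 + i * h2, BITS);
        }

        void add(String key) {
            for (int i = 0; i < HASHES; i++) bits.set(position(key, i));
        }

        boolean mightContain(String key) {
            for (int i = 0; i < HASHES; i++) {
                if (!bits.get(position(key, i))) return false;
            }
            return true;
        }
    }

    /** Filter keyed on the row alone: the probe key equals the stored key. */
    static boolean probeRowOnly() {
        TinyBloom filter = new TinyBloom();
        filter.add("row1");
        return filter.mightContain("row1");
    }

    /** Filter keyed on row + timestamp: the probe carries LATEST_TIMESTAMP, not the stored one. */
    static boolean probeRowAndTimestamp() {
        TinyBloom filter = new TinyBloom();
        filter.add("row1/" + STORED_TIMESTAMP);
        return filter.mightContain("row1/" + LATEST_TIMESTAMP);
    }

    public static void main(String[] args) {
        System.out.println("row only:        " + probeRowOnly());
        System.out.println("row + timestamp: " + probeRowAndTimestamp());
    }
}
```

With the row-only key the probe is guaranteed to hit. With the timestamp folded in, the probe key differs from every stored key, so barring an unlikely false positive the filter reports the row as absent even though it is there. The same mismatch would apply to a row/family:member key probed with a family: wildcard.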

> Putting in 0.2 for now because its API change (for the simpler). We can punt later.

With respect to the API change, would it be sufficient to change HColumnDescriptor so that bloomFilter is a boolean? That would require a migration step.

BloomFilterDescriptor could then be moved to org.apache.hadoop.hbase.regionserver and become package private.

> Make bloomfilter true/false and self-sizing
> -------------------------------------------
>
>                 Key: HBASE-696
>                 URL: https://issues.apache.org/jira/browse/HBASE-696
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>             Fix For: 0.2.0
>
>
> Remove bloomfilter options.  Only one bloomfilter type makes sense in hbase context.  Also, make bloomfilter self-sizing; you know size when flushing.
> Putting in 0.2 for now because its API change (for the simpler).  We can punt later.
