Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2008/06/18 22:19:46 UTC

[jira] Created: (HBASE-696) Make bloomfilter true/false and self-sizing

Make bloomfilter true/false and self-sizing
-------------------------------------------

                 Key: HBASE-696
                 URL: https://issues.apache.org/jira/browse/HBASE-696
             Project: Hadoop HBase
          Issue Type: Improvement
            Reporter: stack
             Fix For: 0.2.0


Remove bloomfilter options.  Only one bloomfilter type makes sense in hbase context.  Also, make bloomfilter self-sizing; you know size when flushing.

Putting in 0.2 for now because it's an API change (for the simpler).  We can punt later.
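For reference, a minimal sketch of what "self-sizing" could mean at flush time, assuming the standard bloom filter sizing formulas and a 1% target false-positive rate. Neither the rate nor the name `bloom_parameters` comes from this issue; both are illustrative assumptions, and the sketch is in Python for brevity rather than HBase's Java.

```python
import math


def bloom_parameters(n_entries, false_positive_rate=0.01):
    """Return (bits, hash_count) for a bloom filter holding n_entries keys."""
    if n_entries <= 0:
        raise ValueError("n_entries must be positive")
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be in (0, 1)")
    # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits.
    bits = math.ceil(-n_entries * math.log(false_positive_rate) / (math.log(2) ** 2))
    # Optimal number of hash functions: k = (m / n) * ln 2.
    hashes = max(1, round(bits / n_entries * math.log(2)))
    return bits, hashes


# At flush time the number of entries being written is known,
# so the filter can be sized for exactly that many keys.
bits, hashes = bloom_parameters(100_000, 0.01)
```

At a 1% target this comes to roughly 9.6 bits per key, so the filter grows linearly with the flush size instead of relying on a user-supplied guess.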

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609432#action_12609432 ] 

stack edited comment on HBASE-696 at 6/30/08 8:47 PM:
------------------------------------------------------

Do you think we should let users choose whether they want bloom filters to be row or row+column or row+column+ts?  It's not hard to imagine access scenarios that would benefit from any of these.

But maybe this option can just be added later, post 0.2?

      was (Author: stack):
    Do you think we should let users choose whether they want bloom filters to be row or row+column or row+column+ts?  It's not hard to imagine access scenarios that would benefit from any of these.

But maybe this option can just be added later, post 0.3?
  



[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609432#action_12609432 ] 

stack commented on HBASE-696:
-----------------------------

Do you think we should let users choose whether they want bloom filters to be row or row+column or row+column+ts?  It's not hard to imagine access scenarios that would benefit from any of these.

But maybe this option can just be added later, post 0.3?




[jira] Updated: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Izaak Rubin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Izaak Rubin updated HBASE-696:
------------------------------

    Attachment: hbase-696-shellfix.patch

The patch I've attached (hbase-696-shellfix.patch) fixes the one problem I could find in the shell resulting from the bloomfilter changes.  There was no DEFAULT_BLOOMFILTER constant in HColumnDescriptor, so I added one (set to false).




[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612997#action_12612997 ] 

stack commented on HBASE-696:
-----------------------------

+1.  Nice cleanup.

I'd suggest that once the patch is applied, you assign this issue to Izaak so he can make sure the shell aligns with this change.




[jira] Issue Comment Edited: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609357#action_12609357 ] 

jimk edited comment on HBASE-696 at 6/30/08 1:32 PM:
--------------------------------------------------------------

stack wrote:
> Remove bloomfilter options. Only one bloomfilter type makes sense in hbase context.

True.

> Also, make bloomfilter self-sizing; you know size when flushing.

However, you can't easily know the size when doing a compaction.

Question: Is the bloomfilter based on the row key; row and column; row, column and timestamp; row and timestamp?

It seems as if basing the bloomfilter solely on the row key would be the most useful. If you are doing a get or scan with LATEST_TIMESTAMP, the lookup key won't match anything in the bloomfilter if the timestamp is part of the filter key. Similarly, row/family:member doesn't make sense if you are fetching by column wildcard (family:).

Using row/family: might be another option.
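To make the LATEST_TIMESTAMP point concrete, here is a small sketch. It assumes a toy bloom filter built on SHA-256; the `BloomFilter` class, the `row/column/ts` key layout, and the sample cells are all hypothetical and are not HBase's implementation. It compares a filter keyed on the full key with one keyed on the row only.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter for illustration only (not HBase's implementation)."""

    def __init__(self, bits=1024, hashes=4):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.array[pos] = 1

    def might_contain(self, key):
        return all(self.array[pos] for pos in self._positions(key))


LATEST_TIMESTAMP = 2**63 - 1  # stand-in for "give me the newest version"

cells = [("row1", "info:name", 1214850000), ("row2", "info:name", 1214850123)]

full_key_filter = BloomFilter()
row_only_filter = BloomFilter()
for row, column, ts in cells:
    full_key_filter.add(f"{row}/{column}/{ts}")
    row_only_filter.add(row)

# A "latest" get does not know the stored timestamp, so the only probe it can
# build is very unlikely to match any key that was added to the full-key filter.
probe = f"row1/info:name/{LATEST_TIMESTAMP}"
full_key_hit = full_key_filter.might_contain(probe)

# The row-only filter stays useful: rows that are present always test positive.
row_hit = row_only_filter.might_contain("row1")
```

With the full key in the filter, a "latest" lookup either gets a misleading negative for a row that does exist or has to bypass the filter entirely. The row-only filter never gives a false negative for a stored row.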

> Putting in 0.2 for now because it's an API change (for the simpler). We can punt later.

With respect to the API change, would it be sufficient to change HColumnDescriptor so that bloomFilter is a boolean? That would require a migration step.

BloomFilterDescriptor could then be moved to org.apache.hadoop.hbase.regionserver and become package private.





[jira] Assigned: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman reassigned HBASE-696:
-----------------------------------

    Assignee: Jim Kellerman




[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609380#action_12609380 ] 

Bryan Duxbury commented on HBASE-696:
-------------------------------------

I thought it might be useful to point out that the bloom filter wouldn't really be used for scanning, either, since you're already going to have locality proceeding through the file as you scan. There wouldn't be much point in checking if a storefile has the next row or not.




[jira] Resolved: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-696.
-------------------------

    Resolution: Fixed

Resolving issue.  Thanks for the patch Izaak.  Nice cleanup Jim.




[jira] Updated: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-696:
--------------------------------

    Status: In Progress  (was: Patch Available)

Committed.




[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609613#action_12609613 ] 

Bryan Duxbury commented on HBASE-696:
-------------------------------------

I'm not sure that there would be much benefit to putting the timestamp in the filter. It'd only be useful in the case where you're always looking for EXACT matches on row/col/ts, since you can't do any "less-than" behavior there. Likewise, I think the column is of dubious value. It's only really handy if you have a lot of columns and the one you are looking for is somewhere in the middle of the range, which would require a lot of seeking.

I think that we should go with a fewer-options approach and just make the row-only bloom filter. If someone can make the case for more things going in the filter down the road, then we'll tackle it at that point.




[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613447#action_12613447 ] 

Jim Kellerman commented on HBASE-696:
-------------------------------------

As it turns out, bloom filters were broken anyway (see HBASE-744), so erasing them during migration might have fixed problems.




[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609363#action_12609363 ] 

stack commented on HBASE-696:
-----------------------------

On size for compaction, we could add a count of elements to the info file (or as the first field of the persisted filter)?
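One way to read the suggestion above, as a rough sketch: if every store file persists its entry count, a compaction can sum the counts of its inputs to get an upper bound on the output's entries before it reads any data. The dictionary layout, the `count` field name, the `bloom_bits` helper, and the 1% target rate are hypothetical choices for illustration, not anything defined in this issue.

```python
import math


def bloom_bits(n_entries, false_positive_rate=0.01):
    """Bits needed for n_entries keys at the target false-positive rate."""
    return math.ceil(-n_entries * math.log(false_positive_rate) / (math.log(2) ** 2))


# Hypothetical store files; "count" is the entry count persisted at write time.
store_files = [
    {"rows": ["a", "b", "c"], "count": 3},
    {"rows": ["b", "c", "d", "e"], "count": 4},
]

# Known before the compaction reads any data: the sum of the persisted counts.
upper_bound = sum(f["count"] for f in store_files)

# Known only after the merge: rows present in several inputs collapse into one.
merged_rows = sorted(set().union(*(f["rows"] for f in store_files)))
actual = len(merged_rows)

# Sizing for the upper bound never undersizes the compacted file's filter.
bits_for_compaction = bloom_bits(upper_bound)
```

The trade-off is that overlapping rows make the filter somewhat larger than strictly needed, but the false-positive rate stays at or below the target without a second pass over the data.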

(After chatting w/ Jim), current filters were for full key -- row/column/timestamp -- which as Jim points out, is pretty useless when most of the time we're trying to get the 'latest'.  +1 on doing the row only for now.

Yeah, would need to change HCD so filters were on or off.  No chance of self-migrating?






[jira] Updated: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-696:
--------------------------------

    Status: Patch Available  (was: Open)

Passes all regression tests and PerformanceEvaluation.

Please review.




[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609357#action_12609357 ] 

Jim Kellerman commented on HBASE-696:
-------------------------------------

stack wrote:
> Remove bloomfilter options. Only one bloomfilter type makes sense in hbase context.

True.

> Also, make bloomfilter self-sizing; you know size when flushing.

However, you can't easily know the size when doing a compaction.

Question: Is the bloomfilter based on the row key; row and column; row, column and timestamp; row and timestamp?

It seems as if basing the bloomfilter solely on the row key would be the most useful. If you are doing a get or scan with LATEST_TIMESTAMP, the lookup key won't match anything in the bloomfilter if the timestamp is part of the filter key. Similarly, row/family:member doesn't make sense if you are fetching by column wildcard (family:).

Using row/family: might be another option.

> Putting in 0.2 for now because it's an API change (for the simpler). We can punt later.

With respect to the API change, would it be sufficient to change HColumnDescriptor so that bloomFilter is a boolean? That would require a migration step.

BloomFilterDescriptor could then be moved to org.apache.hadoop.hbase.regionserver and become package private.





[jira] Assigned: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman reassigned HBASE-696:
-----------------------------------

    Assignee: Izaak Rubin  (was: Jim Kellerman)

Izaak,

Please ensure that the shell works with this change. It has been committed, so just SVN up and test out/fix the shell.




[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609421#action_12609421 ] 

Jim Kellerman commented on HBASE-696:
-------------------------------------

> Bryan Duxbury - 30/Jun/08 02:56 PM
> I thought it might be useful to point out that the bloom filter wouldn't really be used for
> scanning, either, since you're already going to have locality proceeding through the file
> as you scan. There wouldn't be much point in checking if a storefile has the next row or
> not. 

You are quite right. Bloom filters have little to do with scanners; their primary use is for gets.





[jira] Commented: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611846#action_12611846 ] 

Jim Kellerman commented on HBASE-696:
-------------------------------------

> -----Original Message-----
> From: Jim Kellerman 
> Sent: Tuesday, July 08, 2008 4:31 PM
> To: hbase-user
> Subject: RE: HBASE-696 Make bloomfilter true/false and self-sizing
> 
> Unless there is significant feedback to the contrary, making 
> bloom filters true/false and self sizing will be an 
> incompatible change.
> 
> During migration, columns which have bloom filters enabled 
> currently will have the bloom filter erased and disabled.
> 
> Columns may have bloom filters only if they are enabled when the 
> column is created.
> 
> If a column is later modified to disable the bloom filter, it 
> will be erased and cannot be re-enabled.
> 
> There is one short term (0.2.0) option for migration and 
> enabling bloom filters which will be very expensive: reading 
> the column twice, first to establish the number of entries 
> that are needed and second, to create the bloom filter.
> 
> There is one long term option: convince Hadoop that MapFile 
> should be subclassable which would entail changing private 
> members to protected members, or to provide accessors to the 
> private members in MapFile. Because hadoop-0.18.0 is in 
> feature freeze, any change of this sort would have to wait 
> for hadoop-0.19.0. hbase-0.3.0 will target hadoop-0.18.x, so 
> this change would have to wait until hbase-0.4.0.
> 
> The question is how many people use bloom filters today? It 
> is our belief that they are not particularly useful as 
> implemented. If you do use bloom filters today, would you 
> object to a process by which you would create a new bloom 
> filter enabled column and copy your data to the new column?
> 
> ---
> Jim Kellerman, Senior Engineer; Powerset
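The expensive two-pass option described in the message above could look roughly like the sketch below. It is a hedged illustration in Python, not the migration code: `build_filter_two_pass`, `positions`, `might_contain`, the SHA-256 hashing, and the 1% target rate are all assumptions, and `scan_rows` stands in for whatever reads the column's row keys.

```python
import hashlib
import math


def positions(key, bits, hashes):
    """Yield the bit positions a key maps to (SHA-256 is an arbitrary choice)."""
    for i in range(hashes):
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        yield int.from_bytes(digest[:8], "big") % bits


def build_filter_two_pass(scan_rows, false_positive_rate=0.01):
    """Build a filter from scan_rows, a callable returning a fresh row iterator."""
    # Pass 1: read the column once just to count the entries.
    n = max(1, sum(1 for _ in scan_rows()))
    bits = math.ceil(-n * math.log(false_positive_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n * math.log(2)))
    # Pass 2: read the column again and populate the correctly sized filter.
    array = bytearray(bits)  # one byte per bit, for simplicity
    for row in scan_rows():
        for pos in positions(row, bits, hashes):
            array[pos] = 1
    return array, hashes


def might_contain(array, hashes, key):
    """Return False only if the key was definitely never added."""
    return all(array[pos] for pos in positions(key, len(array), hashes))


rows = [f"row{i:04d}" for i in range(1000)]
array, hashes = build_filter_two_pass(lambda: iter(rows))
```

The cost is visible in the structure: every row is read twice, which is why the message calls this option very expensive for large columns.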





[jira] Updated: (HBASE-696) Make bloomfilter true/false and self-sizing

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-696:
--------------------------------

    Attachment: patch.txt

