Posted to dev@hbase.apache.org by Alexander Georgiev <es...@gmail.com> on 2010/08/03 22:04:02 UTC

Re: Review Request: HBASE-2823: Entire Row Deletes not stored in Row+Col Bloom

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/426/
-----------------------------------------------------------

(Updated 2010-08-03 13:04:02.245493)


Review request for hbase.


Changes
-------

Applied corrections suggested by Nicolas.


Summary
-------

When a Delete Row is issued on a row in a store that uses a row+col bloom filter, some of the columns might not be deleted. Since a Delete Row is just a Delete Family applied to all columns, a file that holds the delete marker but not the column being searched for can fail the row+col bloom check and be skipped, so the delete never takes effect for that column. To ensure the file is included, the row is added to the bloom in addition to row+col. shouldSeek() then checks both row and row+col when the bloom is row+col (BloomType.ROWCOL). The extra check adds false positives, which is compensated for by dividing the user-requested error rate by two.
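
The idea in the summary can be illustrated with a small standalone sketch. This is not the code from the patch and does not use the HBase API: the class RowColBloomSketch, its methods addCell() and addRowDelete(), the key layout (row bytes followed by qualifier bytes), and the hashing scheme are all hypothetical and chosen only for illustration. In particular, it assumes the row-only key is added when a row-level delete marker is written; the patch may add it at a different point. Only the two-key lookup in shouldSeek() and the halved error rate follow the description above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical illustration only; not the StoreFile implementation.
class RowColBloomSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    RowColBloomSketch(int maxKeys, double userErrorRate) {
        // Each lookup probes two keys (row+col and row), so size the
        // filter for half of the error rate the user asked for.
        double errorRate = userErrorRate / 2.0;
        double ln2 = Math.log(2.0);
        // Standard bloom sizing: m = -n * ln(p) / (ln 2)^2, k = (m / n) * ln 2.
        this.numBits = Math.max(64,
            (int) Math.ceil(-maxKeys * Math.log(errorRate) / (ln2 * ln2)));
        this.numHashes = Math.max(1, (int) Math.round(ln2 * numBits / maxKeys));
        this.bits = new BitSet(numBits);
    }

    // Assumed key layout: row bytes followed by qualifier bytes.
    private static byte[] rowColKey(String row, String qualifier) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        byte[] key = Arrays.copyOf(r, r.length + q.length);
        System.arraycopy(q, 0, key, r.length, q.length);
        return key;
    }

    private static int fnv1a(byte[] data) {
        int h = 0x811c9dc5;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }

    // Double hashing: the i-th probe position is (h1 + i * h2) mod numBits.
    private int position(byte[] key, int i) {
        return Math.floorMod(Arrays.hashCode(key) + i * fnv1a(key), numBits);
    }

    private void add(byte[] key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    private boolean contains(byte[] key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Ordinary cell: record the row+col key.
    void addCell(String row, String qualifier) {
        add(rowColKey(row, qualifier));
    }

    // Row-level delete marker: record the row-only key.
    void addRowDelete(String row) {
        add(row.getBytes(StandardCharsets.UTF_8));
    }

    // The file must be read if either the row+col key or the row key may be present.
    boolean shouldSeek(String row, String qualifier) {
        return contains(rowColKey(row, qualifier))
            || contains(row.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        RowColBloomSketch bloom = new RowColBloomSketch(100, 0.01);
        bloom.addCell("row1", "colA");
        bloom.addRowDelete("row2");
        System.out.println(bloom.shouldSeek("row1", "colA"));
        System.out.println(bloom.shouldSeek("row2", "colB"));
    }
}
```

In this sketch a lookup for any column of row2 reports that the file must be read, even though no row2 column key was ever added. That is the behavior the row-only check is meant to guarantee for files holding a row delete.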


This addresses bug HBASE-2823.
    http://issues.apache.org/jira/browse/HBASE-2823


Diffs (updated)
-----

  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 979864 
  trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 979864 

Diff: http://review.cloudera.org/r/426/diff


Testing
-------

Added a new test in TestHRegion.java that checks this.
Dumped the contents of the StoreFile to verify that the bloom filter contains the row as a key when using ROWCOL blooms.


Thanks,

Alexander