Posted to issues@hbase.apache.org by "Juhani Connolly (JIRA)" <ji...@apache.org> on 2010/05/07 06:29:48 UTC

[jira] Updated: (HBASE-2466) Improving filter API to allow for modification of keyvalue list by filter

     [ https://issues.apache.org/jira/browse/HBASE-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juhani Connolly updated HBASE-2466:
-----------------------------------

    Attachment: HBASE-2466-5.patch

Added functionality to DependentColumnFilter that allows the included timestamps to be restricted by value.
Extended the tests to include scans over an HRegion.

Passes all tests. Could use a review.


One possible application of DependentColumnFilter:

Restricting "sets of entries" by a specific value:
Take a blog-entries table with several columns for comments: title, text, author. Cells in these columns that share a timestamp together make up one full comment. A filter keyed on the author column could then discard every comment written by Bob.
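
The comment example can be sketched without HBase at all. The snippet below is only an in-memory illustration of the idea, not the code in the attached patch: the Cell class, the dropSetsByValue helper, and the sample data are all hypothetical names made up for this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DependentColumnSketch {

    /** Hypothetical stand-in for an HBase KeyValue: one cell within a single row. */
    static final class Cell {
        final String column;
        final long timestamp;
        final String value;

        Cell(String column, long timestamp, String value) {
            this.column = column;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Returns the cells of one row whose timestamp also appears in
     * dependentColumn with a value other than excludedValue.
     */
    static List<Cell> dropSetsByValue(List<Cell> row, String dependentColumn, String excludedValue) {
        // Pass 1: collect the timestamps that the dependent column allows.
        Set<Long> allowed = new HashSet<>();
        for (Cell c : row) {
            if (c.column.equals(dependentColumn) && !c.value.equals(excludedValue)) {
                allowed.add(c.timestamp);
            }
        }
        // Pass 2: keep only the cells that belong to an allowed timestamp.
        List<Cell> kept = new ArrayList<>();
        for (Cell c : row) {
            if (allowed.contains(c.timestamp)) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Cell> row = new ArrayList<>();
        row.add(new Cell("title", 100L, "Nice post"));
        row.add(new Cell("text", 100L, "Thanks for writing this."));
        row.add(new Cell("author", 100L, "Alice"));
        row.add(new Cell("title", 200L, "Buy now"));
        row.add(new Cell("text", 200L, "Cheap watches"));
        row.add(new Cell("author", 200L, "Bob"));

        for (Cell c : dropSetsByValue(row, "author", "Bob")) {
            System.out.println(c.timestamp + " " + c.column + "=" + c.value);
        }
    }
}
```

With the sample data above, only the three cells at Alice's timestamp remain; Bob's title, text and author cells are dropped together because they share a timestamp.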



> Improving filter API to allow for modification of keyvalue list by filter
> -------------------------------------------------------------------------
>
>                 Key: HBASE-2466
>                 URL: https://issues.apache.org/jira/browse/HBASE-2466
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: filters, regionserver
>            Reporter: Juhani Connolly
>            Priority: Minor
>         Attachments: HBASE-2466-2.patch, HBASE-2466-4.patch, HBASE-2466-5.patch, HBASE-2466.patch
>
>
> As it stands, the Filter interface allows filtering by
> Filter#filterAllRemaining() -> true indicates the scan is over; false means keep going.
> Filter#filterRowKey(byte[],int,int) -> true to drop this row; if false, we will also call
> Filter#filterKeyValue(KeyValue) -> true to drop this key/value
> Filter#filterRow() -> last chance to drop the entire row based on the sequence of filterKeyValue() calls. E.g.: filter a row if it doesn't contain a specified column.
> It would be useful to add a further step to the API that prunes the list of KeyValues to be sent, by implementing an additional
> Filter#filterRow(List<KeyValue>)
> This would allow a user to write a custom filter against the API that drops unnecessary KeyValues according to user-defined rules.
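
For readers following the thread, the call sequence described in the issue, plus the proposed pruning step, might look roughly like the sketch below. It is self-contained and deliberately simplified: SimpleFilter, KV, applyToRow and KeepFirstN are hypothetical names, row keys are plain strings, and calling filterRow(List) before filterRow() is an assumption of this sketch rather than something the description specifies.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterRowListSketch {

    /** Hypothetical stand-in for an HBase KeyValue. */
    static final class KV {
        final String column;
        final String value;

        KV(String column, String value) {
            this.column = column;
            this.value = value;
        }
    }

    /** Simplified filter contract for illustration; not the real HBase interface. */
    interface SimpleFilter {
        boolean filterRowKey(String rowKey); // true drops the row
        boolean filterKeyValue(KV kv);       // true drops this key/value
        void filterRow(List<KV> kvs);        // proposed step: prune the list in place
        boolean filterRow();                 // true drops the entire row
    }

    /** Example filter that uses only the proposed step: keep the first n key/values. */
    static final class KeepFirstN implements SimpleFilter {
        private final int n;

        KeepFirstN(int n) {
            this.n = n;
        }

        public boolean filterRowKey(String rowKey) { return false; }
        public boolean filterKeyValue(KV kv) { return false; }

        public void filterRow(List<KV> kvs) {
            // Remove everything after the first n entries.
            while (kvs.size() > n) {
                kvs.remove(kvs.size() - 1);
            }
        }

        public boolean filterRow() { return false; }
    }

    /** Runs the filter steps over one row in the order assumed by this sketch. */
    static List<KV> applyToRow(String rowKey, List<KV> cells, SimpleFilter filter) {
        if (filter.filterRowKey(rowKey)) {
            return new ArrayList<>();
        }
        List<KV> kept = new ArrayList<>();
        for (KV kv : cells) {
            if (!filter.filterKeyValue(kv)) {
                kept.add(kv);
            }
        }
        filter.filterRow(kept); // the additional pruning hook
        if (filter.filterRow()) {
            return new ArrayList<>();
        }
        return kept;
    }

    public static void main(String[] args) {
        List<KV> cells = new ArrayList<>();
        cells.add(new KV("a", "1"));
        cells.add(new KV("b", "2"));
        cells.add(new KV("c", "3"));
        List<KV> result = applyToRow("row1", cells, new KeepFirstN(2));
        System.out.println(result.size() + " key/values kept");
    }
}
```

The point of the extra hook is that the filter sees the whole surviving list for a row at once, so it can make decisions that depend on several KeyValues together, which the per-KeyValue callback cannot.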

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.