Posted to common-dev@hadoop.apache.org by "James Kennedy (JIRA)" <ji...@apache.org> on 2007/07/01 18:49:04 UTC
[jira] Updated: (HADOOP-1531) Add RowFilter to HRegion.HScanner
[ https://issues.apache.org/jira/browse/HADOOP-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Kennedy updated HADOOP-1531:
----------------------------------
Attachment: RowFilter-v2.patch
OK, I've fixed the issues above and implemented a filter-stop-scan mechanism.
I also renamed RowFilter to RegExpRowFilter and created a new PageRowFilter that limits a scan to a maximum result size. In addition, I created a RowFilterSet, which is a RowFilter that contains other RowFilters and represents a hierarchy of filters to be processed disjunctively or conjunctively.
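To illustrate the filter-set idea, here is a rough sketch of how a filter built from other filters can be evaluated conjunctively or disjunctively. It is not the code in the attached patch: the names KeyFilter, RegExpKeyFilter, KeyFilterSet and the Operator values are made up for this example, and the contract is simplified to row keys only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, simplified contract (not the patch's RowFilterInterface):
// filter() returns true when the row should be EXCLUDED from the results.
interface KeyFilter {
  boolean filter(String rowKey);
}

// Excludes rows whose key does not fully match the regular expression.
class RegExpKeyFilter implements KeyFilter {
  private final Pattern pattern;

  RegExpKeyFilter(String regex) {
    this.pattern = Pattern.compile(regex);
  }

  @Override
  public boolean filter(String rowKey) {
    return !pattern.matcher(rowKey).matches();
  }
}

// A filter made of filters. Because it implements KeyFilter itself,
// sets can be nested to form a hierarchy.
class KeyFilterSet implements KeyFilter {
  enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

  private final Operator operator;
  private final List<KeyFilter> filters;

  KeyFilterSet(Operator operator, KeyFilter... filters) {
    this.operator = operator;
    this.filters = Arrays.asList(filters);
  }

  @Override
  public boolean filter(String rowKey) {
    for (KeyFilter f : filters) {
      boolean excluded = f.filter(rowKey);
      if (operator == Operator.MUST_PASS_ALL && excluded) {
        return true;   // conjunction: one failing member excludes the row
      }
      if (operator == Operator.MUST_PASS_ONE && !excluded) {
        return false;  // disjunction: one passing member keeps the row
      }
    }
    // MUST_PASS_ALL: no member excluded the row, so keep it.
    // MUST_PASS_ONE: no member passed the row, so exclude it.
    return operator == Operator.MUST_PASS_ONE;
  }
}
```

With this sketch, "(key matches user_.* OR admin_.*) AND key matches .*_42" is an outer MUST_PASS_ALL set containing an inner MUST_PASS_ONE set and one more regular expression filter.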
I've tested with my own tests but still need to write one in the HBase project.
I had a hard time getting my Eclipse formatter to wrap lines on boolean operators as you suggested, so I did it manually in this case. Could you post an export of your formatter settings, or have you already done so somewhere? That way I can be sure I'm formatting exactly as you are.
> Add RowFilter to HRegion.HScanner
> ---------------------------------
>
> Key: HADOOP-1531
> URL: https://issues.apache.org/jira/browse/HADOOP-1531
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/hbase
> Affects Versions: 0.14.0
> Reporter: James Kennedy
> Assignee: James Kennedy
> Attachments: RowFilter-v2.patch, RowFilter.patch
>
>
> I've implemented a RowFilterInterface and a RowFilter implementation. This is passed to the HRegion.HScanner via HClient.openScanner() though it is an entirely optional parameter.
> HScanner applies the filter in the next() call by iterating until it encounters a row that is not filtered by the RowFilter. The filter applies criteria based on row keys and/or column data values.
> Null values are a little tricky, since the resultSet in that loop may represent nulls as absent columns or as DELETED_BYTES. Nevertheless, null cases are taken care of by the filter, and you can, for example, retrieve all rows where column X = null.
> The initial RowFilter implementation is limited in several ways:
> * Equality tests against literal values only. No !=, <, >, etc., and no col1 == col2. This is a straight-up byte[] comparison.
> * Multiple column criteria are treated as an implicit conjunction; no disjunction is possible.
> * Row key criteria can only be a regular expression.
> * Row key criteria are independent of column criteria. There is no "if rowkey.matches(A) and col1==B", although the interface is designed to allow for that.
> But it should be easy to write an improved RowFilterInterface implementation to take care of most of the above without having to change code elsewhere.
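
As a rough illustration of the behaviour described in the issue above, the sketch below shows a scanner whose next() keeps iterating until it finds a row the filter does not exclude. The filter does byte[] equality on column values and treats both an absent column and a deletion marker as null. Everything here is an assumption for the sake of the example: ScannedRow, ColumnEqualsFilter and FilteringScanner are invented names, DELETED_MARKER is only a stand-in for DELETED_BYTES, and none of it is the actual HScanner code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for one scanned row: a key plus column name -> value.
class ScannedRow {
  final String key;
  final Map<String, byte[]> columns;

  ScannedRow(String key, Map<String, byte[]> columns) {
    this.key = key;
    this.columns = columns;
  }
}

// Hypothetical filter: every listed column must equal its expected bytes
// (implicit conjunction). An expected value of null means "column is null".
class ColumnEqualsFilter {
  // Placeholder deletion marker; the real DELETED_BYTES value is not given here.
  static final byte[] DELETED_MARKER = "<deleted>".getBytes(StandardCharsets.UTF_8);

  private final Map<String, byte[]> expected;

  ColumnEqualsFilter(Map<String, byte[]> expected) {
    this.expected = new HashMap<>(expected); // HashMap permits null values
  }

  // Returns true when the row should be EXCLUDED.
  boolean filter(ScannedRow row) {
    for (Map.Entry<String, byte[]> e : expected.entrySet()) {
      byte[] actual = row.columns.get(e.getKey());
      // A null may show up as an absent column or as the deletion marker.
      boolean actualIsNull = actual == null || Arrays.equals(actual, DELETED_MARKER);
      if (e.getValue() == null) {
        if (!actualIsNull) {
          return true;
        }
      } else if (actualIsNull || !Arrays.equals(actual, e.getValue())) {
        return true;
      }
    }
    return false;
  }
}

class FilteringScanner {
  private final Iterator<ScannedRow> source;
  private final ColumnEqualsFilter filter; // may be null: filtering is optional

  FilteringScanner(Iterator<ScannedRow> source, ColumnEqualsFilter filter) {
    this.source = source;
    this.filter = filter;
  }

  // Returns the next row that passes the filter, or null when the scan is exhausted.
  ScannedRow next() {
    while (source.hasNext()) {
      ScannedRow row = source.next();
      if (filter == null || !filter.filter(row)) {
        return row;
      }
    }
    return null;
  }
}
```

Keeping the skip loop inside next() means the caller only ever sees rows that passed, and a different filter implementation can be swapped in without changing the scanner.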