Posted to issues@hbase.apache.org by "YiFeng Jiang (JIRA)" <ji...@apache.org> on 2011/01/25 11:38:43 UTC

[jira] Created: (HBASE-3477) Filter for deprecated mapred APIs doesn't work when the table has few rows

Filter for deprecated mapred APIs doesn't work when the table has few rows
--------------------------------------------------------------------------

                 Key: HBASE-3477
                 URL: https://issues.apache.org/jira/browse/HBASE-3477
             Project: HBase
          Issue Type: Bug
          Components: filters
    Affects Versions: 0.90.0
         Environment: Linux (Debian), master 1, slaves 2
            Reporter: YiFeng Jiang


It seems that filters are not invoked when the table contains only a few rows.

I added some logging to org.apache.hadoop.hbase.filter.PrefixFilter, and I have a MyInputFormat class that extends hbase.mapred.TableInputFormat, which is part of the deprecated mapred APIs.

The logging added to PrefixFilter:
{noformat} 
  public boolean filterRowKey(byte[] buffer, int offset, int length) {
    log.info("TODO: filterRowKey invoked");
    if (buffer == null || this.prefix == null) {
      log.info("TODO: #1 of filter");
      return true;
    }
    if (length < prefix.length) {
    ...
  }
{noformat} 

This is the code in MyInputFormat's configure method:
{noformat} 
byte[] prefix = Bytes.toBytes("001");
Filter filter = new PrefixFilter(prefix);
setRowFilter(filter);
{noformat} 
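
As a point of reference, the row selection expected from the "001" prefix can be sketched without HBase. This is only an illustration of the expected semantics, not the HBase implementation: the class PrefixFilterSketch, its methods, and the sample row keys are all hypothetical, and the comparison logic is an assumption about what a prefix filter should do (return true to filter a row out).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in used only to illustrate expected prefix-filter behavior.
public class PrefixFilterSketch {

    // Returns true when the row key should be filtered OUT (assumed semantics).
    public static boolean filterRowKey(byte[] prefix, byte[] buffer, int offset, int length) {
        if (buffer == null || prefix == null) {
            return true;
        }
        if (length < prefix.length) {
            return true; // key is shorter than the prefix, so it cannot match
        }
        for (int i = 0; i < prefix.length; i++) {
            if (buffer[offset + i] != prefix[i]) {
                return true; // mismatch inside the prefix range
            }
        }
        return false; // key starts with the prefix, keep the row
    }

    // Applies the check to a list of row keys and returns the ones that are kept.
    public static List<String> keep(String prefix, List<String> rowKeys) {
        byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
        List<String> kept = new ArrayList<>();
        for (String key : rowKeys) {
            byte[] k = key.getBytes(StandardCharsets.UTF_8);
            if (!filterRowKey(p, k, 0, k.length)) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample keys; prints [001, 001a]
        System.out.println(keep("001", Arrays.asList("000a", "001", "001a", "002", "01")));
    }
}
```

If the filter were applied to a small table, only rows whose keys start with "001" should reach the mapper, as in this sketch.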

And this is the job setup code:
{noformat} 
job.setInputFormat(MyInputFormat.class);
FileInputFormat.addInputPaths(job, "my_table_in_hbase");
job.set(TableInputFormat.COLUMN_LIST, "data:");
{noformat} 

When I put a lot of data (> 500,000 rows) in the table, the filter works well. But when I put only a little data (< 100 rows) in the table, the filter does not seem to be invoked, and the logging in the filter produces no output either.

This is the log output when there is a lot of data in the table:
{noformat} 
2011-01-25 16:43:59,568 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: default constructor
2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterRowKey invoked
2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: #3 of filter
2011-01-25 16:44:01,728 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
2011-01-25 16:44:01,729 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
2011-01-25 16:44:01,729 INFO org.apache.hadoop.hbase.filter.PrefixFilter: TODO: filterAllRemaining invoked
{noformat} 
