Posted to user@hbase.apache.org by Yifeng Jiang <yi...@mail.rakuten.co.jp> on 2011/01/24 10:31:06 UTC

Filter for deprecated mapred APIs

Hi,

I have a MyFilter class that extends FilterBase, and a MyInputFormat that
extends hbase.mapred.TableInputFormat (the deprecated mapred API).

It seems that the filter is not invoked when there are only a few
rows in the table.

This is the code in my InputFormat's configure method.

String startDate = job.get("key.startdate");
String endDate = job.get("key.enddate");
Filter filter = new MyFilter(sdf.parse(startDate), sdf.parse(endDate));
setRowFilter(filter);
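
The MyFilter class itself is not shown in this thread. As a minimal sketch
only, assuming it keeps rows whose date falls between the two configured
values, the parsing and range check might look like the plain-Java stand-in
below. The class name DateRangeCheck, the yyyy-MM-dd pattern, the UTC time
zone, and the half-open [start, end) interval are all assumptions, and the
sketch deliberately leaves out the HBase Filter interface.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical stand-in for the date-range logic a filter like MyFilter
// might contain. It is not an HBase Filter and not the poster's code.
final class DateRangeCheck {
    private final Date start; // inclusive lower bound
    private final Date end;   // exclusive upper bound

    DateRangeCheck(Date start, Date end) {
        this.start = start;
        this.end = end;
    }

    // Parses a day string. The pattern and time zone are assumed; the
    // original post does not show how its "sdf" object is constructed.
    static Date parseDay(String text) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
        sdf.setLenient(false);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return sdf.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad date: " + text, e);
        }
    }

    // Returns true when candidate lies in [start, end).
    boolean accepts(Date candidate) {
        return !candidate.before(start) && candidate.before(end);
    }
}
```

Running a check like this against a handful of known dates, outside of
HBase, is one way to separate a logic error in the filter from the filter
simply never being called.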

And this is the job setup code.

job.setInputFormat(MyInputFormat.class);
FileInputFormat.addInputPaths(job, "my_table_in_hbase");
job.set(TableInputFormat.COLUMN_LIST, "data:");

When I put a lot of data (> 500,000 rows) in the table, the filter works
well, but when I put only a little data (< 100 rows) in the table, the
filter does not function at all, and the logging in the filter produces no
output either.

Is there something wrong in my code?
I'm using HBase 0.90.0 and have put MyFilter on HBase's classpath.

Thanks.

-- 
Yifeng Jiang


Re: Filter for deprecated mapred APIs

Posted by Yifeng Jiang <yi...@mail.rakuten.co.jp>.
Hi Stack,

I have tested it using org.apache.hadoop.hbase.filter.PrefixFilter.
The result is the same as with my own filter.

So I created the following issue.
Please see the details there.
https://issues.apache.org/jira/browse/HBASE-3477

Thanks.

On 01/25/2011 02:44 PM, Yifeng Jiang wrote:
> I will test it again using the filters shipped with hbase itself, and 
> give a report later.
>
> Thanks


-- 
Yifeng Jiang


Re: Filter for deprecated mapred APIs

Posted by Yifeng Jiang <yi...@mail.rakuten.co.jp>.
I will test it again using the filters shipped with hbase itself, and 
give a report later.

Thanks

On 01/25/2011 02:30 AM, Stack wrote:
> Want to try with one of the filters we ship to see if it has the
> same issue?  If so, please file an issue.  That's a pretty serious bug.
>
> Thanks,
> St.Ack


-- 
Yifeng Jiang


Re: Filter for deprecated mapred APIs

Posted by Stack <st...@duboce.net>.
Want to try with one of the filters we ship to see if it has the
same issue?  If so, please file an issue.  That's a pretty serious bug.

Thanks,
St.Ack
