Posted to user@hbase.apache.org by Michael Seibold <se...@in.tum.de> on 2009/03/03 14:04:16 UTC

HBase: scanner with custom filter - filterRowKey: rowKey == null

Hi,

I'm not sure how the filterRowKey(byte[] rowKey) method of
RowFilterInterface works.

It seems that HBase may call this method with null as the argument.
When is the rowKey parameter null, and why?

I have used the following custom filter to verify this:

public class HFilter implements RowFilterInterface {

	public HFilter() {
		
	}

	public boolean filterRowKey(byte[] rowKey) {
		if (rowKey == null) {
			throw new RuntimeException("rowKey == null");
		}
		return false;
	}
}

After a while I get the following exception:

192.168.0.11:32806: error: java.io.IOException: java.lang.RuntimeException: rowKey == null
java.io.IOException: java.lang.RuntimeException: rowKey == null
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:687)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:677)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1586)
        at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

I tried this with HBase 0.19.0 and HDFS (Hadoop 0.19.0) running together
on one server.

This is why I suspect that HBase may call filterRowKey with null as the
argument. I don't understand why this would happen. Is null a special
case that I have to handle in this method?

Kind regards,
Michael


Re: HBase: scanner with custom filter - filterRowKey: rowKey == null

Posted by Michael Seibold <se...@in.tum.de>.
Hi,

Yes, the table contains data, and without the filter the scanner works
and returns rows.

I have now looked at the code of StopRowFilter. It does the following:
    if (rowKey == null) {
      if (this.stopRowKey == null) {
        return true;
      }
      return false;
    }
    return Bytes.compareTo(stopRowKey, rowKey) <= 0;

So it seems OK to return false if rowKey == null.
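
To make the branches of that snippet easier to follow, here is a minimal, self-contained sketch of the same decision logic. It is not the HBase source: the class name StopRowCheck and the helper compareBytes are hypothetical, and compareBytes stands in for Bytes.compareTo under the assumption that it compares byte arrays lexicographically, treating each byte as unsigned.

```java
// Hypothetical stand-in for the StopRowFilter row-key check quoted above.
final class StopRowCheck {
    private final byte[] stopRowKey;

    StopRowCheck(byte[] stopRowKey) {
        this.stopRowKey = stopRowKey;
    }

    // Returns true when the row should be filtered out (skipped).
    boolean filterRowKey(byte[] rowKey) {
        if (rowKey == null) {
            // No row key was supplied: filter only if no stop row is configured.
            return stopRowKey == null;
        }
        // Filter every row at or past the stop row. Like the quoted snippet,
        // this assumes stopRowKey is non-null whenever a real row key arrives.
        return compareBytes(stopRowKey, rowKey) <= 0;
    }

    // Lexicographic comparison treating each byte as unsigned
    // (assumed to match the behaviour of HBase's Bytes.compareTo).
    static int compareBytes(byte[] left, byte[] right) {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return left.length - right.length;
    }
}
```

With a non-null stop row, a null rowKey falls into the first branch and is reported as "not filtered", which matches the observation that returning false for null lets the scan continue.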

When I include the following:
    if (rowKey == null) {
      return false;
    }
the test runs through without problems. I will now do some more testing
to verify that exactly the same data is returned with the filter as
without it (using application-level filtering instead).
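
As an illustration of that null check inside a filter that actually rejects rows, here is a minimal sketch. RowKeyFilter is a hypothetical one-method stand-in for RowFilterInterface (the real interface declares more methods, which are omitted here), and PrefixKeyFilter is an invented example, not part of HBase. Returning false for a null key is an assumption taken from the StopRowFilter code above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical one-method stand-in for HBase's RowFilterInterface.
interface RowKeyFilter {
    // Returns true when the row should be filtered out.
    boolean filterRowKey(byte[] rowKey);
}

// Hypothetical filter that keeps only rows whose key starts with a prefix.
final class PrefixKeyFilter implements RowKeyFilter {
    private final byte[] prefix;

    PrefixKeyFilter(String prefix) {
        this.prefix = prefix.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public boolean filterRowKey(byte[] rowKey) {
        if (rowKey == null) {
            // No row to judge: report "not filtered" instead of throwing.
            return false;
        }
        if (rowKey.length < prefix.length) {
            return true; // too short to carry the prefix
        }
        for (int i = 0; i < prefix.length; i++) {
            if (rowKey[i] != prefix[i]) {
                return true; // prefix mismatch, filter the row out
            }
        }
        return false; // prefix matches, keep the row
    }
}
```

The null branch comes first so that the rest of the method can assume a real key and never dereferences null.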

I would still like to know why HBase calls the method with
rowKey == null and what the developers intended by that.

The code in StopRowFilter also doesn't make much sense to me. Why
would I create a StopRowFilter with stopRowKey == null? As a user I
would assume that there can't be any row with rowKey == null, and I
would never filter for that.

If no row is found in a particular region, wouldn't it be more
logical to simply not call filterRowKey, which is supposed to be
called for every row (if there is no row, don't call it)?
I'm just curious.

Thanks,
Michael

On Tue, 03-03-2009 at 23:15 -0800, stack wrote:
> It's called on the server in the StoreScanner#next method.  It can be null if
> we did not find a row in a particular region.  Are you sure your table has
> data in it and that your scanner returns rows when no filter is in place?
> Perhaps just log nulls rather than kill the scanner, and see if we move past
> the null?
> 
> Here is how it is called at line #164:
> 
>       filtered = dataFilter != null ? dataFilter.filterRowKey(chosenRow) : false;
> 
> See above this line for where we choose the next row from the backing
> memcache and store files.
> 
> St.Ack


Re: HBase: scanner with custom filter - filterRowKey: rowKey == null

Posted by stack <st...@duboce.net>.
It's called on the server in the StoreScanner#next method.  It can be null if
we did not find a row in a particular region.  Are you sure your table has
data in it and that your scanner returns rows when no filter is in place?
Perhaps just log nulls rather than kill the scanner, and see if we move past
the null?

Here is how it is called at line #164:

      filtered = dataFilter != null ? dataFilter.filterRowKey(chosenRow) : false;

See above this line for where we choose the next row from the backing
memcache and store files.
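
A minimal sketch of the "log nulls rather than kill the scanner" suggestion follows. Everything in it is hypothetical: NullCountingFilter is an invented filter, and ScanSimulation is a much simplified model of a scan (each region is just a list of keys), not the StoreScanner code. It only assumes what is stated above, namely that the filter is called with null when no row is found in a region.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical filter that counts and logs null row keys instead of throwing.
final class NullCountingFilter {
    private int nullCalls = 0;

    // Returns true when the row should be filtered out; this filter keeps everything.
    boolean filterRowKey(byte[] rowKey) {
        if (rowKey == null) {
            nullCalls++;
            System.err.println("filterRowKey called with null (call #" + nullCalls + ")");
        }
        return false;
    }

    int nullCalls() {
        return nullCalls;
    }
}

// Simplified simulation of a scan over several regions (not HBase code).
final class ScanSimulation {
    static List<String> scan(List<List<String>> regions, NullCountingFilter filter) {
        List<String> returned = new ArrayList<>();
        for (List<String> region : regions) {
            if (region.isEmpty()) {
                // No row found in this region: the filter still sees one call, with null.
                filter.filterRowKey(null);
                continue;
            }
            for (String key : region) {
                byte[] chosenRow = key.getBytes(StandardCharsets.UTF_8);
                if (!filter.filterRowKey(chosenRow)) {
                    returned.add(key);
                }
            }
        }
        return returned;
    }
}
```

If the logged nulls are followed by further non-null calls, the scan has moved past the empty region, which is the behaviour the suggestion is meant to confirm.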

St.Ack

