Posted to user@hbase.apache.org by Tom Brown <to...@gmail.com> on 2012/08/03 22:27:36 UTC
Need to fast-forward a scanner inside a coprocessor
I have a custom coprocessor that aggregates a selection of records
from the table based on various criteria. For efficiency, I would like to
make it skip a bunch of records. For example, if I don't need any
"AAAA" records and I encounter "AAAA0000", I would like to tell it to
skip everything until "AAAB.."
I don't see any methods of the InternalScanner class that would give
me that ability. Do I need to close the current scanner and open a new
one? Does that add significant overhead (which would reduce any gains
achieved by skipping small numbers of records)?
I am using HBase 0.92. Upgrading to 0.94 is possible if it gives this
functionality.
--Tom
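Editor's note: the skip target in the example above ("AAAB" for the prefix "AAAA") can be derived mechanically. The following is a minimal sketch, assuming row keys compare as unsigned bytes in lexicographic order, which is how HBase orders rows. The class and method names (PrefixSkip, nextPrefix) are hypothetical and not part of HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PrefixSkip {
    /**
     * Returns the smallest key that sorts after every key starting with
     * the given prefix, or null if no such key exists (the prefix is
     * empty or consists only of 0xFF bytes).
     */
    public static byte[] nextPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return null;
        }
        // Keep everything up to position i and increment that byte.
        byte[] next = Arrays.copyOf(prefix, i + 1);
        next[i]++;
        return next;
    }

    public static void main(String[] args) {
        byte[] target = nextPrefix("AAAA".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(target, StandardCharsets.US_ASCII)); // prints AAAB
    }
}
```

The resulting byte array is the kind of value one would pass as the seek target once a scanner that supports forward seeking is available.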
Re: Need to fast-forward a scanner inside a coprocessor
Posted by lars hofhansl <lh...@yahoo.com>.
Oh... I just meant you need to have your hands on a RegionScanner :)
As long as you only scan forward it should work.
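Editor's note: to illustrate the forward-only pattern described here, the sketch below models a scanner over sorted row keys with plain Java collections. It is not HBase code: SortedRowScanner is a hypothetical stand-in for a RegionScanner, its reseek(String) only mimics the idea of RegionScanner.reseek(byte[] row), and rejecting backward targets with an exception is a choice made for this sketch, not a statement about what HBase does. Row keys are ASCII strings here, so String ordering matches byte ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class ForwardScanDemo {
    /** Hypothetical stand-in for a scanner over sorted row keys. */
    static class SortedRowScanner {
        private final TreeSet<String> rows;
        private String position; // next row to return, or null when exhausted

        SortedRowScanner(TreeSet<String> rows) {
            this.rows = rows;
            this.position = rows.isEmpty() ? null : rows.first();
        }

        /** Returns the next row and advances, or returns null at the end. */
        String next() {
            String current = position;
            if (current != null) {
                position = rows.higher(current);
            }
            return current;
        }

        /** Moves forward to the first row >= target; backward targets are rejected. */
        void reseek(String target) {
            if (position == null) {
                return; // already exhausted, nothing to skip
            }
            if (target.compareTo(position) < 0) {
                throw new IllegalArgumentException("reseek must move forward");
            }
            position = rows.ceiling(target);
        }
    }

    /** Collects rows, jumping to resumeAt whenever a row starts with skipPrefix. */
    static List<String> scanSkipping(TreeSet<String> rows, String skipPrefix, String resumeAt) {
        SortedRowScanner scanner = new SortedRowScanner(rows);
        List<String> kept = new ArrayList<>();
        String row;
        while ((row = scanner.next()) != null) {
            if (row.startsWith(skipPrefix)) {
                scanner.reseek(resumeAt); // skip the rest of the unwanted range
                continue;
            }
            kept.add(row);
        }
        return kept;
    }

    public static void main(String[] args) {
        TreeSet<String> rows = new TreeSet<>(Arrays.asList(
                "AAA90000", "AAAA0000", "AAAA0001", "AAAA9999", "AAAB0000", "AAAC0001"));
        System.out.println(scanSkipping(rows, "AAAA", "AAAB"));
        // prints [AAA90000, AAAB0000, AAAC0001]
    }
}
```

In this model only the first "AAAA" row is ever returned by next(); the remaining "AAAA" rows are passed over by the single reseek call.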
Re: Need to fast-forward a scanner inside a coprocessor
Posted by Tom Brown <to...@gmail.com>.
So I understand I'll need to upgrade to 0.94 (which won't be a problem
because the releases are binary-compatible). I see that the
RegionScanner interface contains the new method "reseek(byte[] row)".
I have a reference to a RegionScanner in my coprocessor because I'm
using: getEnvironment().getRegion().getScanner(scan).
What I don't understand is your conditional statement "it depends
specifically on where you hook this up". I'm not doing anything with
"postScannerOpen". Since I have an instance of a RegionScanner,
should I expect "reseek" to work, as long as I'm seeking forward? Is
the way I'm using it compatible with how it should work?
--Tom
On Fri, Aug 3, 2012 at 3:05 PM, lars hofhansl <lh...@yahoo.com> wrote:
> We recently added a new API for that:
> RegionScanner.reseek(...). See HBASE-5520. 0.94+ only, unfortunately.
>
> So it depends specifically on where you hook this up. If you do it at RegionObserver.postScannerOpen you can reseek forward at any time.
Re: Need to fast-forward a scanner inside a coprocessor
Posted by lars hofhansl <lh...@yahoo.com>.
We recently added a new API for that:
RegionScanner.reseek(...). See HBASE-5520. 0.94+ only, unfortunately.
So it depends specifically on where you hook this up. If you do it at RegionObserver.postScannerOpen you can reseek forward at any time.
-- Lars