Posted to user@hbase.apache.org by Alex Baranau <al...@gmail.com> on 2018/10/25 00:29:13 UTC

What is the best way to return scanner stats to client from co-processor?

Hi,

I'd like to return some scanner statistics from a coprocessor back to the
client, e.g. the number of rows filtered out. There isn't a way to add
arbitrary attributes to a Result object, which would be the simplest
approach; I could do that in postScannerNext, for example.

The only way I can think of is to add a special Result object to the result
set, again in postScannerNext. The Result object would carry the needed info
as bytes and could be watched for and interpreted on the client side. Is
that reasonable at all? Is it dangerous to do that in postScannerNext, or in
general? Would it interfere with any client-side or server-side logic?
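
To make the idea concrete, here is a minimal sketch of the encode/detect/decode
round trip. It deliberately uses a plain stand-in class (Row) instead of the
HBase Result type, so it runs without HBase on the classpath. The marker row
key, the two counters, and all class and method names are hypothetical and
chosen only for illustration; a real coprocessor would build an actual Result
and would need a marker key that cannot collide with real row keys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScanStatsMarker {
    // Hypothetical reserved row key; it must never collide with a real row key.
    static final byte[] MARKER_ROW = "__scan_stats__".getBytes(StandardCharsets.UTF_8);

    /** Plain stand-in for an HBase Result: a row key plus a single value. */
    static final class Row {
        final byte[] key;
        final byte[] value;
        Row(byte[] key, byte[] value) { this.key = key; this.value = value; }
    }

    /** Server side: pack the counters into a fixed-width payload under the marker key. */
    static Row encodeStats(long rowsScanned, long rowsFiltered) {
        ByteBuffer buf = ByteBuffer.allocate(2 * Long.BYTES);
        buf.putLong(rowsScanned).putLong(rowsFiltered);
        return new Row(MARKER_ROW, buf.array());
    }

    /** Client side: a row is a stats marker if it carries the reserved key. */
    static boolean isMarker(Row row) {
        return Arrays.equals(row.key, MARKER_ROW);
    }

    /** Client side: returns {rowsScanned, rowsFiltered}. */
    static long[] decodeStats(Row marker) {
        ByteBuffer buf = ByteBuffer.wrap(marker.value);
        return new long[] { buf.getLong(), buf.getLong() };
    }

    /** Client side: drop marker rows from a batch and add their counters to totals. */
    static List<Row> stripMarkers(List<Row> batch, long[] totals) {
        List<Row> data = new ArrayList<>();
        for (Row row : batch) {
            if (isMarker(row)) {
                long[] stats = decodeStats(row);
                totals[0] += stats[0];
                totals[1] += stats[1];
            } else {
                data.add(row);
            }
        }
        return data;
    }

    public static void main(String[] args) {
        List<Row> batch = new ArrayList<>();
        batch.add(new Row("row-1".getBytes(StandardCharsets.UTF_8),
                          "a".getBytes(StandardCharsets.UTF_8)));
        // Simulates the marker being appended at the end of a batch on the server.
        batch.add(encodeStats(100, 99));

        long[] totals = new long[2];
        List<Row> data = stripMarkers(batch, totals);
        System.out.println(data.size() + " data row(s), scanned=" + totals[0]
                + ", filtered=" + totals[1]);
    }
}
```

The client has to strip the marker before handing rows to application code,
which is exactly where any interference with other client logic would show up.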

Any ideas are welcome!

Thank you in advance,
Alex Baranau

Re: What is the best way to return scanner stats to client from co-processor?

Posted by Alex Baranau <al...@gmail.com>.
Hi Stack,

> Does the ScanMetrics feature help at all in your case (see
> AbstractClientScanner#getScanMetrics())?

They are helpful. In fact, we'll make use of those for now.

At the same time, I did a PoC with a "special" Result and it worked well.
We'll probably end up using this approach to get more stats from our custom
filter.

For context, my goal is to compute a "complexity" for each query to better
inform our latency metrics for HBase queries. Latency normalized by
complexity would give a fairer indicator. To make it really useful, the
plain number of records or bytes returned is not enough to define a good
complexity measurement. Things like rows scanned, rows filtered, RPC calls,
etc. in ScanMetrics are very helpful inputs to it, though!
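
As a rough illustration of what I mean by normalizing, here is a minimal
sketch. The weights, the linear formula, and all names are hypothetical
placeholders and not something we have settled on; the counters are passed in
as plain numbers so the example does not depend on HBase classes.

```java
class QueryComplexity {
    // Hypothetical weights; real values would have to be tuned against measurements.
    static final double W_ROWS_SCANNED = 1.0;
    static final double W_ROWS_FILTERED = 0.5;
    static final double W_RPC_CALLS = 50.0;

    /** Weighted sum of the per-query counters. */
    static double complexity(long rowsScanned, long rowsFiltered, long rpcCalls) {
        return W_ROWS_SCANNED * rowsScanned
                + W_ROWS_FILTERED * rowsFiltered
                + W_RPC_CALLS * rpcCalls;
    }

    /** Latency per unit of complexity; complexity is floored at 1 to avoid dividing by zero. */
    static double normalizedLatency(double latencyMs, double complexity) {
        return latencyMs / Math.max(complexity, 1.0);
    }

    public static void main(String[] args) {
        double c = complexity(1000, 900, 2);
        System.out.println("complexity=" + c
                + ", normalized latency=" + normalizedLatency(155.0, c) + " ms/unit");
    }
}
```

Two queries with the same raw latency but very different amounts of scanning
would then end up with clearly different normalized values.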

Good on you sir,
Alex

On Mon, Oct 29, 2018 at 11:05 AM Stack <st...@duboce.net> wrote:

> Hey Alex:
>
> Does the ScanMetrics feature help at all in your case (see
> AbstractClientScanner#getScanMetrics())?
>
> On the Result, yeah, it is intentionally dumb -- a package of Cells. A
> 'special' Result appended by the CP seems like a good way to go though....
> Would be interested if you make progress sir.
>
> S
>

Re: What is the best way to return scanner stats to client from co-processor?

Posted by Stack <st...@duboce.net>.
Hey Alex:

Does the ScanMetrics feature help at all in your case (see
AbstractClientScanner#getScanMetrics())?

On the Result, yeah, it is intentionally dumb -- a package of Cells. A
'special' Result appended by the CP seems like a good way to go though....
Would be interested if you make progress sir.

S