Posted to user@hbase.apache.org by lars hofhansl <lh...@yahoo.com> on 2011/08/09 02:38:51 UTC

Allow RegionCoprocessorEnvironment to register custom scanners?

Currently coprocessors can't do any streaming operations.

I think that would be a necessary feature for performing long-running operations on the server (like scans) that in turn could produce a lot of data.
GroupBy type aggregates come to mind, but there are many more cases.


Somewhere I read about an approach for server-side cursors (can't find that discussion now).
I think a simpler approach would be to allow a coprocessor to register new InternalScanners that it implements,
and then have some way of accessing such a scanner via the normal ClientScanner mechanism.
Maybe by just exposing  long HRegionServer.addScanner(InternalScanner)  through RegionServerServices,
and adding  public ResultScanner getScanner(long scannerId) ...  on HTable, and similarly on all other clients (I don't know anything about the clients besides the HTable Java client).


Or something similar (just making this up here).


That way all major parts are already in place (client scanners are good at caching, the coprocessor could just wrap "real" internal scanners, etc.). The only problem is how to wire up the parts.


Thoughts? Are questions like this better asked on the dev list?

Thanks.

-- Lars

Re: Allow RegionCoprocessorEnvironment to register custom scanners?

Posted by lars hofhansl <lh...@yahoo.com>.
I created https://issues.apache.org/jira/browse/HBASE-4197 and attached a minimal patch there that makes it work for me.



________________________________
From: Andrew Purtell <ap...@apache.org>
To: "user@hbase.apache.org" <us...@hbase.apache.org>; lars hofhansl <lh...@yahoo.com>
Sent: Friday, August 12, 2011 9:39 AM
Subject: Re: Allow RegionCoprocessorEnvironment to register custom scanners?

Thanks Lars. You are the first to try this. Please file a JIRA; it is a bug if you cannot accomplish what you want here.
 
> there should be a RegionScanner interface that includes: public HRegionInfo getRegionName();

I think this is the answer, and a small refactor.

Best regards,


   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: lars hofhansl <lh...@yahoo.com>
>To: Andrew Purtell <ap...@apache.org>; "user@hbase.apache.org" <us...@hbase.apache.org>
>Sent: Thursday, August 11, 2011 11:19 PM
>Subject: Re: Allow RegionCoprocessorEnvironment to register custom scanners?
>
>Hmm...
>
>
>Here's what I get when I try to wrap a scanner in my own Scanner implementation:
>
>
>java.io.IOException: InternalScanner implementation is expected to be HRegion.RegionScanner.
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2023)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>        at java.lang.reflect.Method.invoke(Method.java:616)
>        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:314)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1225)
>
>
>Indeed in HRegionServer.next(...) I see this:
>...
>
>    InternalScanner s = this.scanners.get(scannerName);
>
>...
>
>      // Call coprocessor. Get region info from scanner.
>      HRegion region = null;
>      if (s instanceof HRegion.RegionScanner) {
>        HRegion.RegionScanner rs = (HRegion.RegionScanner) s;
>        region = getRegion(rs.getRegionName().getRegionName());
>      } else {
>        throw new IOException("InternalScanner implementation is expected " +
>            "to be HRegion.RegionScanner.");
>      }
>
>Forcing all scanners created in {pre|post}ScannerOpen to be subclasses of RegionScanner is not right.
>I see why it's done this way, though (after all only the scanner knows its region).
>
>
>So I guess either a scanner's region should be stored somewhere else, or there should be a RegionScanner interface that includes:
>public HRegionInfo getRegionName();
>
>
>Here is a simple isolated test for this:
>
>public class ScannerTest extends BaseRegionObserver {
>    @Override
>    public InternalScanner postScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
>                                           final Scan scan, final InternalScanner s) throws IOException
>    {
>        // just wrap the passed scanner
>        return new InternalScanner() {
>            private InternalScanner del;
>            { this.del = s;}
>            public boolean next(List<KeyValue> results) throws IOException
>            {
>                return del.next(results);
>            }
>            public boolean next(List<KeyValue> result, int limit) throws IOException
>            {
>                return del.next(result, limit);
>            }
>            public void close() throws IOException {
>                del.close();
>            }
>        };            
>    }
>
>}
>
>
>-- Lars
>
>
>
>________________________________
>From: Andrew Purtell <ap...@apache.org>
>To: "user@hbase.apache.org" <us...@hbase.apache.org>; lars hofhansl <lh...@yahoo.com>
>Sent: Tuesday, August 9, 2011 9:50 AM
>Subject: Re: Allow RegionCoprocessorEnvironment to register custom scanners?
>
>Great! I'm glad you have everything you need, Lars. If at some point you become stuck because there is indeed some control or API surface missing, please write back!
> 
>Best regards,
>
>
>   - Andy
>
>Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>
>
>----- Original Message -----
>> From: lars hofhansl <lh...@yahoo.com>
>> To: "user@hbase.apache.org" <us...@hbase.apache.org>; Andrew Purtell <ap...@apache.org>
>> Cc: 
>> Sent: Monday, August 8, 2011 7:53 PM
>> Subject: Re: Allow RegionCoprocessorEnvironment to register custom scanners?
>> 
>> I see. I just didn't see how you could communicate any information to the 
>> server via a Scan, but now I see Scan.setAttribute(...).
>> 
>> Thanks Andy.
>> 
>> -- Lars
>> 
>> 
>> 
>> ________________________________
>> From: Andrew Purtell <ap...@apache.org>
>> To: "user@hbase.apache.org" <us...@hbase.apache.org>; lars hofhansl <lh...@yahoo.com>
>> Sent: Monday, August 8, 2011 5:50 PM
>> Subject: Re: Allow RegionCoprocessorEnvironment to register custom scanners?
>> 
>> The RegionObserver already wraps all of the scanner operations. 
>> RegionObserver.preScannerOpen can create an InternalScanner and return it 
>> exactly as you propose with "HRegionServer.addScanner(InternalScanner)".
>> 
>> preScannerOpen takes a Scan object.
>> 
>> Only if preScannerOpen does not return an InternalScanner will the RegionServer 
>> look for a "real" InternalScanner.
>> 
>> So I don't see what addScanner would buy you.
>> 
>> Best regards,
>> 
>> 
>>    - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via 
>> Tom White)
>> 
>> 

Re: Allow RegionCoprocessorEnvironment to register custom scanners?

Posted by Andrew Purtell <ap...@apache.org>.
Thanks Lars. You are the first to try this. Please file a JIRA; it is a bug if you cannot accomplish what you want here.
 
> there should be a RegionScanner interface that includes: public HRegionInfo getRegionName();

I think this is the answer, and a small refactor.

Best regards,


   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)



Re: Allow RegionCoprocessorEnvironment to register custom scanners?

Posted by lars hofhansl <lh...@yahoo.com>.
Hmm...


Here's what I get when I try to wrap a scanner in my own Scanner implementation:


java.io.IOException: InternalScanner implementation is expected to be HRegion.RegionScanner.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2023)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:314)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1225)


Indeed in HRegionServer.next(...) I see this:
...

    InternalScanner s = this.scanners.get(scannerName);

...

      // Call coprocessor. Get region info from scanner.
      HRegion region = null;
      if (s instanceof HRegion.RegionScanner) {
        HRegion.RegionScanner rs = (HRegion.RegionScanner) s;
        region = getRegion(rs.getRegionName().getRegionName());
      } else {
        throw new IOException("InternalScanner implementation is expected " +
            "to be HRegion.RegionScanner.");
      }

Forcing all scanners created in {pre|post}ScannerOpen to be subclasses of RegionScanner is not right.
I see why it's done this way, though (after all only the scanner knows its region).


So I guess either a scanner's region should be stored somewhere else, or there should be a RegionScanner interface that includes:
public HRegionInfo getRegionName();


Here is a simple isolated test for this:

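import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.InternalScanner;

public class ScannerTest extends BaseRegionObserver {
    @Override
    public InternalScanner postScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
                                           final Scan scan, final InternalScanner s) throws IOException
    {
        // Just wrap the passed scanner; every call is delegated unchanged.
        return new InternalScanner() {
            private final InternalScanner del = s;
            public boolean next(List<KeyValue> results) throws IOException
            {
                return del.next(results);
            }
            public boolean next(List<KeyValue> result, int limit) throws IOException
            {
                return del.next(result, limit);
            }
            public void close() throws IOException {
                del.close();
            }
        };
    }

}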
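
To make the second option above concrete, here is a rough, self-contained sketch of what a RegionScanner interface could look like and how the check in HRegionServer.next(...) could test against the interface rather than the concrete HRegion.RegionScanner class. All types here are simplified stand-ins, not the real HBase classes: rows and region names are plain Strings, and WrappingScanner, singleRowScanner and regionOf are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RegionScannerSketch {
    // Stand-in for HBase's InternalScanner, reduced to what the sketch needs.
    interface InternalScanner {
        boolean next(List<String> results) throws IOException;
        void close() throws IOException;
    }

    // Hypothetical interface: a scanner that can report which region it belongs to.
    interface RegionScanner extends InternalScanner {
        String getRegionName();
    }

    // A wrapper a coprocessor could return: it delegates every call and
    // passes the region name through, so the server can still find the region.
    static final class WrappingScanner implements RegionScanner {
        private final RegionScanner delegate;
        WrappingScanner(RegionScanner delegate) { this.delegate = delegate; }
        @Override public boolean next(List<String> results) throws IOException {
            return delegate.next(results);
        }
        @Override public void close() throws IOException { delegate.close(); }
        @Override public String getRegionName() { return delegate.getRegionName(); }
    }

    // Mirrors the instanceof check quoted above, but against the interface.
    static String regionOf(InternalScanner s) throws IOException {
        if (s instanceof RegionScanner) {
            return ((RegionScanner) s).getRegionName();
        }
        throw new IOException("InternalScanner implementation is expected to be RegionScanner.");
    }

    // Stand-in for the region's own scanner: yields a single row, then reports no more rows.
    static RegionScanner singleRowScanner(final String region, final String row) {
        return new RegionScanner() {
            private boolean done = false;
            public boolean next(List<String> results) {
                if (!done) { results.add(row); done = true; }
                return false;
            }
            public void close() { }
            public String getRegionName() { return region; }
        };
    }

    public static void main(String[] args) throws IOException {
        RegionScanner wrapped = new WrappingScanner(singleRowScanner("region-1", "row-a"));
        List<String> out = new ArrayList<>();
        wrapped.next(out);
        System.out.println(regionOf(wrapped) + " " + out);
    }
}
```

With an interface like this, a wrapper no longer has to subclass the region's scanner class; it only has to forward getRegionName() from the scanner it wraps.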


-- Lars




Re: Allow RegionCoprocessorEnvironment to register custom scanners?

Posted by Andrew Purtell <ap...@apache.org>.
Great! I'm glad you have everything you need, Lars. If at some point you become stuck because there is indeed some control or API surface missing, please write back!
 
Best regards,


   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)



Re: Allow RegionCoprocessorEnvironment to register custom scanners?

Posted by lars hofhansl <lh...@yahoo.com>.
I see. I just didn't see how you could communicate any information to the server via a Scan, but now I see Scan.setAttribute(...).
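
A minimal sketch of that idea, assuming the attribute is a name mapped to a byte[] value: the client attaches a parameter to the Scan and a server-side hook reads it back. The Scan class below is a stand-in that models only the attribute map (the real Scan is serialized and sent to the region server, which this sketch skips), and the attribute name "example.groupBy" plus the helpers buildScan and requestedGroupBy are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ScanAttributeSketch {
    // Stand-in for the attribute part of a Scan: name -> raw bytes.
    static final class Scan {
        private final Map<String, byte[]> attributes = new HashMap<>();
        void setAttribute(String name, byte[] value) { attributes.put(name, value); }
        byte[] getAttribute(String name) { return attributes.get(name); }
    }

    // Hypothetical attribute name agreed on by client and coprocessor.
    static final String GROUP_BY_ATTR = "example.groupBy";

    // Client side: attach the parameter to the scan before opening the scanner.
    static Scan buildScan(String groupByColumn) {
        Scan scan = new Scan();
        scan.setAttribute(GROUP_BY_ATTR, groupByColumn.getBytes(StandardCharsets.UTF_8));
        return scan;
    }

    // Server side: what a scanner-open hook could do with the Scan it receives.
    // Returns null when the client did not ask for any grouping.
    static String requestedGroupBy(Scan scan) {
        byte[] raw = scan.getAttribute(GROUP_BY_ATTR);
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(requestedGroupBy(buildScan("cf:country")));
        System.out.println(requestedGroupBy(new Scan()));
    }
}
```

Scans that do not carry the attribute are unaffected, so the same table can serve ordinary clients and clients that ask for the coprocessor's behaviour.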

Thanks Andy.

-- Lars




Re: Allow RegionCoprocessorEnvironment to register custom scanners?

Posted by Andrew Purtell <ap...@apache.org>.
The RegionObserver already wraps all of the scanner operations. RegionObserver.preScannerOpen can create an InternalScanner and return it exactly as you propose with "HRegionServer.addScanner(InternalScanner)".

preScannerOpen takes a Scan object.

Only if preScannerOpen does not return an InternalScanner will the RegionServer look for a "real" InternalScanner.

So I don't see what addScanner would buy you.
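
The control flow described above can be sketched roughly as follows. This is a simplified model, not the actual RegionServer code: the interfaces are stand-ins with reduced signatures (the Scan is just a String here), and openScanner, listScanner, countingObserver and drain are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PreScannerOpenSketch {
    // Stand-in scanner: appends rows to results, returns true while more rows remain.
    interface InternalScanner { boolean next(List<String> results); }

    // Stand-in observer: may return its own scanner, or null to decline.
    interface RegionObserver { InternalScanner preScannerOpen(String scanSpec); }

    // Scanner over a fixed list of rows, one row per call.
    static InternalScanner listScanner(List<String> rows) {
        final Iterator<String> it = rows.iterator();
        return results -> {
            if (it.hasNext()) results.add(it.next());
            return it.hasNext();
        };
    }

    // Server side: ask the coprocessor first, fall back to the "real" scanner.
    static InternalScanner openScanner(RegionObserver observer, String scanSpec, List<String> regionRows) {
        InternalScanner fromCoprocessor = observer.preScannerOpen(scanSpec);
        if (fromCoprocessor != null) {
            return fromCoprocessor;
        }
        return listScanner(regionRows);
    }

    // Example observer: answers "count" scans with a single aggregate row.
    static RegionObserver countingObserver(final List<String> regionRows) {
        return spec -> "count".equals(spec)
                ? listScanner(Arrays.asList("count=" + regionRows.size()))
                : null;
    }

    // Client-like loop: keep calling next until the scanner reports no more rows.
    static List<String> drain(InternalScanner s) {
        List<String> out = new ArrayList<>();
        boolean more;
        do { more = s.next(out); } while (more);
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row-a", "row-b");
        RegionObserver observer = countingObserver(rows);
        System.out.println(drain(openScanner(observer, "count", rows)));
        System.out.println(drain(openScanner(observer, "plain", rows)));
    }
}
```

The client-side loop is the same in both cases, which is the point being made: the custom scanner is reached through the ordinary scanner-open path, so no separate registration call is needed.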

Best regards,


   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

