Posted to user@hbase.apache.org by Dave Birdsall <da...@esgyn.com> on 2017/07/24 16:29:55 UTC

EndpointCoprocessors with multi-region access

Hi,

I have a basic question about Endpoint coprocessors.

Suppose I want to write a coprocessor that returns the total number of memstore bytes used by a table.

I can write code that loops through all the regions, asking their region servers to tell me the memstore bytes for each given region, and then add them all up.

Such code, of course, will have a RegionServer talking to other RegionServers in the cluster.

Is there any problem with this? For example, when a RegionServer does an RPC to another RegionServer, does that tie up a thread in the calling RegionServer? And if so, and if my coprocessor is popular, might I get deadlocks or thread exhaustion errors if multiple RegionServers run my coprocessor?

The more general architectural question is, should an EndPoint coprocessor limit itself to the regions that are on its own RegionServer? Or does HBase possess appropriate layers to robustly manage prolific cross-server traffic?

Thanks,

Dave

Re: EndpointCoprocessors with multi-region access

Posted by Anoop John <an...@gmail.com>.
You got it right.
Within your EP (each instance handles one region), you can get that
region's memstore size, then add the sizes up on the client side.

-Anoop-

On Tue, Jul 25, 2017 at 12:10 AM, Dave Birdsall <da...@esgyn.com> wrote:
> Hi,
>
> I think I understand the answer.
>
> My question was based on incorrect premises. (You can tell I am new to this.)
>
> The coprocessorService() method will send requests to all region servers serving regions within a given key range. So each coprocessor instance is handling just one region. I suppose one could write badly behaved code in a coprocessor instance that does make cross-server calls, but the natural architecture of an Endpoint coprocessor is to work on one region locally.
>
> The client code that calls coprocessorService() is responsible for processing the set of responses from each region server that was called.
>
> So in my example, some client-side code has to loop through these, adding together the results from each response.
>
> Thanks,
>
> Dave

RE: EndpointCoprocessors with multi-region access

Posted by Dave Birdsall <da...@esgyn.com>.
Hi,

I think I understand the answer.

My question was based on incorrect premises. (You can tell I am new to this.)

The coprocessorService() method will send requests to all region servers serving regions within a given key range. So each coprocessor instance is handling just one region. I suppose one could write badly behaved code in a coprocessor instance that does make cross-server calls, but the natural architecture of an Endpoint coprocessor is to work on one region locally.

The client code that calls coprocessorService() is responsible for processing the set of responses from each region server that was called.

So in my example, some client-side code has to loop through these, adding together the results from each response.
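
As a rough illustration of that client-side loop, here is a minimal sketch in plain Java. It assumes the per-region results have already been collected into a map, as Table.coprocessorService() does when it returns one entry per region. The class name MemstoreTotal, the method sumRegionSizes, the String region keys, and the sample values are all hypothetical stand-ins; no HBase classes are used so that the snippet runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MemstoreTotal {

    // Adds up the memstore sizes reported by each region.
    // Keys are region names (hypothetical Strings here; the HBase client
    // returns byte[] keys), values are the bytes reported by that region.
    static long sumRegionSizes(Map<String, Long> perRegionBytes) {
        long total = 0L;
        for (Long bytes : perRegionBytes.values()) {
            // Defensive assumption: skip regions that returned no value.
            if (bytes != null) {
                total += bytes;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the map an endpoint call would return, one entry per region.
        Map<String, Long> perRegionBytes = new LinkedHashMap<>();
        perRegionBytes.put("region-a", 1024L);
        perRegionBytes.put("region-b", 2048L);
        perRegionBytes.put("region-c", 512L);

        System.out.println("Total memstore bytes: " + sumRegionSizes(perRegionBytes));
    }
}
```

With the sample values above this prints "Total memstore bytes: 3584". The point of the design is that each endpoint instance only reports on its own region, so no RegionServer ever has to call another one; all the fan-out and the summing happen in the client.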

Thanks,

Dave
