You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by "Ma, Ming" <mi...@ebay.com> on 2011/08/24 09:21:45 UTC

coprocessor & table operations

As part of fixing 3229, I have some questions about usage scenarios of coprocessor in table operations, e.g., MasterObserver::preEnableTable, etc.

1.      Operations like preAddColumn, etc. don't honor coprocessor's request to bypass default behavior, as in MasterCoProcessorHost.java. Is that a bug or there is some reason for that?  It seems useful feature if coprocessor can disallow addColumn, for example.
2.      Some of these operations like EnableTable is async. postEnableTable is called right after the request is queued to the thread pool. So there is no guarantee the process is done by the time postEnableTable is called. Yes, postEnableTable won't be called if the validation in the constructor of EnableTableHandler throws exception. Still, what are the scenarios that postEnableTable can be useful in the async implementation?
3.      Method signature in MasterObserver::preCreateTable uses tableDescriptor+splitKey; MasterObserver:: postCreateTable signature uses HRegionInfo. They are essential the same thing. Why can't we use the same signature, like HRegionInfo instead?

If these are indeed issues, I can follow up and fix them.

Ming


Re: coprocessor & table operations

Posted by Gary Helmling <gh...@gmail.com>.
On Fri, Aug 26, 2011 at 5:19 PM, Ma, Ming <mi...@ebay.com> wrote:

> Thanks, Gray, Ted. I will open a jira and provide the fix later.
>
> About where to call such post methods for table operations, after request
> is queued to executor or after EventHandler.process has finished:
>
> 1. It applies to EnableTableHandler and other table operations as well.
> Currently post methods are called after request is queued to executor. So
> whatever way to go, we will apply to all methods.
>

I don't feel strongly whether the post method invocations are prior to
completion of the underlying operation or wait to invoke after completion,
as long as we are consistent in similar cases.  When the operations are
async from the client perspective, I'm not sure it matters.  Though when we
have sync and async versions of the operation from a client perspective, how
should this apply?  Maybe it _would_ be better to defer the post method
invocation to completion in those cases?


> 2. Thread model for coprocessor. For a given operation, pre and post
> methods are currently called on the same thread. That seems to be the case
> for all hooks in hbase. Calling post methods from executor thread pool means
> pre and post methods could be called on different threads. Perhaps calling
> on the same thread just happens to be the implementation; there is no such
> assumption in coprocessor design. If we want to change such behavior, at
> least we can clarify in javadoc so that coprocessors developers know about
> it.
>
>
I don't think we've expressed any guarantee that pre and post methods will
run in the same thread, but it seems like a natural expectation (right or
not) for coprocessor implementors.  If we need to deviate from this
anywhere, I agree we should clearly document that behavior.

RE: coprocessor & table operations

Posted by "Ma, Ming" <mi...@ebay.com>.
Thanks, Gray, Ted. I will open a jira and provide the fix later.

About where to call such post methods for table operations, after request is queued to executor or after EventHandler.process has finished:

1. It applies to EnableTableHandler and other table operations as well. Currently post methods are called after request is queued to executor. So whatever way to go, we will apply to all methods.
2. Thread model for coprocessor. For a given operation, pre and post methods are currently called on the same thread. That seems to be the case for all hooks in hbase. Calling post methods from executor thread pool means pre and post methods could be called on different threads. Perhaps calling on the same thread just happens to be the implementation; there is no such assumption in coprocessor design. If we want to change such behavior, at least we can clarify in javadoc so that coprocessors developers know about it.


-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Friday, August 26, 2011 3:18 PM
To: dev@hbase.apache.org
Subject: Re: coprocessor & table operations

w.r.t. MasterObserver.postCreateTable(), does it make sense to provide this
hook at the end of CreateTableHandler.process() now that HBASE-3229 has been
committed ?

Cheers

On Fri, Aug 26, 2011 at 2:57 PM, Gary Helmling <gh...@gmail.com> wrote:

> Hi Ming,
>
> Sorry, gmail flagged this as spam for some reason, so I didn't see it until
> now.
>
> On Wed, Aug 24, 2011 at 12:21 AM, Ma, Ming <mi...@ebay.com> wrote:
>
> > As part of fixing 3229, I have some questions about usage scenarios of
> > coprocessor in table operations, e.g., MasterObserver::preEnableTable,
> etc.
> >
> > 1.      Operations like preAddColumn, etc. don't honor coprocessor's
> > request to bypass default behavior, as in MasterCoProcessorHost.java. Is
> > that a bug or there is some reason for that?  It seems useful feature if
> > coprocessor can disallow addColumn, for example.
> >
>
> This seems like an oversight that we should fix.  Taking a quick look over
> MasterCoprocessorHost, I would agree that all of the preXXX hooks for
> create, modify, delete of tables should support bypass.  Early on in
> implementing coprocessors, there was some question of to what extent they
> should be able to "break" core HBase functionality.  So I might have left
> bypass handling out of these methods out of an over-abundance of caution,
> but if so it seems unwarranted.  The only hooks that I would say should
> _not_ support "bypass" are preShutdown() and preStopMaster().
>
>
>
> > 2.      Some of these operations like EnableTable is async.
> postEnableTable
> > is called right after the request is queued to the thread pool. So there
> is
> > no guarantee the process is done by the time postEnableTable is called.
> Yes,
> > postEnableTable won't be called if the validation in the constructor of
> > EnableTableHandler throws exception. Still, what are the scenarios that
> > postEnableTable can be useful in the async implementation?
> >
>
> Sounds like we should clearly note this in the javadoc for
> MasterObserver.postEnableTable().  I don't think the async handling itself
> renders postEnableTable not useful, though.  In general I find the postXXX
> hooks most useful for listener-type operations.  As long as implementors
> are
> aware of the timing, I don't think it's necessarily an issue.
>
>
>
> > 3.      Method signature in MasterObserver::preCreateTable uses
> > tableDescriptor+splitKey; MasterObserver:: postCreateTable signature uses
> > HRegionInfo. They are essential the same thing. Why can't we use the same
> > signature, like HRegionInfo instead?
> >
> >
> We might have used that signature for preCreateTable() to mirror the RPC
> call from clients?  But changing this to use HRegionInfo[] would allow us
> to
> also change the return type to HRegionInfo[], which would allow
> implementors
> to override the region splits.  This seems like an additional bonus.
>
> All of these sound like good and useful changes.  You want to open up a
> jira
> Ming?  Feel free to work up a patch as well!  I'd be happy to review.
>
>
> Gary
>

Re: coprocessor & table operations

Posted by Ted Yu <yu...@gmail.com>.
w.r.t. MasterObserver.postCreateTable(), does it make sense to provide this
hook at the end of CreateTableHandler.process() now that HBASE-3229 has been
committed ?

Cheers

On Fri, Aug 26, 2011 at 2:57 PM, Gary Helmling <gh...@gmail.com> wrote:

> Hi Ming,
>
> Sorry, gmail flagged this as spam for some reason, so I didn't see it until
> now.
>
> On Wed, Aug 24, 2011 at 12:21 AM, Ma, Ming <mi...@ebay.com> wrote:
>
> > As part of fixing 3229, I have some questions about usage scenarios of
> > coprocessor in table operations, e.g., MasterObserver::preEnableTable,
> etc.
> >
> > 1.      Operations like preAddColumn, etc. don't honor coprocessor's
> > request to bypass default behavior, as in MasterCoProcessorHost.java. Is
> > that a bug or there is some reason for that?  It seems useful feature if
> > coprocessor can disallow addColumn, for example.
> >
>
> This seems like an oversight that we should fix.  Taking a quick look over
> MasterCoprocessorHost, I would agree that all of the preXXX hooks for
> create, modify, delete of tables should support bypass.  Early on in
> implementing coprocessors, there was some question of to what extent they
> should be able to "break" core HBase functionality.  So I might have left
> bypass handling out of these methods out of an over-abundance of caution,
> but if so it seems unwarranted.  The only hooks that I would say should
> _not_ support "bypass" are preShutdown() and preStopMaster().
>
>
>
> > 2.      Some of these operations like EnableTable is async.
> postEnableTable
> > is called right after the request is queued to the thread pool. So there
> is
> > no guarantee the process is done by the time postEnableTable is called.
> Yes,
> > postEnableTable won't be called if the validation in the constructor of
> > EnableTableHandler throws exception. Still, what are the scenarios that
> > postEnableTable can be useful in the async implementation?
> >
>
> Sounds like we should clearly note this in the javadoc for
> MasterObserver.postEnableTable().  I don't think the async handling itself
> renders postEnableTable not useful, though.  In general I find the postXXX
> hooks most useful for listener-type operations.  As long as implementors
> are
> aware of the timing, I don't think it's necessarily an issue.
>
>
>
> > 3.      Method signature in MasterObserver::preCreateTable uses
> > tableDescriptor+splitKey; MasterObserver:: postCreateTable signature uses
> > HRegionInfo. They are essential the same thing. Why can't we use the same
> > signature, like HRegionInfo instead?
> >
> >
> We might have used that signature for preCreateTable() to mirror the RPC
> call from clients?  But changing this to use HRegionInfo[] would allow us
> to
> also change the return type to HRegionInfo[], which would allow
> implementors
> to override the region splits.  This seems like an additional bonus.
>
> All of these sound like good and useful changes.  You want to open up a
> jira
> Ming?  Feel free to work up a patch as well!  I'd be happy to review.
>
>
> Gary
>

Re: coprocessor & table operations

Posted by Gary Helmling <gh...@gmail.com>.
Hi Ming,

Sorry, gmail flagged this as spam for some reason, so I didn't see it until
now.

On Wed, Aug 24, 2011 at 12:21 AM, Ma, Ming <mi...@ebay.com> wrote:

> As part of fixing 3229, I have some questions about usage scenarios of
> coprocessor in table operations, e.g., MasterObserver::preEnableTable, etc.
>
> 1.      Operations like preAddColumn, etc. don't honor coprocessor's
> request to bypass default behavior, as in MasterCoProcessorHost.java. Is
> that a bug or there is some reason for that?  It seems useful feature if
> coprocessor can disallow addColumn, for example.
>

This seems like an oversight that we should fix.  Taking a quick look over
MasterCoprocessorHost, I would agree that all of the preXXX hooks for
create, modify, delete of tables should support bypass.  Early on in
implementing coprocessors, there was some question of to what extent they
should be able to "break" core HBase functionality.  So I might have left
bypass handling out of these methods out of an over-abundance of caution,
but if so it seems unwarranted.  The only hooks that I would say should
_not_ support "bypass" are preShutdown() and preStopMaster().



> 2.      Some of these operations like EnableTable is async. postEnableTable
> is called right after the request is queued to the thread pool. So there is
> no guarantee the process is done by the time postEnableTable is called. Yes,
> postEnableTable won't be called if the validation in the constructor of
> EnableTableHandler throws exception. Still, what are the scenarios that
> postEnableTable can be useful in the async implementation?
>

Sounds like we should clearly note this in the javadoc for
MasterObserver.postEnableTable().  I don't think the async handling itself
renders postEnableTable not useful, though.  In general I find the postXXX
hooks most useful for listener-type operations.  As long as implementors are
aware of the timing, I don't think it's necessarily an issue.



> 3.      Method signature in MasterObserver::preCreateTable uses
> tableDescriptor+splitKey; MasterObserver:: postCreateTable signature uses
> HRegionInfo. They are essential the same thing. Why can't we use the same
> signature, like HRegionInfo instead?
>
>
We might have used that signature for preCreateTable() to mirror the RPC
call from clients?  But changing this to use HRegionInfo[] would allow us to
also change the return type to HRegionInfo[], which would allow implementors
to override the region splits.  This seems like an additional bonus.

All of these sound like good and useful changes.  You want to open up a jira
Ming?  Feel free to work up a patch as well!  I'd be happy to review.


Gary