Posted to dev@hbase.apache.org by Andrew Purtell <ap...@apache.org> on 2019/08/08 19:20:46 UTC

Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Please let me direct your attention to the tail of HBASE-22623 for a larger
discussion.  I tried to sum it up as follows:

An opinion that we should have more and more coprocessor interfaces to
address new use cases is valid. An opinion that coprocessors are too
invasive and should be 'cleaned up' is also valid. An opinion that the
compatibility headaches of coprocessor interfaces are annoying is valid. An
opinion that Phoenix can be considered as a valid use case when considering
interface changes is valid. An opinion that only HBase level concerns
should motivate API changes is valid. These opinions are strawmen. I think
they approach actual positions in the community but I do not imply any
specific person has one of them. These strawmen are at least partially
contradictory. It is going to be an ongoing process to sort them out into
something that makes sense and can get consensus.

So while as a committer I am moving forward on HBASE-22623 because I don't
see a veto but instead a disagreement on the margins (deprecation or not)
motivated by larger principles, I also want to raise the visibility of the
disagreement because I think it impacts our relationship with another
project at Apache at a minimum, but also future technical directions of an
important subset of interfaces.

For your consideration.

-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Andrew Purtell <ap...@apache.org>.
This is a great point.

Implicitly I've always had an expectation the audience for coprocessors is
expert.

I developed Linux device drivers and loadable kernel modules for years.
Although the vast majority of the kernel was out of scope for any
particular project, it was super trivial for me, the developer, to panic the
kernel with the slightest mistake - a null pointer dereference, an infinite
loop, a mistake in (spin)lock discipline, another hundred details... all of
that comes with the territory. The risks are known and accepted. Nobody in
their right mind installs a random and untrusted user supplied kernel
module. When developing an integrated application that includes a kernel
module the kernel module gets a ton of in house expert scrutiny, or should,
because that is a concentration of risk. Likewise, in every respect, are
coprocessors (IMHO).

There is no concept of safety in the coprocessor interfaces. Coprocessors
are Java bytecode we load into the server process. Java doesn't offer intra
process memory or code safety. What we do try to do is encapsulate and
distinguish public and limited private interfaces from private interfaces
so we know we have total freedom to break the private interfaces, and
likewise we know when compatibility concerns apply. That encapsulation is not
hiding, it is not "protection", it is not "safety".

If you want safety, that is HBASE-4047. (Dang, 60 watchers on that one.)


On Fri, Aug 9, 2019 at 3:57 PM Geoffrey Jacoby <gj...@apache.org> wrote:

> For coprocessors in general, I think Andrew sums the issues up well. I just
> want to add that it's really important to have a mental model of who the
> audience for a coprocessor is when designing. I remember when coprocessors
> were first introduced, I read articles comparing them to triggers and
> stored procedures, and the docs still describe them that way in section
> 112.1.  But they're really not like triggers and SPs, because a
> user-written stored procedure shouldn't be able to crash your database. :-)
> Andrew's comparison to kernel extensions is apt I think.
>
> Writing a Linux kernel extension doesn't require you to be Linus or one of
> his lieutenants, but it does require more sophistication than just writing
> a user-space Linux program does. Their entire purpose is for people who
> aren't at the level of Linus to affect the kernel behavior in limited ways.
> Likewise, with HBase coprocessors. They should provide safe enough
> abstractions that they don't require an HBase committer to implement
> safely, but I think it's fine to require more HBase knowledge than what's
> required to use the client API.
>
> You can't say "a mutable WALEdit is too dangerous" without making
> assumptions about the developer it's too dangerous _for_. Ideally these
> assumptions should be explicit and agreed upon by the community. I think
> differing audience assumptions are where at least some of the disagreements
> are coming from.
>
> Geoffrey
>
>
> On Fri, Aug 9, 2019 at 8:55 AM Andrew Purtell <an...@gmail.com>
> wrote:
>
> > The future of the coprocessor API is an interesting topic. I think we
> > have a range of opinions in the community and I would like to hear more
> > of them.
> >
> > Because of the compatibility headaches sometimes I’d like to rip them
> > out. Sometimes they are essential for accomplishing something in Phoenix
> > or our in house backup solution, for example. My opinion is very mixed.
> >
> > When first conceived the first use case was security and we took
> > inspiration from OS kernel examples in Linux and TrustedBSD (now merged
> > with FreeBSD) where upcall interfaces were made available where
> > authoritative access control decisions are made. The set of hooks has
> > expanded over time as users have requested extensions. This style of
> > interface is powerful in that it enables a mixin approach to composition
> > of functionality and extension. However in retrospect the maintenance
> > burdens were not fully appreciated, at least by me. I had assumed we
> > would be allowed more freedom to change but my experiences with Phoenix
> > educated me on the kind of downstream headaches that result when we make
> > those changes.
> >
> > If I were to do it again I would attempt an abstract and fluent interface
> > where extensions would register intents and receive callbacks in a much
> > more granular way, like:
> >
> >     onRegion().onRPC().onGet().then(...)
> >
> > I suppose this looks kind of like Mockito. There would be no overlarge
> > interfaces full of upcall methods that break source compatibility on
> > every change (in branch-1). Although we could not avoid the complexity
> > of ensuring the right callbacks are invoked at the right places on the
> > right code paths, the extension interface and its types would be
> > decoupled from internals and the kind of compatibility headaches we
> > impose on downstreamers (and ourselves) would mostly disappear. This has
> > been proposed on an old JIRA somewhere...
> >
> > Of course a redo like this would be a very complex and time consuming
> > project, and a port of say something like Phoenix would be a reboot of
> > multi man year efforts, and as far as I know nobody working in the code
> > today has that kind of sustained available time and attention. It’s too
> > late. We have to make the best of the legacy of past engineering choices.
> > If that is not correct then it would be a very pleasant surprise indeed.
> >
> > Given that we have the current model and we have downstreamers like
> > Phoenix depending on them I think we are limited in the kind of clean ups
> > we might like to do and need to be tolerant of the requirement to
> > maintain these interfaces and their functionality across major versions.
> > Deprecation is fine but if it is done without the input of the known
> > consumer are we really being fair? Unclear.
> >
> >
> > > On Aug 8, 2019, at 6:39 PM, 张铎(Duo Zhang) <pa...@gmail.com> wrote:
> > >
> > > When releasing 2.0.0 we faced a lot of problems because we exposed so
> > > many internal classes to CPs. It is really hard to account for both
> > > compatibility and development on HBase. Since then we have done a lot
> > > of work to abstract interfaces for CPs to use and to keep the actual
> > > implementation classes internal to HBase.
> > >
> > > The work is not fully done, so we still have some methods that expose
> > > internal classes, marked with a deprecated annotation. I think most of
> > > them are for Phoenix. This is a trade-off and I think it is acceptable.
> > > Phoenix can still use the deprecated methods, and we will not remove
> > > them unless we find an alternate solution. And I think if anyone wants
> > > to remove them without a replacement you will be the first to jump out
> > > and give a -1 :)
> > >
> > > Specific to HBASE-22623, I still think we should add the deprecated
> > > annotation to keep the API consistent, otherwise users will be confused
> > > about whether they can use the WALEdit. As for the abstraction of
> > > WALEdit, I think we used to rely on a higher level abstraction in
> > > HBASE-20952. But since there is little progress there, I think we can
> > > start the work of abstracting WALEdit only.
> > >
> > > Thanks.
> > >
> >
>
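
For illustration, the fluent registration style quoted above,
onRegion().onRPC().onGet().then(...), could be sketched roughly as follows.
This is a minimal Java sketch under stated assumptions, not an HBase API:
HookRegistry, RegionScope, RpcScope, Registration and fire() are all
hypothetical names, and events are plain strings rather than real request
types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: none of these types exist in HBase.
final class HookRegistry {
    // Callbacks keyed by a dotted intent path such as "region.rpc.get".
    private final Map<String, List<Consumer<String>>> hooks = new HashMap<>();

    // Entry point of the fluent chain: narrow the scope to regions.
    RegionScope onRegion() {
        return new RegionScope();
    }

    // What the (imaginary) server would call at the matching code path.
    // Returns how many callbacks were invoked.
    int fire(String path, String event) {
        List<Consumer<String>> callbacks =
            hooks.getOrDefault(path, Collections.emptyList());
        for (Consumer<String> callback : callbacks) {
            callback.accept(event);
        }
        return callbacks.size();
    }

    final class RegionScope {
        RpcScope onRPC() {
            return new RpcScope("region.rpc");
        }
    }

    final class RpcScope {
        private final String path;

        RpcScope(String path) {
            this.path = path;
        }

        Registration onGet() {
            return new Registration(path + ".get");
        }

        Registration onPut() {
            return new Registration(path + ".put");
        }
    }

    final class Registration {
        private final String path;

        Registration(String path) {
            this.path = path;
        }

        // Terminal step: store the callback under the accumulated path.
        void then(Consumer<String> callback) {
            hooks.computeIfAbsent(path, k -> new ArrayList<>()).add(callback);
        }
    }

    public static void main(String[] args) {
        HookRegistry registry = new HookRegistry();
        List<String> seen = new ArrayList<>();
        registry.onRegion().onRPC().onGet().then(seen::add);
        registry.fire("region.rpc.get", "row-1");
        registry.fire("region.rpc.put", "row-2"); // no callback registered
        System.out.println(seen);
    }
}
```

In this shape an extension names only the intents it cares about, so adding
a new hook would not change an interface that existing extensions already
implement.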


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Stack <st...@duboce.net>.
On Sat, Aug 10, 2019 at 5:58 PM Andrew Purtell <an...@gmail.com>
wrote:

> .... And yet it is also paranoia in this instance because we are talking
> specifically of WALKey annotations made ONCE and BEFORE commit. Warnings
> are duly noted and acknowledged!


ACK the instance of paranoia and the receipt of warning.

S


>
>
> > On Aug 10, 2019, at 4:50 PM, Sean Busbey <bu...@apache.org> wrote:
> >
> > How much of that trouble around debug is a matter of us not doing enough
> > to make clear when a change has happened due to a coprocessor?
> >
> > Heck it's hard enough just saying what coprocessors were active over a
> > time period on a running cluster. If we improved that, would heavy
> > coprocessor use be less concerning?
> >
> >
> > For example, coprocessor actions could go into the audit log.
> >
> > Or in the case of the addition of coprocessor provided metadata about the
> > wal* we could track information about what added said metadata right in
> > the WAL itself.
> >
> >
> >
> >> On Sat, Aug 10, 2019, 18:29 Stack <st...@duboce.net> wrote:
> >>
> >>> On Fri, Aug 9, 2019 at 3:57 PM Geoffrey Jacoby <gj...@apache.org> wrote:
> >>>
> >>>
> >>> You can't say "a mutable WALEdit is too dangerous" without making
> >>> assumptions about the developer it's too dangerous _for_. Ideally these
> >>> assumptions should be explicit and agreed upon by the community. I
> >>> think differing audience assumptions are where at least some of the
> >>> disagreements are coming from.
> >>>
> >>
> >> All sounds good but I have a bit of trouble with your using WALEdit in
> >> your illustration above.
> >>
> >> To be explicit, the reluctance to expose WAL{Edit,Key} is because when
> >> the system goes awry, from experience, debugging is time consuming.
> >> Errors in WAL* accounting are particularly hard to fathom because the
> >> problem usually shows up at a displacement in time and location (replay
> >> or on the other end of a replication). Often the issue triggers only
> >> with large amounts of data. All WAL*s flow together, whatever the
> >> source, out the one firehose. The *assumption* is that a CP writer
> >> randomly mangling a WAL* in a way that happens to break the cluster's
> >> recovery may not be around or game for the fun debug sessions required
> >> to figure out the root cause.
> >>
> >> <joke>We could add a Geoffrey Jacoby annotation on CP APIs?</joke>
> >>
> >> Thanks,
> >> S
> >>
> >>
> >>
> >>
>

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Andrew Purtell <an...@gmail.com>.
We list coprocessors in ClusterStatus, the idea being someone investigating issues has it up front and visible what extensions are installed. Status listings in the UI and status commands in the shell print this. The information in the UI is right there next to no less important attributes like Hadoop and HBase version and compile/build string. 

We also log when coprocessors are installed. If they are installed on a table their instantiation is logged at every region open. 

What we don’t do is have a single common code path for problem reporting, so we can’t do something consistent and obvious like the TAINTED flag in a Linux kernel oops. Perhaps we could refactor every use of >= error level logging to such a facility? That wouldn’t be difficult, just time consuming.
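
A single reporting path of that kind might look like the sketch below. It is
only an illustration under assumptions: ProblemReporter, extensionLoaded()
and the message format are hypothetical, not existing HBase classes, and the
real Linux taint mechanism is more involved than a list of names.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; not an HBase API.
final class ProblemReporter {
    // Class names of every extension loaded into this process, sorted.
    private final Set<String> loadedExtensions = new TreeSet<>();

    // Called once whenever an extension (coprocessor) is installed.
    synchronized void extensionLoaded(String className) {
        loadedExtensions.add(className);
    }

    // True once any third-party code has been loaded.
    synchronized boolean isTainted() {
        return !loadedExtensions.isEmpty();
    }

    // Every error-level report goes through here, so the taint marker
    // shows up consistently next to the problem itself.
    synchronized String format(String message) {
        if (loadedExtensions.isEmpty()) {
            return "ERROR " + message;
        }
        return "ERROR [TAINTED: " + String.join(",", loadedExtensions) + "] "
            + message;
    }

    public static void main(String[] args) {
        ProblemReporter reporter = new ProblemReporter();
        System.out.println(reporter.format("replay failed"));
        reporter.extensionLoaded("org.example.MyObserver");
        System.out.println(reporter.format("replay failed"));
    }
}
```

The design choice here is that callers never decide whether to mention
extensions; the reporter adds the marker on every error once anything has
been loaded.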

That said I think Stack’s paranoia about debugging the WAL is a battle scarred legacy of chasing down bad bugs. It is definitely earned. And I remember some bad bugs. ITBLL’s log parsing tools aren’t there for a laugh. (If you are using them, that is not laughter coming from your mouth, it is a broken noise.) And yet it is also paranoia in this instance because we are talking specifically of WALKey annotations made ONCE and BEFORE commit. Warnings are duly noted and acknowledged! If we want to inject stuff into the WAL over on Phoenix that is our prerogative, and perhaps folly too. But we believe we know what we are doing, and just might. (Smile) I go back to Geoffrey’s point on audience.


> On Aug 10, 2019, at 4:50 PM, Sean Busbey <bu...@apache.org> wrote:
> 
> How much of that trouble around debug is a matter of us not doing enough to
> make clear when a change has happened due to a coprocessor?
> 
> Heck it's hard enough just saying what coprocessors were active over a time
> period on a running cluster. If we improved that, would heavy coprocessor
> use be less concerning?
> 
> 
> For example, coprocessor actions could go into the audit log.
> 
> Or in the case of the addition of coprocessor provided metadata about the
> wal* we could track information about what added said metadata right in the
> WAL itself.
> 
> 
> 
>> On Sat, Aug 10, 2019, 18:29 Stack <st...@duboce.net> wrote:
>> 
>>> On Fri, Aug 9, 2019 at 3:57 PM Geoffrey Jacoby <gj...@apache.org> wrote:
>>> 
>>> 
>>> You can't say "a mutable WALEdit is too dangerous" without making
>>> assumptions about the developer it's too dangerous _for_. Ideally these
>>> assumptions should be explicit and agreed upon by the community. I think
>>> differing audience assumptions are where at least some of the
>>> disagreements are coming from.
>>> 
>> 
>> All sounds good but I have a bit of trouble with your using WALEdit in your
>> illustration above.
>> 
>> To be explicit, the reluctance to expose WAL{Edit,Key} is because when the
>> system goes awry, from experience, debugging is time consuming. Errors in
>> WAL* accounting are particularly hard to fathom because the problem
>> usually shows up at a displacement in time and location (replay or on the
>> other end of a replication). Often the issue triggers only with large
>> amounts of data. All WAL*s flow together, whatever the source, out the one
>> firehose. The *assumption* is that a CP writer randomly mangling a WAL* in
>> a way that happens to break the cluster's recovery may not be around or
>> game for the fun debug sessions required to figure out the root cause.
>> 
>> <joke>We could add a Geoffrey Jacoby annotation on CP APIs?</joke>
>> 
>> Thanks,
>> S
>> 
>> 
>> 
>> 

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Sean Busbey <bu...@apache.org>.
How much of that trouble around debug is a matter of us not doing enough to
make clear when a change has happened due to a coprocessor?

Heck it's hard enough just saying what coprocessors were active over a time
period on a running cluster. If we improved that, would heavy coprocessor
use be less concerning?


For example, coprocessor actions could go into the audit log.

Or in the case of the addition of coprocessor provided metadata about the
wal* we could track information about what added said metadata right in the
WAL itself.
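
One way to picture tracking what added the metadata is the minimal Java
sketch below. It assumes a hypothetical AnnotatedWalKey type (a stand-in, not
the real HBase WALKey) in which every attribute records the name of the
coprocessor that added it and can be set only once.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a WAL key; not the HBase WALKey class.
final class AnnotatedWalKey {
    private final Map<String, byte[]> values = new LinkedHashMap<>();
    private final Map<String, String> sources = new LinkedHashMap<>();

    // Adds an attribute once and remembers which extension added it.
    void addAttribute(String name, byte[] value, String addedBy) {
        if (values.containsKey(name)) {
            throw new IllegalStateException(
                name + " already set by " + sources.get(name));
        }
        values.put(name, value.clone());
        sources.put(name, addedBy);
    }

    // Name of the extension that added the attribute, or null if absent.
    String addedBy(String name) {
        return sources.get(name);
    }

    // Defensive copy of the attribute value, or null if absent.
    byte[] attribute(String name) {
        byte[] value = values.get(name);
        return value == null ? null : value.clone();
    }

    public static void main(String[] args) {
        AnnotatedWalKey key = new AnnotatedWalKey();
        key.addAttribute("tenant", new byte[] {1}, "org.example.MyObserver");
        System.out.println("tenant added by " + key.addedBy("tenant"));
    }
}
```

With provenance stored next to the value, someone reading the log later can
tell which extension to look at without knowing what was installed at the
time.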



On Sat, Aug 10, 2019, 18:29 Stack <st...@duboce.net> wrote:

> On Fri, Aug 9, 2019 at 3:57 PM Geoffrey Jacoby <gj...@apache.org> wrote:
>
> >
> > You can't say "a mutable WALEdit is too dangerous" without making
> > assumptions about the developer it's too dangerous _for_. Ideally these
> > assumptions should be explicit and agreed upon by the community. I think
> > differing audience assumptions are where at least some of the
> > disagreements are coming from.
> >
>
> All sounds good but I have a bit of trouble with your using WALEdit in your
> illustration above.
>
> To be explicit, the reluctance to expose WAL{Edit,Key} is because when the
> system goes awry, from experience, debugging is time consuming. Errors in
> WAL* accounting are particularly hard to fathom because the problem
> usually shows up at a displacement in time and location (replay or on the
> other end of a replication). Often the issue triggers only with large
> amounts of data. All WAL*s flow together, whatever the source, out the one
> firehose. The *assumption* is that a CP writer randomly mangling a WAL* in
> a way that happens to break the cluster's recovery may not be around or
> game for the fun debug sessions required to figure out the root cause.
>
> <joke>We could add a Geoffrey Jacoby annotation on CP APIs?</joke>
>
> Thanks,
> S

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Stack <st...@duboce.net>.
On Fri, Aug 9, 2019 at 3:57 PM Geoffrey Jacoby <gj...@apache.org> wrote:

>
> You can't say "a mutable WALEdit is too dangerous" without making
> assumptions about the developer it's too dangerous _for_. Ideally these
> assumptions should be explicit and agreed upon by the community. I think
> differing audience assumptions are where at least some of the disagreements
> are coming from.
>

All sounds good but I have a bit of trouble with your using WALEdit in your
illustration above.

To be explicit, the reluctance to expose WAL{Edit,Key} is because when the
system goes awry, from experience, debugging is time-consuming. Errors in WAL*
accounting are particularly hard to fathom because the problem usually shows
up displaced in time and location (at replay, or on the other end of a
replication). Often the issue triggers only with large amounts of data. All
WAL*s, whatever the source, flow together out the one firehose. The
*assumption* is that a CP writer who randomly mangles a WAL* in a way that
happens to break the cluster's recovery may not be around, or game, for the
fun debug sessions required to figure out the root cause.

<joke>We could add a Geoffrey Jacoby annotation on CP APIs?</joke>

Thanks,
S





Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Geoffrey Jacoby <gj...@apache.org>.
For coprocessors in general, I think Andrew sums the issues up well. I just
want to add that it's really important to have a mental model of who the
audience for a coprocessor is when designing. I remember when coprocessors
were first introduced, I read articles comparing them to triggers and
stored procedures, and the docs still describe them that way in section
112.1.  But they're really not like triggers and SPs, because a
user-written stored procedure shouldn't be able to crash your database. :-)
Andrew's comparison to kernel extensions is apt I think.

Writing a Linux kernel extension doesn't require you to be Linus or one of
his lieutenants, but it does require more sophistication than just writing
a user-space Linux program does. Their entire purpose is for people who
aren't at the level of Linus to affect the kernel behavior in limited ways.
Likewise, with HBase coprocessors. They should provide safe enough
abstractions that they don't require an HBase committer to implement
safely, but I think it's fine to require more HBase knowledge than what's
required to use the client API.

You can't say "a mutable WALEdit is too dangerous" without making
assumptions about the developer it's too dangerous _for_. Ideally these
assumptions should be explicit and agreed upon by the community. I think
differing audience assumptions are where at least some of the disagreements
are coming from.

Geoffrey


On Fri, Aug 9, 2019 at 8:55 AM Andrew Purtell <an...@gmail.com>
wrote:

> The future of the coprocessor API is an interesting topic. I think we have
> a range of opinions in the community and I would like to hear more of them.
>
> Because of the compatibility headaches sometimes I’d like to rip them out.
> Sometimes they are essential for accomplishing something in Phoenix or our
> in house backup solution, for example. For me my opinion is very mixed.
>
> When first conceived the first use case was security and we took
> inspiration from OS kernel examples in Linux and TrustedBSD (now merged
> with FreeBSD) where upcall interfaces were made available where
> authoritative access control decisions are made. The set of hooks has
> expanded over time as users have requested extensions. This style of
> interface is powerful in that it enables a mixin approach to composition of
> functionality and extension. However in retrospect the maintenance burdens
> were not fully appreciated, at least by me. I had assumed we would be
> allowed more freedom to change but my experiences with Phoenix educated me
> on the kind of downstream headaches that result when we make those changes.
>
> If I were to do it again I would attempt an abstract and fluent interface
> where extensions would register intents and receive callbacks in a much
> more granular way, like:
>
>     onRegion().onRPC().onGet().then(...)
>
> I suppose this looks kind of like Mockito. There would be no overlarge
> interfaces full of upcall methods that break source compatibility on every
> change (in branch-1). Although we could not avoid the complexity of
> ensuring the right callbacks are invoked at the right places on the right
> code paths the extension interface and its types would be decoupled from
> internals and the kind of compatibility headaches we impose on
> downstreamers (and ourselves) would mostly disappear. This has been
> proposed on an old JIRA somewhere...
>
> Of course a redo like this would be a very complex and time-consuming
> project, and a port of, say, something like Phoenix would be a reboot of
> multi-man-year efforts, and as far as I know nobody working in the code
> today has that kind of sustained available time and attention. It’s too
> late. We have to make the best of the legacy of past engineering choices.
> If that is not correct then it would be a very pleasant surprise indeed.
>
> Given that we have the current model and we have downstreamers like
> Phoenix depending on them I think we are limited in the kind of clean ups
> we might like to do and need to be tolerant of the requirement to maintain
> these interfaces and their functionality across major versions. Deprecation
> is fine but if it is done without the input of the known consumer are we
> really being fair? Unclear.

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Andrew Purtell <an...@gmail.com>.
The future of the coprocessor API is an interesting topic. I think we have a range of opinions in the community and I would like to hear more of them. 

Because of the compatibility headaches, sometimes I’d like to rip them out. Sometimes they are essential for accomplishing something in Phoenix or our in-house backup solution, for example. My own opinion is very mixed.

When first conceived, the first use case was security, and we took inspiration from OS kernel examples in Linux and TrustedBSD (now merged with FreeBSD), where upcall interfaces were made available at the points where authoritative access control decisions are made. The set of hooks has expanded over time as users have requested extensions. This style of interface is powerful in that it enables a mixin approach to composition of functionality and extension. However, in retrospect, the maintenance burdens were not fully appreciated, at least by me. I had assumed we would be allowed more freedom to change, but my experiences with Phoenix educated me on the kind of downstream headaches that result when we make those changes.

If I were to do it again I would attempt an abstract and fluent interface where extensions would register intents and receive callbacks in a much more granular way, like:

    onRegion().onRPC().onGet().then(...)

I suppose this looks kind of like Mockito. There would be no overlarge interfaces full of upcall methods that break source compatibility on every change (in branch-1). Although we could not avoid the complexity of ensuring the right callbacks are invoked at the right places on the right code paths, the extension interface and its types would be decoupled from internals, and the kind of compatibility headaches we impose on downstreamers (and ourselves) would mostly disappear. This has been proposed on an old JIRA somewhere...
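
To make the shape of that idea concrete, here is a minimal Java sketch of a fluent registration chain. It is only an illustration under stated assumptions: ExtensionRegistry, GetEvent, the scope classes, and fireGet are all invented names, nothing like them exists in HBase, and a real design would need far more (ordering, error handling, bypass semantics).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch only: none of these types exist in HBase.

// A small value type handed to callbacks, decoupled from server internals.
final class GetEvent {
  final String table;
  final String row;

  GetEvent(String table, String row) {
    this.table = table;
    this.row = row;
  }
}

// Registry the host would own; extensions only see the fluent scopes below.
final class ExtensionRegistry {
  private final List<Consumer<GetEvent>> getCallbacks = new ArrayList<>();

  RegionScope onRegion() {
    return new RegionScope();
  }

  // Called by the host on its Get code path; extensions never call this.
  void fireGet(GetEvent event) {
    for (Consumer<GetEvent> callback : getCallbacks) {
      callback.accept(event);
    }
  }

  final class RegionScope {
    RpcScope onRPC() {
      return new RpcScope();
    }
  }

  final class RpcScope {
    GetScope onGet() {
      return new GetScope();
    }
  }

  final class GetScope {
    // Registers the extension's intent to be called back on Get requests.
    void then(Consumer<GetEvent> callback) {
      getCallbacks.add(callback);
    }
  }
}

final class FluentDemo {
  public static void main(String[] args) {
    ExtensionRegistry registry = new ExtensionRegistry();
    List<String> seen = new ArrayList<>();

    // The extension registers one narrow callback instead of implementing a wide interface.
    registry.onRegion().onRPC().onGet().then(e -> seen.add(e.table + "/" + e.row));

    // The host fires the event when a Get arrives.
    registry.fireGet(new GetEvent("t1", "row-1"));
    System.out.println(seen);
  }
}
```

In a design like this, adding a new hook means adding a new scope method rather than a new method on an interface that every extension implements, which is where the source compatibility relief would come from.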

Of course a redo like this would be a very complex and time-consuming project, and a port of, say, something like Phoenix would be a reboot of multi-man-year efforts, and as far as I know nobody working in the code today has that kind of sustained available time and attention. It’s too late. We have to make the best of the legacy of past engineering choices. If that is not correct then it would be a very pleasant surprise indeed.

Given that we have the current model and we have downstreamers like Phoenix depending on these interfaces, I think we are limited in the kind of clean ups we might like to do and need to be tolerant of the requirement to maintain these interfaces and their functionality across major versions. Deprecation is fine, but if it is done without the input of the known consumer, are we really being fair? Unclear.


> On Aug 8, 2019, at 6:39 PM, 张铎(Duo Zhang) <pa...@gmail.com> wrote:
> 
> When releasing 2.0.0 we faced a lot of problems because we exposed so many
> internal classes to CPs. It is really hard to consider both compatibility
> and development on HBase. So we have done a lot of work
> to abstract interfaces for CPs to use and to keep the actual implementation
> classes HBase-only.
> 
> The work is not fully done, so we still left some methods which expose
> internal classes there with a deprecated annotation; I think most of them
> are for Phoenix. This is a trade-off and I think it is also acceptable.
> Phoenix could still use the deprecated methods, and we will not remove them
> unless we find an alternate solution. And I think if anyone wants to remove
> them without replacement you will first jump out and give a -1 :)
> 
> Specific to HBASE-22623, I still think we should add the deprecated
> annotation to keep the API consistent, otherwise users will be confused
> about whether they can use the WALEdit. And on the abstraction of WALEdit, I
> think we used to rely on a high level abstraction in HBASE-20952. But now
> since there is little progress there, I think we can start the work of
> abstracting WALEdit only.
> 
> Thanks.

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by "张铎 (Duo Zhang)" <pa...@gmail.com>.
My point is that we should not confuse users. And if there are lots of
other methods which also use WALEdit but are not deprecated, then maybe we
should remove the deprecated annotation on the preWALWrite method, so we
can keep the API consistent.

And I think the root cause here is that we do not have an interface for
WALEdit; we should start to work on this.
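
A minimal sketch of what such an interface could look like, assuming a read-only view is what coprocessors need. All names here (WalEditView, SimpleWalEdit, WalObserverSketch) are hypothetical and are not HBase classes, and String stands in for HBase's cell type to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: these are invented names, not HBase classes.

// The only type a coprocessor hook would see: a read-only view of an edit.
interface WalEditView {
  // Unmodifiable list of the cells in this edit.
  List<String> cells();

  boolean isEmpty();
}

// The implementation class stays internal; only it can mutate the edit.
final class SimpleWalEdit implements WalEditView {
  private final List<String> cells = new ArrayList<>();

  // Internal-only mutator, not reachable through WalEditView.
  void add(String cell) {
    cells.add(cell);
  }

  @Override
  public List<String> cells() {
    return Collections.unmodifiableList(cells);
  }

  @Override
  public boolean isEmpty() {
    return cells.isEmpty();
  }
}

// A hook declared against the interface rather than the implementation class.
// The method name mirrors the preWALWrite hook mentioned above for illustration only.
interface WalObserverSketch {
  void preWALWrite(WalEditView edit);
}
```

With hooks typed against the view, the question of whether coprocessors may use the implementation class no longer depends on a deprecation annotation. Whether a mutable variant should also be offered is exactly the open question in this thread, and the sketch does not settle it.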

On Fri, Aug 9, 2019 at 1:26 PM Geoffrey Jacoby <gj...@apache.org> wrote:

> There are a bunch of issues raised here, some micro and some macro. I'll
> tackle them in two separate messages so they're short(ish) enough for
> people to bother reading. :-) For this one I wear my "HBase contributor"
> hat.
>
> First the micro, which is my patch on HBASE-22623 which sparked a lot of
> passionate disagreement. I want to start by saying that I really appreciate
> the work that Duo and many others have done to improve the abstraction and
> cleanliness of the code base -- implementing a change to the write path in
> both branch-1 and master, you really feel the improvement in the later
> code.
>
> In a few places though, that cleanup left contradictions. One of them is
> WALEdit, which is LimitedPrivate for coprocessors and replication, and is
> already exposed in about 10 non-deprecated coprocessor hooks between at
> least two interfaces. But the comments say it should never be exposed to
> coprocs (I _think_ it means "never before WAL append", but that's not what
> it says), and the add methods are marked IA.Private.
>
> I think the fundamental mistake here is to have comments which contradict
> the IA annotation, and to enforce the comments over the IA contract. If a
> class is LimitedPrivate for coprocs and replication, then it's part of the
> public API for those components, and should be safe to consume. If the
> community later decides it's not safe, the IA can be changed and _all_
> coproc + replication methods using WALEdit can be deprecated in favor of
> some new safer interface. As I believe was once done for HLogKey vs WALKey.
> But that hasn't (yet anyway) been done for WALEdit, and patches which honor
> the IA contract, as I tried to do, shouldn't be rejected because they
> violate a private understanding.
>
> I'm (non-bindingly) strongly against any proposal to create an API which is
> deprecated on the moment of its birth, as has been proposed above. That
> seems nonsensical to me, since a deprecation means "don't use this", so
> what's the point? Slipping my Phoenix hat on, I don't want Phoenix to be a
> trespasser tolerated on sufferance, but based on good, clear abstractions
> and APIs negotiated between the two communities, with those APIs open to
> other projects as well.
>
> But more on that in the next one, which I'll write in the (Pacific time)
> morning.
>
> Geoffrey

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Geoffrey Jacoby <gj...@apache.org>.
There are a bunch of issues raised here, some micro and some macro. I'll
tackle them in two separate messages so they're short(ish) enough for
people to bother reading. :-) For this one I wear my "HBase contributor"
hat.

First the micro: my patch on HBASE-22623, which sparked a lot of
passionate disagreement. I want to start by saying that I really appreciate
the work that Duo and many others have done to improve the abstraction and
cleanliness of the code base -- having implemented a change to the write
path in both branch-1 and master, I really felt the improvement in the
newer code.

In a few places, though, that cleanup left contradictions. One of them is
WALEdit, which is LimitedPrivate for coprocessors and replication, and is
already exposed in about 10 non-deprecated coprocessor hooks across at
least two interfaces. But the comments say it should never be exposed to
coprocs (I _think_ they mean "never before WAL append", but that's not what
they say), and the add methods are marked IA.Private.

I think the fundamental mistake here is to have comments which contradict
the IA annotation, and to enforce the comments over the IA contract. If a
class is LimitedPrivate for coprocs and replication, then it's part of the
public API for those components, and should be safe to consume. If the
community later decides it's not safe, the IA can be changed and _all_
coproc + replication methods using WALEdit can be deprecated in favor of
some new, safer interface, as I believe was once done for HLogKey vs
WALKey. But that hasn't (yet, anyway) been done for WALEdit, and patches
which honor the IA contract, as I tried to do, shouldn't be rejected
because they violate a private understanding.
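
To make the contradiction described above concrete, here is a minimal
sketch. It is not HBase code: the LimitedPrivate and Private annotations
below are hypothetical local stand-ins for HBase's InterfaceAudience
annotations, and EditLike and AudienceAudit are invented names. It only
shows the shape of the problem, a type offered to coprocessors whose
mutator is marked Private, and how such a mismatch could be spotted.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for the audience annotations, defined locally so
// the sketch compiles without HBase on the classpath.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface LimitedPrivate {
  String[] value();
}

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Private {
}

// Hypothetical class shaped like the situation in the message: the type is
// offered to coprocessors and replication, yet its add method is Private.
@LimitedPrivate({"Coprocessor", "Replication"})
class EditLike {
  private final List<String> cells = new ArrayList<>();

  @Private
  EditLike add(String cell) {
    cells.add(cell);
    return this;
  }

  int size() {
    return cells.size();
  }
}

class AudienceAudit {
  // Returns the sorted names of methods marked Private that are declared
  // on a type which is itself marked LimitedPrivate.
  static List<String> privateMethodsOfExposedType(Class<?> type) {
    List<String> names = new ArrayList<>();
    if (!type.isAnnotationPresent(LimitedPrivate.class)) {
      return names;
    }
    for (Method m : type.getDeclaredMethods()) {
      if (m.isAnnotationPresent(Private.class)) {
        names.add(m.getName());
      }
    }
    Collections.sort(names);
    return names;
  }

  public static void main(String[] args) {
    System.out.println(privateMethodsOfExposedType(EditLike.class));
  }
}
```

A consumer reading only the class-level annotation would reasonably treat
EditLike as usable; the audit lists the methods where that reading and the
method-level annotation disagree.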

I'm (non-bindingly) strongly against any proposal to create an API which is
deprecated from the moment of its birth, as has been proposed above. That
seems nonsensical to me, since a deprecation means "don't use this", so
what's the point? Slipping my Phoenix hat on, I don't want Phoenix to be a
trespasser tolerated on sufferance. I want it to build on good, clear
abstractions and APIs negotiated between the two communities, with those
APIs open to other projects as well.

But more on that in the next one, which I'll write in the (Pacific time)
morning.

Geoffrey

Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by "张铎 (Duo Zhang)" <pa...@gmail.com>.
When releasing 2.0.0 we faced a lot of problems because we had exposed so
many internal classes to CPs. It is really hard to maintain compatibility
and keep developing HBase at the same time. So we did a lot of work to
abstract interfaces for CPs to use and to keep the actual implementation
classes internal to HBase.

The work is not fully done, so we still have some methods which expose
internal classes, marked with a deprecated annotation. I think most of them
are there for Phoenix. This is a trade-off, and I think it is acceptable.
Phoenix can still use the deprecated methods, and we will not remove them
unless we find an alternative solution. And I think if anyone wants to
remove them without a replacement, you will be the first to jump in and
give a -1 :)

Specific to HBASE-22623, I still think we should add the deprecated
annotation to keep the API consistent; otherwise users will be confused
about whether they can use WALEdit. As for the abstraction of WALEdit, I
think we were counting on the higher-level abstraction in HBASE-20952. But
since there is little progress there, I think we can start the work of
abstracting just WALEdit.
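
As a rough illustration of what "abstracting just WALEdit" could mean, the
sketch below gives hooks a narrow read-only interface while the mutable
implementation stays internal. All names here (EditView, InternalEdit,
ViewDemo) are hypothetical and chosen for illustration; this is not the
design from HBASE-20952 or any existing HBase interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical narrow view that coprocessor hooks would receive.
interface EditView {
  int size();

  List<String> cells();
}

// Hypothetical internal implementation; the mutator is deliberately kept
// off the interface so hook code cannot reach it through EditView.
class InternalEdit implements EditView {
  private final List<String> cells = new ArrayList<>();

  InternalEdit add(String cell) {
    cells.add(cell);
    return this;
  }

  @Override
  public int size() {
    return cells.size();
  }

  @Override
  public List<String> cells() {
    // Hand out a read-only wrapper rather than the backing list.
    return Collections.unmodifiableList(cells);
  }
}

class ViewDemo {
  // Stand-in for a coprocessor hook: it only sees the read-only view.
  static String describe(EditView edit) {
    return edit.size() + " cell(s): " + edit.cells();
  }

  public static void main(String[] args) {
    InternalEdit edit = new InternalEdit().add("row1/cf:q").add("row2/cf:q");
    System.out.println(describe(edit));
  }
}
```

With this shape the implementation class can change freely, and the
compatibility promise covers only the small interface.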

Thanks.


Re: Coprocessors, clean ups, compatibility, deprecations, Phoenix... it's a bit of a mess

Posted by Stack <st...@duboce.net>.
Thanks for raising this topic Andrew and for the judicious framing of
opinions (including the check that 'We have to make the best of the legacy
of past engineering choices.').

As to our current CP 'dilemma', after catching up on the issue and seeing
the back and forth here, I am confident we can figure out a path forward,
given the evident respect and understanding for the positions of others and
the apologies given freely where there was misunderstanding.

@Geoffrey: Thanks for taking the time to study the CP API offering, for
turning up contradictions, and for identifying a mistake (where comments
would seem to carry more weight than the API annotation). The comments are
mine. They acknowledge that there is a problem with the class annotation
and with WALEdit. As is, it is not fit for CP exposure. The comments point
out the annotation gymnastics that try to make it clear that the setters
are off-limits to CPs. The WALEdit refactor -- the Interface that Duo
suggests -- should have been done as part of the necessary Duo-led CP API
cleanup, but I punted on it as being too disruptive at the time (WALEdit
itself needs work).

S

