You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by Umesh Agashe <ua...@cloudera.com> on 2017/09/29 20:18:14 UTC

[DISCUSS] deprecating o.a.h.h.regionserver.RowProcessor

Hi,

Currently Region.processRowsWithLocks() API takes
o.a.h.h.regionserver.RowProcessor as an argument and only implementation of
this class is MultiRowMutationProcessor. This implementation is internal
and used from HRegion.mutateRows...() methods.

HRegion.processRowsWithLocks() implementation, doesn't call coprocessor
hooks but instead calls RowProcessor hooks at appropriate point in
execution. Many of these hooks/ methods have same names and are called at
similar points during the course of execution but they are not related!

HRegion.batchMutate() methods call coprocessor hooks but not row
RowProcessor hooks.

Internal implementation MultiRowMutationProcessor, call coprocessor hooks
from inside it's own methods/ hooks. But this can not be expected of all
implementations for RowProcessors.

In case of HRegion.batchMutate...() methods, CP mutations are merged with
input mutations and these merged mutations are applied to WALEdit fetched
from CPs.

In case of processRowsWithLocks(), mutations are fetched from RowProcessor
instance and are applied on WALEdit built by RowProcessor.

The major inconsistency here is, one code path uses coprocessors while
other uses RowProcessor. There are other minor inconsistencies along those
two code paths.

Proposed fix:

* Unify two code paths.
* Deprecate RowProcessor and API Region.processRowsWithLocks() that takes
RowProcessor as an argument.
* Provide alternate API that doesn't take RowProcessor.
* Modify batchMutate...() to take additional arguments: rowsToLock
(byte[][]) and atomic/ allOrNone (boolean).
* Remove MultiRowMutationProcessor. Make HRegion.mutateRows() methods to
use batchMutate().
* Make new implementation of Region.processRowsWithLocks() which doesn't
take RowProcessor as an argument use batchMutate().

Suggestion is that coprocessors can be used to do things RowProcessors are
doing.

Related JIRAs: HBASE-18703, HBASE-18183

Let me know your thoughts.

Thanks,
Umesh

Re: [DISCUSS] deprecating o.a.h.h.regionserver.RowProcessor

Posted by Andrew Purtell <an...@gmail.com>.
Looks like it. Two arrivals on parallel tracks. I never noticed this. I guess I’ve never really looked at RowProcessor. I’m glad we are going to clean this up. 


> On Sep 30, 2017, at 1:59 PM, Stack <st...@duboce.net> wrote:
> 
>> On Fri, Sep 29, 2017 at 1:18 PM, Umesh Agashe <ua...@cloudera.com> wrote:
>> 
>> Hi,
>> 
>> Currently Region.processRowsWithLocks() API takes
>> o.a.h.h.regionserver.RowProcessor as an argument and only implementation
>> of
>> this class is MultiRowMutationProcessor. This implementation is internal
>> and used from HRegion.mutateRows...() methods.
>> 
>> HRegion.processRowsWithLocks() implementation, doesn't call coprocessor
>> hooks but instead calls RowProcessor hooks at appropriate point in
>> execution. Many of these hooks/ methods have same names and are called at
>> similar points during the course of execution but they are not related!
>> 
>> 
> Confusing!
> 
> 
> 
>> HRegion.batchMutate() methods call coprocessor hooks but not row
>> RowProcessor hooks.
>> 
>> 
> Doubly confounding!
> 
> 
> 
>> Internal implementation MultiRowMutationProcessor, call coprocessor hooks
>> from inside it's own methods/ hooks. But this can not be expected of all
>> implementations for RowProcessors.
>> 
>> 
> 
> 
> 
>> In case of HRegion.batchMutate...() methods, CP mutations are merged with
>> input mutations and these merged mutations are applied to WALEdit fetched
>> from CPs.
>> 
>> In case of processRowsWithLocks(), mutations are fetched from RowProcessor
>> instance and are applied on WALEdit built by RowProcessor.
>> 
>> The major inconsistency here is, one code path uses coprocessors while
>> other uses RowProcessor. There are other minor inconsistencies along those
>> two code paths.
>> 
>> 
> RowProcessor seems to have arrived after Coprocessor.
> 
> commit ce36877d30efb21a6c62a72c47cf51b442576fda
> Author: Zhihong Yu <te...@apache.org>
> Date:   Wed Mar 21 18:25:18 2012 +0000
> 
>    HBASE-5542 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
> (Scott Chen)
> 
> Coprocessor main classes show up here:
> 
> commit 7299a72715f0ef1b05e478bee2e60d8f26fc2c24
> Author: Andrew Kyle Purtell <ap...@apache.org>
> Date:   Sat Nov 20 01:23:39 2010 +0000
> 
>    HBASE-2001 Coprocessors: Colocate user code with regions
> 
> Maybe the overlap was not considered back then?
> 
> 
> 
>> Proposed fix:
>> 
>> * Unify two code paths.
>> 
> 
> +1
> 
> 
>> * Deprecate RowProcessor and API Region.processRowsWithLocks() that takes
>> RowProcessor as an argument.
>> * Provide alternate API that doesn't take RowProcessor.
>> * Modify batchMutate...() to take additional arguments: rowsToLock
>> (byte[][]) and atomic/ allOrNone (boolean).
>> * Remove MultiRowMutationProcessor. Make HRegion.mutateRows() methods to
>> use batchMutate().
>> * Make new implementation of Region.processRowsWithLocks() which doesn't
>> take RowProcessor as an argument use batchMutate().
>> 
>> Suggestion is that coprocessors can be used to do things RowProcessors are
>> doing.
>> 
>> 
> Seems right to me.
> 
> Speak up if you have a reason for why RowProcessors should stick around. I
> see MultiRowMutationEndpoint for doing atomic x-row mutations on meta but
> is implemented otherwise. Otherwise, looking at HBASE-5542 linked issues,
> we have all the facility referenced implemented other than via RowProcessor
> (I think -- would need to dig more).
> 
> 
> Thanks Umesh,
> St.Ack
> 
> 
> 
>> Related JIRAs: HBASE-18703, HBASE-18183
>> 
>> Let me know your thoughts.
>> 
>> Thanks,
>> Umesh
>> 

Re: [DISCUSS] deprecating o.a.h.h.regionserver.RowProcessor

Posted by Stack <st...@duboce.net>.
On Fri, Sep 29, 2017 at 1:18 PM, Umesh Agashe <ua...@cloudera.com> wrote:

> Hi,
>
> Currently Region.processRowsWithLocks() API takes
> o.a.h.h.regionserver.RowProcessor as an argument and only implementation
> of
> this class is MultiRowMutationProcessor. This implementation is internal
> and used from HRegion.mutateRows...() methods.
>
> HRegion.processRowsWithLocks() implementation, doesn't call coprocessor
> hooks but instead calls RowProcessor hooks at appropriate point in
> execution. Many of these hooks/ methods have same names and are called at
> similar points during the course of execution but they are not related!
>
>
Confusing!



> HRegion.batchMutate() methods call coprocessor hooks but not row
> RowProcessor hooks.
>
>
Doubly confounding!



> Internal implementation MultiRowMutationProcessor, call coprocessor hooks
> from inside it's own methods/ hooks. But this can not be expected of all
> implementations for RowProcessors.
>
>



> In case of HRegion.batchMutate...() methods, CP mutations are merged with
> input mutations and these merged mutations are applied to WALEdit fetched
> from CPs.
>
> In case of processRowsWithLocks(), mutations are fetched from RowProcessor
> instance and are applied on WALEdit built by RowProcessor.
>
> The major inconsistency here is, one code path uses coprocessors while
> other uses RowProcessor. There are other minor inconsistencies along those
> two code paths.
>
>
RowProcessor seems to have arrived after Coprocessor.

commit ce36877d30efb21a6c62a72c47cf51b442576fda
Author: Zhihong Yu <te...@apache.org>
Date:   Wed Mar 21 18:25:18 2012 +0000

    HBASE-5542 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
(Scott Chen)

Coprocessor main classes show up here:

commit 7299a72715f0ef1b05e478bee2e60d8f26fc2c24
Author: Andrew Kyle Purtell <ap...@apache.org>
Date:   Sat Nov 20 01:23:39 2010 +0000

    HBASE-2001 Coprocessors: Colocate user code with regions

Maybe the overlap was not considered back then?



> Proposed fix:
>
> * Unify two code paths.
>

+1


> * Deprecate RowProcessor and API Region.processRowsWithLocks() that takes
> RowProcessor as an argument.
> * Provide alternate API that doesn't take RowProcessor.
> * Modify batchMutate...() to take additional arguments: rowsToLock
> (byte[][]) and atomic/ allOrNone (boolean).
> * Remove MultiRowMutationProcessor. Make HRegion.mutateRows() methods to
> use batchMutate().
> * Make new implementation of Region.processRowsWithLocks() which doesn't
> take RowProcessor as an argument use batchMutate().
>
> Suggestion is that coprocessors can be used to do things RowProcessors are
> doing.
>
>
Seems right to me.

Speak up if you have a reason for why RowProcessors should stick around. I
see MultiRowMutationEndpoint for doing atomic x-row mutations on meta but
is implemented otherwise. Otherwise, looking at HBASE-5542 linked issues,
we have all the facility referenced implemented other than via RowProcessor
(I think -- would need to dig more).


Thanks Umesh,
St.Ack



> Related JIRAs: HBASE-18703, HBASE-18183
>
> Let me know your thoughts.
>
> Thanks,
> Umesh
>