Posted to dev@hbase.apache.org by Stack <st...@duboce.net> on 2016/04/13 01:17:30 UTC

[DISCUSS] Shade protobuf so we can move to a newer version

I opened HBASE-15638 Shade protobuf but should probably raise the larger
intent here. You fellows might have an opinion (smile).

We need to shade PB so we can move to a different version.

We need to move to a different version because we want to save on buffer
copies -- newer versions of PB have some 'unsafe' facility so we can save
on copies before serializing -- but we also want to be able to keep our
data off heap always. Currently, our PB version (2.5.0) expects byte
arrays. Later versions of PB have some support for ByteBuffer handling,
but even then the BBs are expected to be onheap, so we can't pass
DirectByteBuffers (TODO: add support to PB to work w/ DBBs -- and BBs w/o
copy -- and push these changes upstream).
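
To illustrate the byte-array limitation described above, here is a minimal
Java sketch using only java.nio. It shows the copy that is needed when data
lives in a direct (off-heap) buffer but the serializer only accepts byte[].
The class and method names are hypothetical, made up for illustration; they
are not HBase or protobuf APIs.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of HBase or protobuf.
class DirectBufferCopy {

    // Copies the bytes between position and limit into a fresh byte[],
    // which is the form a byte-array-only serializer needs.
    public static byte[] toByteArray(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer is left untouched by the read.
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer onHeap = ByteBuffer.wrap(new byte[] {1, 2, 3});

        ByteBuffer offHeap = ByteBuffer.allocateDirect(3);
        offHeap.put(new byte[] {1, 2, 3});
        offHeap.flip();

        // A wrapped array can be handed over as-is via array(); a direct
        // buffer generally exposes no backing array, so it must be copied.
        System.out.println("on-heap hasArray:  " + onHeap.hasArray());
        System.out.println("off-heap hasArray: " + offHeap.hasArray());
        System.out.println("bytes copied from off-heap: " + toByteArray(offHeap).length);
    }
}
```

The allocation and copy in toByteArray are the overhead that off-heap-aware
serialization would avoid.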

On an initial pass, the only difficult part seems to be interaction with
HDFS in asyncwal (might just pull in the HDFS messages).

The plan was to remove all references to protobuf and just have all modules
depend on our hbase-protocol module. We'd bundle our PB in
hbase-protocol.*.jar with packages relocated to
org.apache.hbase.shaded.com.google.protobuf. This way, our shading mess is
somewhat contained.
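
As a rough sketch of what the package offset means for class names, the Java
snippet below rewrites a com.google.protobuf class name to the proposed
shaded prefix. The class and method are hypothetical and only mimic the
renaming; this is not how a shading tool is implemented, and it assumes the
prefix named in this mail.

```java
// Hypothetical helper; illustrates the renaming only, not a real shading tool.
class ShadedNames {
    static final String ORIGINAL_PREFIX = "com.google.protobuf";
    // Prefix taken from the proposal above.
    static final String SHADED_PREFIX = "org.apache.hbase.shaded.com.google.protobuf";

    // Returns the relocated name for protobuf classes and leaves
    // every other class name unchanged.
    public static String relocate(String className) {
        boolean inPackage = className.equals(ORIGINAL_PREFIX)
                || className.startsWith(ORIGINAL_PREFIX + ".");
        if (!inPackage) {
            return className;
        }
        return SHADED_PREFIX + className.substring(ORIGINAL_PREFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(relocate("com.google.protobuf.Message"));
        System.out.println(relocate("org.apache.hadoop.hbase.client.Get"));
    }
}
```

Code compiled against the relocated name and code compiled against the
original name see two distinct classes, which is why the HDFS messages in
asyncwal need separate handling.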

Suggestions, opinions, all welcome.

This project is part of Ram and Anoop's effort to offheap the read path and
generally save on allocations.

St.Ack

Re: [DISCUSS] Shade protobuf so we can move to a newer version

Posted by Elliott Clark <ec...@apache.org>.
Yeah, I think it's a good idea; however, we need to remove PB from all our
user-facing APIs first.
There are some brave souls working on it here:
https://issues.apache.org/jira/browse/HBASE-15174


Re: [DISCUSS] Shade protobuf so we can move to a newer version

Posted by Stack <st...@duboce.net>.
This project goes on. I updated HBASE-15638 "Shade protobuf" with some doc
on a final approach. We need to be able to refer to both shaded and
non-shaded protobuf so we can support sending HDFS old-school pb Messages
but also so Coprocessor Endpoints keep working though internally protobufs
have been relocated. Funny you should ask, but yes, there are some
downsides (as predicted by contributors on the JIRA). I'd be interested to
hear if they are too burdensome. In particular, your IDE experience gets a
little convoluted, as you will need to add a jar with the relocated pbs to
your build path. A pain.

Thanks,
St.Ack


On Wed, Apr 13, 2016 at 6:09 AM, Stack <st...@duboce.net> wrote:

> On Tue, Apr 12, 2016 at 9:26 PM, Sean Busbey <bu...@apache.org> wrote:
>
>> On Tue, Apr 12, 2016 at 6:17 PM, Stack <st...@duboce.net> wrote:
>> >
>> >
>> > On an initial pass, the only difficult part seems to be interaction with
>> > HDFS in asyncwal (might just pull in the HDFS messages).
>> >
>> >
>>
>> I have some idea how we can make this work either by pushing asyncwal
>> upstream to HDFS or through some maven tricks, depending on how much
>> time we have.
>>
>
> Maven tricks? Tell us more. Here or drop a note up in the issue.
> Thanks Sean,
> St.Ack
>

Re: [DISCUSS] Shade protobuf so we can move to a newer version

Posted by Stack <st...@duboce.net>.
On Tue, Apr 12, 2016 at 9:26 PM, Sean Busbey <bu...@apache.org> wrote:

> On Tue, Apr 12, 2016 at 6:17 PM, Stack <st...@duboce.net> wrote:
> >
> >
> > On an initial pass, the only difficult part seems to be interaction with
> > HDFS in asyncwal (might just pull in the HDFS messages).
> >
> >
>
> I have some idea how we can make this work either by pushing asyncwal
> upstream to HDFS or through some maven tricks, depending on how much
> time we have.
>

Maven tricks? Tell us more. Here or drop a note up in the issue.
Thanks Sean,
St.Ack

Re: [DISCUSS] Shade protobuf so we can move to a newer version

Posted by 张铎 <pa...@gmail.com>.
I think we could do it with some maven tricks. Just do not relocate the
protobuf things in asyncwal? Move the related classes to a new sub project?

Thanks.

2016-04-13 12:26 GMT+08:00 Sean Busbey <bu...@apache.org>:

> On Tue, Apr 12, 2016 at 6:17 PM, Stack <st...@duboce.net> wrote:
> >
> >
> > On an initial pass, the only difficult part seems to be interaction with
> > HDFS in asyncwal (might just pull in the HDFS messages).
> >
> >
>
> I have some idea how we can make this work either by pushing asyncwal
> upstream to HDFS or through some maven tricks, depending on how much
> time we have.
>

Re: [DISCUSS] Shade protobuf so we can move to a newer version

Posted by Sean Busbey <bu...@apache.org>.
On Tue, Apr 12, 2016 at 6:17 PM, Stack <st...@duboce.net> wrote:
>
>
> On an initial pass, the only difficult part seems to be interaction with
> HDFS in asyncwal (might just pull in the HDFS messages).
>
>

I have some idea how we can make this work either by pushing asyncwal
upstream to HDFS or through some maven tricks, depending on how much
time we have.