Posted to dev@hbase.apache.org by Stack <st...@duboce.net> on 2016/10/01 20:10:09 UTC

[DISCUSS] More Shading

HBASE-15638 is about shading protobufs. Let's shade other critical libs
too so we can run with the versions of the libraries we favor rather than
versions dictated by our dependencies.

For example, our guava is from the stone ages. Guava is a quality library
that we should be making use of throughout our code base. We are afraid to
update it because it will break when we share our CLASSPATH with another
component or when a dependency of ours transitively includes a conflicting
version. Worse, there have been incidents where we had to undo Guava usage
because of CLASSPATH clashes (Running a recent HBase with a recent version
of Drill broke on a Guava StopWatch import...).

That we should shade critical, popular, core libs seems self-evident
(though I would be interested if folks have other opinions). What I want to
discuss though is how we go about it.

The HBASE-15638 (protobuf shading) approach has us reference the relocated
artifact explicitly. This makes for an ugly ripple across the codebase as
we declare which protobuf Message is intended: either
com.google.protobuf.Message or
org.apache.hadoop.hbase.shaded.com.google.protobuf.Message. It is a pain
making all the changes but the intent is clear. I was thinking we'd do
similar for guava and whatever else we think fits this category. I'd make
an hbase-3rdparty-shaded module or some such and do all the hackery
therein.
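
For illustration, here is a minimal sketch of what a call site looks like
once it names the relocated artifact (the class and method below are
illustrative, not actual HBase code; the shaded package prefix is the one
HBASE-15638 uses):

    // Hypothetical example: the import names the shaded package, not
    // com.google.protobuf.
    import org.apache.hadoop.hbase.shaded.com.google.protobuf.Message;

    public class ShadedPBExample {
      // Accepts only the relocated Message; a plain
      // com.google.protobuf.Message elsewhere on the CLASSPATH is a
      // different, incompatible type.
      static int serializedSize(Message msg) {
        return msg.getSerializedSize();
      }
    }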

Where it gets awkward is whether or not we check in the shaded artifact
source code (Over in HBASE-15638, we have checked in the relocated
protobuf3 source code because we are going to patch it, for a while at
least). For the build and runtime to work, the relocated source code does
not need to be present, but its absence is a hurdle for devs who use IDEs
(Everyone but Sean and Matteo). Their code will be flagged w/ errors saying
the relocated artifact is missing/unresolvable. To 'fix' this, they need to
build the shaded module and then, in their IDE, drop the shaded module and
add the built shaded module jar to the IDE's build-time CLASSPATH.

This is awkward. Is this too much to ask of devs, especially those getting
going for the first time? I could do up doc and IDE configs to help but
this would be an added hurdle in getting set up.

Sean has suggested a pre-build step where, in another repo, we'd make
hbase-shaded versions of critical libs, 'release' them (votes, etc.) and
then have core depend on these. It'd be a bunch of work but would make the
dev's life easier.

Interested in any thoughts you lot might have.

Thanks,
St.Ack

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
On Sat, Oct 1, 2016 at 2:33 PM, Andrew Purtell <an...@gmail.com>
wrote:

> > Sean has suggested a pre-build step where, in another repo, we'd make
> > hbase-shaded versions of critical libs, 'release' them (votes, etc.) and
> > then have core depend on these. It'd be a bunch of work but would make
> > the dev's life easier.
>
> So when we make changes that require updates to and rebuild of the
> supporting libraries, as a developer I would make local changes, install a
> snapshot of that into the local maven cache, then point the HBase build at
> the snapshot, then do the other half of the work, then push up to both?
>
> I think this could work.


That sounds about right.
St.Ack
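
For reference, a hedged sketch of the dependency hbase core would declare
on such a pre-built artifact (or on a locally installed SNAPSHOT of it, per
the workflow above); the coordinates are illustrative, patterned on the
module names floated later in this thread:

    <!-- Hypothetical coordinates; the thread later publishes artifacts
         under org/apache/hbase/thirdparty in the snapshots repository. -->
    <dependency>
      <groupId>org.apache.hbase.thirdparty</groupId>
      <artifactId>hbase-shaded-thirdparty</artifactId>
      <version>1.0.0-SNAPSHOT</version>
    </dependency>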

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
On Thu, Jul 6, 2017 at 11:17 PM, Stack <st...@duboce.net> wrote:

> On Thu, Jul 6, 2017 at 11:52 AM, Stack <st...@duboce.net> wrote:
>
>> FYI:
>>
>> hbase-thirdparty has had its first release. Yesterday I
>> committed HBASE-17056 "Remove checked in PB generated files" which, apart
>> from purging all checked-in generated files (30MB), moves our hbase core
>> (master and branch-2) to using the thirdparty jar.
>>
>> Things might be interesting over the next few days so shout if you run
>> into issues.
>>
>>
> FYI, I reverted HBASE-17056 for the moment. The build seems unstable:
> OOMEs and other interesting issues, along w/ some awkwardness w/
> dependencies. Let me spend some more time on it. Will try again later.
>
>

Just a heads-up. With Chia-Ping Tsai and Guanghao Zhang's help, I was able
to figure out a few odd issues in the previous push. I just pushed again.
Kick me if you see OOMEs or weirdness... I'll be keeping an eye out myself
too.

Thanks for your patience.

St.Ack




> [Remainder of quoted messages, including the full Maven compile-error
> log, snipped; they appear in full later in this archive.]

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
On Thu, Jul 6, 2017 at 11:52 AM, Stack <st...@duboce.net> wrote:

> FYI:
>
> hbase-thirdparty has had its first release. Yesterday I
> committed HBASE-17056 "Remove checked in PB generated files" which, apart
> from purging all checked-in generated files (30MB), moves our hbase core
> (master and branch-2) to using the thirdparty jar.
>
> Things might be interesting over the next few days so shout if you run
> into issues.
>
>
FYI, I reverted HBASE-17056 for the moment. The build seems unstable: OOMEs
and other interesting issues, along w/ some awkwardness w/ dependencies.
Let me spend some more time on it. Will try again later.

St.Ack



> [Remainder of quoted message, including the full Maven compile-error
> log, snipped; the original message appears in full below.]

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
FYI:

hbase-thirdparty has had its first release. Yesterday I
committed HBASE-17056 "Remove checked in PB generated files" which, apart
from purging all checked-in generated files (30MB), moves our hbase core
(master and branch-2) to using the thirdparty jar.

Things might be interesting over the next few days so shout if you run into
issues.

One issue is the need for mvn install where before mvn compile might have
been enough (see below for an example of the issue you'd see now if you did
mvn clean compile only). We need the mvn install because we need to shade
the generated files so they use the relocated protobuf; the shade happens
after we've made a module jar.
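
The reason compile alone no longer suffices: the shade plugin is bound to
the package phase, so the relocated classes only exist once a module jar
has been built and installed. A minimal sketch of such a binding
(illustrative, not the exact HBase pom):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <!-- shade runs at package time, after the module jar exists;
               mvn compile alone never produces the relocated classes -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>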

IDEs will complain too if they pick up generated src from target dirs
since there'll be unsatisfied references to protobuf3 objects -- unless you
point at the shaded jar. Let me see if I can do something about the latter.

Let me know if HBASE-17056 is too much for devs to bear. We can always
revert (though it is nice having the protobuf generation in-line w/ the
build). It is sort of a side benefit of the general shading project, so we
could back it out and still have the shading of netty, guava, protobuf,
etc.

Thanks,

St.Ack

[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-compiler-plugin:3.6.1:compile
(default-compile) on project hbase-procedure: Compilation failure:
Compilation failure:
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/ProcedureWALFormat.java:[105,11]
cannot access com.google.protobuf.GeneratedMessageV3
[ERROR] class file for com.google.protobuf.GeneratedMessageV3 not found
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/ProcedureWALFormat.java:[129,7]
cannot access com.google.protobuf.GeneratedMessageV3.Builder
[ERROR] class file for com.google.protobuf.GeneratedMessageV3$Builder not
found
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/ProcedureWALFormat.java:[133,22]
cannot find symbol
[ERROR] symbol:   method
writeDelimitedTo(org.apache.hadoop.fs.FSDataOutputStream)
[ERROR] location: class
org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.ProcedureStoreTracker
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/ProcedureWALFormat.java:[217,20]
cannot find symbol
[ERROR] symbol:   method
writeDelimitedTo(org.apache.hadoop.hbase.procedure2.util.ByteSlot)
[ERROR] location: class
org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.ProcedureWALEntry
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/ProcedureWALFormat.java:[240,20]
cannot find symbol
[ERROR] symbol:   method
writeDelimitedTo(org.apache.hadoop.hbase.procedure2.util.ByteSlot)
[ERROR] location: class
org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.ProcedureWALEntry
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/ProcedureWALFormat.java:[254,20]
cannot find symbol
[ERROR] symbol:   method
writeDelimitedTo(org.apache.hadoop.hbase.procedure2.util.ByteSlot)
[ERROR] location: class
org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.ProcedureWALEntry
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureException.java:[98,30]
cannot find symbol
[ERROR] symbol:   method toByteArray()
[ERROR] location: class
org.apache.hadoop.hbase.shaded.protobuf.generated.ErrorHandlingProtos.ForeignExceptionMessage
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/StateMachineProcedure.java:[267,17]
cannot find symbol
[ERROR] symbol:   method writeDelimitedTo(java.io.OutputStream)
[ERROR] location: class
org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.StateMachineProcedureData
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureUtil.java:[130,56]
incompatible types:
org.apache.hadoop.hbase.shaded.com.google.protobuf.ByteString cannot be
converted to com.google.protobuf.ByteString
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureUtil.java:[137,54]
incompatible types:
org.apache.hadoop.hbase.shaded.com.google.protobuf.ByteString cannot be
converted to com.google.protobuf.ByteString
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureUtil.java:[237,56]
incompatible types:
org.apache.hadoop.hbase.shaded.com.google.protobuf.ByteString cannot be
converted to com.google.protobuf.ByteString
[ERROR]
/home/stack/hbase.git/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/SequentialProcedure.java:[75,17]
cannot find symbol
[ERROR] symbol:   method writeDelimitedTo(java.io.OutputStream)
[ERROR] location: class
org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.SequentialProcedureData
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions,
please read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the
command
[ERROR]   mvn <goals> -rf :hbase-procedure

On Fri, Jun 30, 2017 at 3:09 PM, Stack <st...@duboce.net> wrote:

> [Quoted message snipped; it appears in full below.]

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
I just started a VOTE on hbase-thirdparty and the first RC made from it.
Thanks,
St.Ack

On Tue, Jun 27, 2017 at 3:02 PM, Stack <st...@duboce.net> wrote:

> [Quoted message snipped; it appears in full below.]

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
Bit of an update.

I'd suggest we go ahead w/ the hbase-thirdparty project [2]. It took a
while but in its current form -- a few poms that package a few jars [1] --
it at least enables the below:

+ Allows us to skip checking in protobuf generated files (25MB!); they can
be generated inline w/ the build because the hackery patching protobuf has
been moved out to hbase-thirdparty. There is a patch up on HBASE-17056.
+ Update our guava from 12.0 to 22.0 w/o clashing w/ the guava of others.
There is a patch at HBASE-17908. It is taking a bit of wrangling getting it
to land because I pared back transitive includes from hadoop and it takes a
while to work through the failures.

Other benefits: the protobuf-util lib is on the classpath now -- it's in
hbase-thirdparty, relocated; it depends on pb and guava -- so we have the
facility to go at "HBASE-18106 Redo ProcedureInfo and LockInfo", and
shading netty is almost done, so we can do with netty as we will,
independent of hadoop and downstreamers (the hard part -- relocation of
the .so -- should be done).

Let me figure out how to run a vote for a couple of poms...

St.Ack

1. https://repository.apache.org/content/groups/snapshots/org/apache/hbase/thirdparty/
   (see hbase-shaded-thirdparty and hbase-shaded-protobuf)
2. https://git-wip-us.apache.org/repos/asf/hbase-thirdparty


On Tue, Jun 20, 2017 at 11:04 AM, Josh Elser <jo...@gmail.com> wrote:

> [Quoted messages snipped; they appear in full below.]

Re: [DISCUSS] More Shading

Posted by Josh Elser <jo...@gmail.com>.
On 6/20/17 1:28 AM, Stack wrote:
> [Quoted message snipped; it appears in full below.]

Kudos on the JFDI approach :). I think having something concrete to show
is the best way to judge its success.

Will keep an eye on HBASE-18240.


Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
On Thu, Apr 13, 2017 at 4:46 PM, Josh Elser <el...@apache.org> wrote:

> ...
>
> I think pushing this part forward with some code is the next logical step.
> Seems to be consensus about taking our known internal dependencies and
> performing this shade magic.
>
>
I opened HBASE-18240 "Add hbase-auxillary, a project with hbase utility
including an hbase-shaded-thirdparty module with guava, netty, etc."

It has a tarball attached that bundles the outline of an hbase-auxillary
project (groupId:org.apache.hbase.auxillary). This project is intended to
be standalone, in its own repository, publishing its own artifacts under
the aegis of this project's PMC.

It includes the first instance of an auxillary utility, a module named
hbase-thirdparty-shaded (artifactId:hbase-thirdparty-shaded). Herein we'll
pull down 3rd party libs and republish at an offset; e.g.
com.google.common.* from guava will be at
org.apache.hbase.thirdparty.shaded.com.google.common.*. Currently it builds
a jar that includes a relocated guava 22.0.
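
For a concrete picture, a minimal sketch of the shade-plugin relocation
such a module would carry (illustrative; the real pom may differ):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <!-- republish guava at the offset described above -->
              <relocation>
                <pattern>com.google.common</pattern>
                <shadedPattern>org.apache.hbase.thirdparty.shaded.com.google.common</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>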

I then messed around making hbase-common use it (You have to build
hbase-auxillary into your local repo). I put up a patch on the issue.
Mostly it's a mass find-and-replace w/ some cleanup of transitive includes
of guava from hadoop-common and some small fixup of methods renamed
between guava 12.0 and 22.0.
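
As one concrete instance of the renames involved (a known guava change,
used here for illustration; the relocated import assumes the offset
described above): guava 12's Objects.toStringHelper became
MoreObjects.toStringHelper well before 22.0.

    // Hypothetical before/after of one renamed guava helper.
    import org.apache.hbase.thirdparty.shaded.com.google.common.base.MoreObjects;

    public class RenameExample {
      @Override
      public String toString() {
        // guava 12: Objects.toStringHelper(this); gone by guava 22.
        return MoreObjects.toStringHelper(this).add("field", 1).toString();
      }
    }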

Unless there's an objection, I was going to press on. Sean offered to help
set up the new repo. We can always undo and delete it if this project
fails.

When done, the hope is we are on a modern version of guava, and our netty
and protobuf 3 will be relocated, 'hidden' from downstream (and won't
clash w/ upstream). I hope to also purge the pre-build we have in our
modules that do protobuf, moving this hackery out and under
hbase-thirdparty-shaded.

St.Ack




> [Remainder of quoted message snipped.]

Re: [DISCUSS] More Shading

Posted by Nick Dimiduk <nd...@gmail.com>.
Related: very relevant concerns have been raised in the comments over
on HADOOP-11656.
See Christopher Tubbs's recent remarks.

On Thu, Apr 13, 2017 at 4:46 PM, Josh Elser <el...@apache.org> wrote:

>
>
> [Quoted message snipped; it appears in full below.]

Re: [DISCUSS] More Shading

Posted by Josh Elser <el...@apache.org>.

Stack wrote:
> On Wed, Apr 12, 2017 at 8:22 AM, Josh Elser<el...@apache.org>  wrote:
> ....
>
>> This makes me wonder if we could construct source jars just the same as
>> we're creating shaded jars. Google has led me to [2][3], but I've never
>> tried either. The latter option seems to be acknowledging that the source
>> might not actually compile, but the package names would at least be correct.
>>
>> I think this would be a good early experiment which, if it does work out,
>> removes the only acknowledged "hole" in the current plan.
>>
>>
> Mighty Josh.
>
> As it happens, the 'createSources' you cite is currently in use by our
> hbase-protocol-shaded module for the specific purpose of keeping our IDE'rs
> happy; we relocate protobuf itself so we can have pb3.2 and pb2.5 on our
> CLASSPATH but IDEs don't complain because the src for the relocated pb is
> checked-in to our codebase.

Aha! Wizards are ahead of me on the path already :)

> Here's more if interested:
>
> All modules that make use of protos by convention require a 'pre-build'
> step. Generally, the pre-build is required if you add or mod .proto files.
> At pre-build we generate class files from the amended .protos and then
> check-in the product. See the README in each of our proto-carrying modules.
>
> In the hbase-protocol-shaded case, we abuse the pre-build notion so we can
> house an internal pb version, one that does not agree w/ hadoop's nor w/
> that used by our CPs describing Endpoint services. We do as follows (the
> order may not be exact below):
>
>   * Generate class files from protos
>   * Shade the built artifact (with the createSource flag set).
>   * Unjar the artifact which has generated proto and protobuf src classes in
> it
>   * Apply a few patches to protobuf to support offheap work (stuff we need
> to push back up to pb)
>   * Overlay our current protobuf src w/ the new version in the src tree -- a
> big no-no (smile).
>
> You then check it all in....
>
> It has been suggested we undo these hokey pre-build steps and just
> generate classes from protobuf inline w/ the main build. A plugin we just
> figured out makes this possible since it provides the platform-appropriate
> protoc (org.xolstice.maven.plugins). We could do the simplification
> currently for
> all but the hbase-protocol-shaded module because of the above jujitsu. One
> thought is that if we had a pre-build artifact as is being suggested in
> this thread, I could move the patched pb there and purge the pre-build step
> everywhere.

The xolstice plugin has been working well for me elsewhere so far. +1

I think we're speaking the same language. My realization was that, 
hopefully, we could automatically create that pre-built artifact via 
Maven magic instead of having a human copy source files around.

I think pushing this part forward with some code is the next logical 
step. Seems to be consensus about taking our known internal dependencies 
and performing this shade magic.

Don't want to stomp on your worries, Nick. I think your worries are more 
about the presentation to downstream and we're in agreement about 
isolating our internal deps with the described approach?

> Thanks,
> St.Ack
>
>
>
>> [Remainder of quoted message snipped.]

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
On Wed, Apr 12, 2017 at 8:22 AM, Josh Elser <el...@apache.org> wrote:
....

>
>> This makes me wonder if we could construct source jars just the same as
>> we're creating shaded jars. Google has led me to [2][3], but I've never
>> tried either. The latter option seems to be acknowledging that the source
>> might not actually compile, but the package names would at least be correct.
>
> I think this would be a good early experiment which, if it does work out,
> removes the only acknowledged "hole" in the current plan.
>
>
Mighty Josh.

As it happens, the 'createSources' you cite is currently in use by our
hbase-protocol-shaded module for the specific purpose of keeping our IDE'rs
happy; we relocate protobuf itself so we can have pb3.2 and pb2.5 on our
CLASSPATH but IDEs don't complain because the src for the relocated pb is
checked-in to our codebase.

Here's more if interested:

All modules that make use of protos by convention require a 'pre-build'
step. Generally, the pre-build is required if you add or mod .proto files.
At pre-build we generate class files from the amended .protos and then
check-in the product. See the README in each of our proto-carrying modules.

In the hbase-protocol-shaded case, we abuse the pre-build notion so we can
house an internal pb version, one that does not agree w/ hadoop's nor w/
that used by our CPs describing Endpoint services. We do as follows (the
order may not be exact below):

 * Generate class files from protos
 * Shade the built artifact (with the createSource flag set).
 * Unjar the artifact which has generated proto and protobuf src classes in
it
 * Apply a few patches to protobuf to support offheap work (stuff we need
to push back up to pb)
 * Overlay our current protobuf src w/ the new version in the src tree -- a
big no-no (smile).

You then check it all in....

It has been suggested we undo these hokey pre-build steps and just
generate classes from protobuf inline w/ the main build. A plugin we just
figured out makes this possible since it provides the platform-appropriate
protoc (org.xolstice.maven.plugins). We could do the simplification
currently for
all but the hbase-protocol-shaded module because of the above jujitsu. One
thought is that if we had a pre-build artifact as is being suggested in
this thread, I could move the patched pb there and purge the pre-build step
everywhere.
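
For reference, a hedged sketch of generating the classes inline w/ the
build using that plugin (the version and protoc coordinates are
illustrative; the plugin relies on the os-maven-plugin extension to supply
${os.detected.classifier}):

    <plugin>
      <groupId>org.xolstice.maven.plugins</groupId>
      <artifactId>protobuf-maven-plugin</artifactId>
      <version>0.5.0</version>
      <configuration>
        <!-- fetches a platform-appropriate protoc from the repository -->
        <protocArtifact>com.google.protobuf:protoc:3.1.0:exe:${os.detected.classifier}</protocArtifact>
      </configuration>
      <executions>
        <execution>
          <goals>
            <!-- generates Java sources from src/main/proto at build time -->
            <goal>compile</goal>
          </goals>
        </execution>
      </executions>
    </plugin>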

Thanks,
St.Ack



> [Remainder of quoted message snipped.]

Re: [DISCUSS] More Shading

Posted by Josh Elser <el...@apache.org>.

Stack wrote:
> Let me revive this thread.
>
<snip />
>
> Lets do Sean's idea of a pre-build step where we package and relocate
> ('shade') critical dependencies (Going by the thread above, Ram, Anoop, and
> Andy seems good w/ general idea).
>
> In implementation, we (The HBase PMC) would ask for a new repo [1]. In here
> we'd create a new mvn project. This project would produce a single artifact
> (jar) called hbase-dependencies or hbase-3rdparty or hbase-shaded-3rdparty
> libs. In it would be relocated core libs such as guava and netty (and maybe
> protobuf). We'd publish this artifact and then have hbase depend on it
> changing all references to point at the relocation: e.g. rather than import
> com.google.common.collect.Maps, we'd import
> org.apache.hadoop.hbase.com.google.common.collect.Maps.
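
(To make the quoted proposal concrete, the new project's shading might look
like the sketch below; the artifact name and relocation prefix are the
candidates floated above, purely illustrative:)

    <!-- Sketch of the proposed pre-built artifact's pom; illustrative only. -->
    <artifactId>hbase-shaded-3rdparty</artifactId>
    ...
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <configuration>
        <relocations>
          <!-- Bundle and relocate the core libs; hbase then imports these names. -->
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hadoop.hbase.com.google.common</shadedPattern>
          </relocation>
          <relocation>
            <pattern>io.netty</pattern>
            <shadedPattern>org.apache.hadoop.hbase.io.netty</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </plugin>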

This makes me wonder if we could construct source jars just the same as 
we're creating shaded jars. Google has led me to [2][3], but I've never 
tried either. The latter option seems to be acknowledging that the 
source might not actually compile, but the package names would at least 
be correct.

I think this would be a good early experiment which, if it does work 
out, removes the only acknowledged "hole" in the current plan.

> We (The HBase PMC) will have to make releases of this new artifact and vote
> on them. I think it will be a relatively rare event.
>
> I'd be up for doing the first cut if folks are game.
>
> St.Ack
>
>
> 1. URL via Sean but for committers to view only: https://reporeq.apache.org/
>

[2] 
https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#createSourcesJar
[3] 
https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#shadeSourcesContent

Re: [DISCUSS] More Shading

Posted by Sean Busbey <bu...@apache.org>.
FWIW, the existing shaded stuff in both HBase and Hadoop should already
take this behavior into account.

On Mon, Apr 24, 2017 at 11:40 AM Nick Dimiduk <nd...@gmail.com> wrote:

> FYI, MNG-5899 makes shaded builds fragile, effectively limiting
> multi-module shaded projects to maven 3.2.x. Apparently the Apache Storm
> folks tripped over this earlier, and as I recall, Apache Flink used to
> require building with 3.2.x for the same reason.
>
> https://issues.apache.org/jira/browse/MNG-5899
>
> On Tue, Apr 18, 2017 at 9:20 PM, Nick Dimiduk <nd...@gmail.com> wrote:
>
> > <snip />
>

Re: [DISCUSS] More Shading

Posted by Nick Dimiduk <nd...@gmail.com>.
FYI, MNG-5899 makes shaded builds fragile, effectively limiting
multi-module shaded projects to maven 3.2.x. Apparently the Apache Storm
folks tripped over this earlier, and as I recall, Apache Flink used to
require building with 3.2.x for the same reason.

https://issues.apache.org/jira/browse/MNG-5899
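
(If that constraint sticks, an enforcer rule pinning the maven line would at
least fail fast; the range below is a guess that would need verifying against
MNG-5899:)

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <executions>
        <execution>
          <id>enforce-maven-version</id>
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <!-- Stay on the 3.2.x line until MNG-5899 shakes out. -->
              <requireMavenVersion>
                <version>[3.2.0,3.3.0)</version>
              </requireMavenVersion>
            </rules>
          </configuration>
        </execution>
      </executions>
    </plugin>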

On Tue, Apr 18, 2017 at 9:20 PM, Nick Dimiduk <nd...@gmail.com> wrote:

> <snip />

Re: [DISCUSS] More Shading

Posted by Nick Dimiduk <nd...@gmail.com>.
On Wed, Apr 12, 2017 at 2:30 PM, Stack <st...@duboce.net> wrote:

> > >> If the above quote is true, then I think what we want is a set of
> > shaded
> > > >> Hadoop client libs that we can depend on so as to not get all the
> > > >> transitive deps. Hadoop doesn't provide it, but we could do so
> > ourselves
> > > >> with (yet another) module in our project. Assuming, that is, the
> > > upstream
> > > >> client interfaces are well defined and don't leak stuff we care
> about.
> >
>
>
> We should do this too (I think you've identified the big 'if' w/ the above
> identified assumption). As you say later, "... it's time we firm up the
> boundaries between us and Hadoop.". There is some precedent with
> hadoop-compat-* modules. Hadoop would be relocated?
>

Ideally we'd relocate any parts of Hadoop that are not part of our public
contract. Not sure if there's an intersection between "ideal" and
"practical" though.

> Spitballing, IIUC, I think this would be a big job (once per version and
> the vagaries of hadoop/spark) with no guarantee of success on the other end
> because of the assumption you call out. Do I have this right?
>

Yeah you have my meaning. My argument is not whether we should shade but
rather how we make it a maintainable deployment tool for our team of
volunteers. Hence interest in compatibility verification tools like we do
with our api compatibility tools.

> Isolating our clients from our deps is best served by our shaded modules.
> > What do you think about turning things on their head: for 2.0 the
> > hbase-client jar is the shaded artifact by default, not the other way
> > around? We have cleanup to get our deps out of our public interfaces in
> > order to make this work.
> >
> >
> We should do this at least going forward. hbase2 is the opportunity.
> Testing and doc is all that is needed? I added it to our hbase2 description
> doc as a deliverable (though not a blocker).
>

I've not tried to consume these efforts. A reasonable test-case to see if
these are ready for prime-time would be to try rebuilding one of the more
complex downstream projects (e.g., Phoenix, Trafodion, Splice) using the
shaded jars and see how bad the diff is.

> This proposal of an external shaded dependencies module sounds like an
> > attempt to solve both concerns at once. It would isolate ourselves from
> > Hadoop's deps, and it would isolate our clients from our deps. However,
> it
> > doesn't isolate our clients from Hadoop's deps, so our users don't really
> > gain anything from it. I also argue that it creates an unreasonable
> release
> > engineering burden on our project. I'm also not clear on the implications
> > to downstreamers who extend us with coprocessors.
> >
>
>
> Other than a missing 'quick-fix' descriptor, you call what is proposed well
> ....except where you think the prebuild will be burdensome. Here I think
> otherwise as I think releases will be rare, there is nought 'new' in a
> release but packaged 3rd-party libs, and verification/vote by PMCers should
> be a simple affair.
>

Maybe it's not such a burden? If the 2.0 and 3.0 RM's are brave and true,
it's worth a go.

> Do you agree that the fixing-what-we-leak-of-hadoop-to-downstreamers is
> distinct from the narrower task proposed here where we are trying to
> unhitch ourselves from the netty/guava hadoop uses? (Currently we break
> against hadoop3 because of a netty incompat., HADOOP-13866, which we might be
> able to solve w/ exclusions.....but....).
>
> The two tasks can be run in parallel?
>

Indeed, they seem distinct but quite related.

> For CPs, they should bring their own bedding and towels and not be trying
> to use ours. On the plus-side, we could upgrade core 3rd-party libs and the
> CP would keep working.
>

All of this sounds like an ideal state.

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
Thanks for the great input all.

See below:


On Wed, Apr 12, 2017 at 9:01 AM, Nick Dimiduk <nd...@gmail.com> wrote:

> On Wed, Apr 12, 2017 at 8:28 AM Josh Elser <el...@apache.org> wrote:
>
> >
> >
> > Sean Busbey wrote:
> > > On Tue, Apr 11, 2017 at 11:43 PM Nick Dimiduk<nd...@gmail.com>
> > wrote:
> > >
> > >>> This effort is about our internals. We have a mess of other
> components
> > >> all
> > >>> up inside us such as HDFS, etc., each with their own sets of
> > dependencies
> > >>> many of which we have in common. This project t is about making it so
> > we
> > >>> can upgrade at a rate independent of when our upstreamers choose to
> > >> change.
> > >>
>


(I'd add to the above that we can upgrade libs w/o breaking downstreamers
also -- but this point becomes intrinsic later in the thread)


> >> If the above quote is true, then I think what we want is a set of
> shaded
> > >> Hadoop client libs that we can depend on so as to not get all the
> > >> transitive deps. Hadoop doesn't provide it, but we could do so
> ourselves
> > >> with (yet another) module in our project. Assuming, that is, the
> > upstream
> > >> client interfaces are well defined and don't leak stuff we care about.
>


We should do this too (I think you've identified the big 'if' w/ the above
identified assumption). As you say later, "... it's time we firm up the
boundaries between us and Hadoop.". There is some precedent with
hadoop-compat-* modules. Hadoop would be relocated?

Spitballing, IIUC, I think this would be a big job (once per version and
the vagaries of hadoop/spark) with no guarantee of success on the other end
because of the assumption you call out. Do I have this right?


...

> Isolating our clients from our deps is best served by our shaded modules.
> What do you think about turning things on their head: for 2.0 the
> hbase-client jar is the shaded artifact by default, not the other way
> around? We have cleanup to get our deps out of our public interfaces in
> order to make this work.
>
>
We should do this at least going forward. hbase2 is the opportunity.
Testing and doc is all that is needed? I added it to our hbase2 description
doc as a deliverable (though not a blocker).



> This proposal of an external shaded dependencies module sounds like an
> attempt to solve both concerns at once. It would isolate ourselves from
> Hadoop's deps, and it would isolate our clients from our deps. However, it
> doesn't isolate our clients from Hadoop's deps, so our users don't really
> gain anything from it. I also argue that it creates an unreasonable release
> engineering burden on our project. I'm also not clear on the implications
> to downstreamers who extend us with coprocessors.
>


Other than a missing 'quick-fix' descriptor, you call what is proposed well
....except where you think the prebuild will be burdensome. Here I think
otherwise as I think releases will be rare, there is nought 'new' in a
release but packaged 3rd-party libs, and verification/vote by PMCers should
be a simple affair.

Do you agree that the fixing-what-we-leak-of-hadoop-to-downstreamers is
distinct from the narrower task proposed here where we are trying to
unhitch ourselves from the netty/guava hadoop uses? (Currently we break
against hadoop3 because of a netty incompat., HADOOP-13866, which we might be
able to solve w/ exclusions.....but....).
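
(The exclusions hack would be something like the below; a sketch only, and it
assumes the offending netty comes in via hadoop-hdfs, which would need
confirming:)

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>${hadoop.version}</version>
      <exclusions>
        <!-- Keep hadoop3's netty off our CLASSPATH so ours wins. -->
        <exclusion>
          <groupId>io.netty</groupId>
          <artifactId>netty-all</artifactId>
        </exclusion>
      </exclusions>
    </dependency>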

The two tasks can be run in parallel?

For CPs, they should bring their own bedding and towels and not be trying
to use ours. On the plus-side, we could upgrade core 3rd-party libs and the
CP would keep working.

St.Ack

Re: [DISCUSS] More Shading

Posted by Josh Elser <el...@apache.org>.

Nick Dimiduk wrote:
>> >  Well put, Nick.
>> >
>> >  With Sean's point about the Hadoop shaded client, it seems to me that we
>> >  have things which could be pursued in parallel:
>> >
>> >  1) Roadmap to Hadoop3 (and shaded hdfs client).
>> >  2) Identify components which we use from Hadoop, for each component:
>> >      2a) Work with Hadoop to isolate that component from other cruft (best
>> >  example is the Configuration class -- you get something like 8MB of
>> >  "jar" just to parse an xml file).
>> >      2b) Pull the implementation into HBase, removing dependency from
>> >  Hadoop entirely.
>> >
>> >  I think that both of these can/should be done in parallel to the
>> >  isolation of the dependencies which HBase requires (isolating ourselves
>> >  from upstream, and isolating downstream from us).
>
>
> Hang on, these are two different concerns.
>
> Isolating ourselves from Hadoop follows the line of thought around Hadoop's
> shaded client jars. If we must have this for HBase 2.0/Hadoop 2.8, we can
> probably backport their efforts as modules in our own build. See my earlier
> comment about this being error-prone for folks who re-package us. Either
> way, it's time we firm up the boundaries between us and Hadoop.
>
> Isolating our clients from our deps is best served by our shaded modules.
> What do you think about turning things on their head: for 2.0 the
> hbase-client jar is the shaded artifact by default, not the other way
> around? We have cleanup to get our deps out of our public interfaces in
> order to make this work.

+1. Worst case, people bloat their applications with dependencies (worst 
case 2x the size), but it removes the runtime breakages due to multiple 
versions of a class (because of HBase). I'd gladly pay the size cost any day.

> This proposal of an external shaded dependencies module sounds like an
> attempt to solve both concerns at once. It would isolate ourselves from
> Hadoop's deps, and it would isolate our clients from our deps. However, it
> doesn't isolate our clients from Hadoop's deps, so our users don't really
> gain anything from it. I also argue that it creates an unreasonable release
> engineering burden on our project. I'm also not clear on the implications
> to downstreamers who extend us with coprocessors.

I thought I had a reason as to why reducing our reliance on Hadoop 
classes was important (even with a shaded HDFS client), but maybe my 
only consideration was long-term cleanliness (avoiding the double-packaging 
of the same classes). I can't come up with an example anymore.

Constructing some exemplars for this work would likely be the best kind 
of "acceptance test". Maybe we can pull common use-cases and create 
sample stub projects showing how they'd work with whatever we come up 
with. Hopefully, this would help us minimize downstream burden.

re CPs: I have also not considered them.


Re: [DISCUSS] More Shading

Posted by Nick Dimiduk <nd...@gmail.com>.
On Wed, Apr 12, 2017 at 8:28 AM Josh Elser <el...@apache.org> wrote:

>
>
> Sean Busbey wrote:
> > On Tue, Apr 11, 2017 at 11:43 PM Nick Dimiduk<nd...@gmail.com>
> wrote:
> >
> >>> This effort is about our internals. We have a mess of other components
> >> all
> >>> up inside us such as HDFS, etc., each with their own sets of
> dependencies
> >>> many of which we have in common. This project is about making it so
> we
> >>> can upgrade at a rate independent of when our upstreamers choose to
> >> change.
> >>
> >> Pardon as I try to get a handle on the intention behind this thread.
> >>
> >> If the above quote is true, then I think what we want is a set of shaded
> >> Hadoop client libs that we can depend on so as to not get all the
> >> transitive deps. Hadoop doesn't provide it, but we could do so ourselves
> >> with (yet another) module in our project. Assuming, that is, the
> upstream
> >> client interfaces are well defined and don't leak stuff we care about.
> It
> >> also creates a terrible nightmare for anyone downstream of us who
> >> repackages HBase. The whole thing is extremely error-prone, because
> there's
> >> not very good tooling for this. Realistically, we end up with a
> combination
> >> of the enforcer plugin and maybe our own custom plugin to ensure clean
> >> transitive dependencies...
> >>
> >>
> > Hadoop does provide a shaded client as of the 3.0.0* release line. We
> could
> > push as a community for a version of that for Hadoop's branch-2.
> >
> > Unfortunately, that shaded client won't help where we're reaching into
> the
> > guts of Hadoop (like our reliance on their web stuff).
>
> Well put, Nick.
>
> With Sean's point about the Hadoop shaded client, it seems to me that we
> have things which could be pursued in parallel:
>
> 1) Roadmap to Hadoop3 (and shaded hdfs client).
> 2) Identify components which we use from Hadoop, for each component:
>    2a) Work with Hadoop to isolate that component from other cruft (best
> example is the Configuration class -- you get something like 8MB of
> "jar" just to parse an xml file).
>    2b) Pull the implementation into HBase, removing dependency from
> Hadoop entirely.
>
> I think that both of these can/should be done in parallel to the
> isolation of the dependencies which HBase requires (isolating ourselves
> from upstream, and isolating downstream from us).


Hang on, these are two different concerns.

Isolating ourselves from Hadoop follows the line of thought around Hadoop's
shaded client jars. If we must have this for HBase 2.0/Hadoop 2.8, we can
probably backport their efforts as modules in our own build. See my earlier
comment about this being error-prone for folks who re-package us. Either
way, it's time we firm up the boundaries between us and Hadoop.

Isolating our clients from our deps is best served by our shaded modules.
What do you think about turning things on their head: for 2.0 the
hbase-client jar is the shaded artifact by default, not the other way
around? We have cleanup to get our deps out of our public interfaces in
order to make this work.

This proposal of an external shaded dependencies module sounds like an
attempt to solve both concerns at once. It would isolate ourselves from
Hadoop's deps, and it would isolate our clients from our deps. However, it
doesn't isolate our clients from Hadoop's deps, so our users don't really
gain anything from it. I also argue that it creates an unreasonable release
engineering burden on our project. I'm also not clear on the implications
to downstreamers who extend us with coprocessors.

>

Re: [DISCUSS] More Shading

Posted by Josh Elser <el...@apache.org>.

Sean Busbey wrote:
> On Tue, Apr 11, 2017 at 11:43 PM Nick Dimiduk<nd...@gmail.com>  wrote:
>
>>> This effort is about our internals. We have a mess of other components
>> all
>>> up inside us such as HDFS, etc., each with their own sets of dependencies
>>> many of which we have in common. This project is about making it so we
>>> can upgrade at a rate independent of when our upstreamers choose to
>> change.
>>
>> Pardon as I try to get a handle on the intention behind this thread.
>>
>> If the above quote is true, then I think what we want is a set of shaded
>> Hadoop client libs that we can depend on so as to not get all the
>> transitive deps. Hadoop doesn't provide it, but we could do so ourselves
>> with (yet another) module in our project. Assuming, that is, the upstream
>> client interfaces are well defined and don't leak stuff we care about. It
>> also creates a terrible nightmare for anyone downstream of us who
>> repackages HBase. The whole thing is extremely error-prone, because there's
>> not very good tooling for this. Realistically, we end up with a combination
>> of the enforcer plugin and maybe our own custom plugin to ensure clean
>> transitive dependencies...
>>
>>
> Hadoop does provide a shaded client as of the 3.0.0* release line. We could
> push as a community for a version of that for Hadoop's branch-2.
>
> Unfortunately, that shaded client won't help where we're reaching into the
> guts of Hadoop (like our reliance on their web stuff).

Well put, Nick.

With Sean's point about the Hadoop shaded client, it seems to me that we 
have things which could be pursued in parallel:

1) Roadmap to Hadoop3 (and shaded hdfs client).
2) Identify components which we use from Hadoop, for each component:
   2a) Work with Hadoop to isolate that component from other cruft (best 
example is the Configuration class -- you get something like 8MB of 
"jar" just to parse an xml file).
   2b) Pull the implementation into HBase, removing dependency from 
Hadoop entirely.

I think that both of these can/should be done in parallel to the 
isolation of the dependencies which HBase requires (isolating ourselves 
from upstream, and isolating downstream from us).

Re: [DISCUSS] More Shading

Posted by Sean Busbey <bu...@apache.org>.
On Tue, Apr 11, 2017 at 11:43 PM Nick Dimiduk <nd...@gmail.com> wrote:

> > This effort is about our internals. We have a mess of other components
> all
> > up inside us such as HDFS, etc., each with their own sets of dependencies
> > many of which we have in common. This project is about making it so we
> > can upgrade at a rate independent of when our upstreamers choose to
> change.
>
> Pardon as I try to get a handle on the intention behind this thread.
>
> If the above quote is true, then I think what we want is a set of shaded
> Hadoop client libs that we can depend on so as to not get all the
> transitive deps. Hadoop doesn't provide it, but we could do so ourselves
> with (yet another) module in our project. Assuming, that is, the upstream
> client interfaces are well defined and don't leak stuff we care about. It
> also creates a terrible nightmare for anyone downstream of us who
> repackages HBase. The whole thing is extremely error-prone, because there's
> not very good tooling for this. Realistically, we end up with a combination
> of the enforcer plugin and maybe our own custom plugin to ensure clean
> transitive dependencies...
>
>
Hadoop does provide a shaded client as of the 3.0.0* release line. We could
push as a community for a version of that for Hadoop's branch-2.

Unfortunately, that shaded client won't help where we're reaching into the
guts of Hadoop (like our reliance on their web stuff).
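
(For reference, consuming those Hadoop 3 shaded clients is just a dependency
swap, roughly as below; hadoop-client-api and hadoop-client-runtime are the
artifacts Hadoop publishes there, version per whatever 3.0.0 line we build
against:)

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client-api</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <!-- Transitive guts ride along relocated inside the runtime jar. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client-runtime</artifactId>
      <version>${hadoop.version}</version>
      <scope>runtime</scope>
    </dependency>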


> I guess the suggestion of the external repo containing our shaded fork of
> everything we depend on allows us to continue to compile and run on Hadoop's
> transitive dependency list w/o actually using any of it. Do I have that right?
> How would we version this thing?
>

Yes, that's correct. Simplest would be to version it similarly to how we do
now, starting at version 1.0.0 and bumping whenever we change a dependency.
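
(Core's side of it would then be a plain versioned dependency, something like
the sketch below; the artifact name is one of the candidates floated earlier:)

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-shaded-3rdparty</artifactId>
      <!-- Bumped only when one of the bundled, relocated libs changes. -->
      <version>1.0.0</version>
    </dependency>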


> Between these two choices, I prefer the former as a "more correct"
> solution, but it depends entirely on how clean of a shaded hadoop we can
> reliably produce inline in our build.
>

If we're going to try to go the route of cleaning up how we rely on Hadoop,
the bigger issue IMHO is getting ourselves off of things not included in
their client jars.

Re: [DISCUSS] More Shading

Posted by Nick Dimiduk <nd...@gmail.com>.
> This effort is about our internals. We have a mess of other components all
> up inside us such as HDFS, etc., each with their own sets of dependencies
> many of which we have in common. This project is about making it so we
> can upgrade at a rate independent of when our upstreamers choose to
change.

Pardon as I try to get a handle on the intention behind this thread.

If the above quote is true, then I think what we want is a set of shaded
Hadoop client libs that we can depend on so as to not get all the
transitive deps. Hadoop doesn't provide it, but we could do so ourselves
with (yet another) module in our project. Assuming, that is, the upstream
client interfaces are well defined and don't leak stuff we care about. It
also creates a terrible nightmare for anyone downstream of us who
repackages HBase. The whole thing is extremely error-prone, because there's
not very good tooling for this. Realistically, we end up with a combination
of the enforcer plugin and maybe our own custom plugin to ensure clean
transitive dependencies...
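
(The enforcer half of that could be a bannedDependencies rule so the raw,
unrelocated libs cannot sneak back onto the CLASSPATH; the banned list below
is illustrative:)

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <executions>
        <execution>
          <id>ban-unshaded-3rdparty</id>
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <bannedDependencies>
                <excludes>
                  <!-- Only relocated copies of these should appear. -->
                  <exclude>com.google.guava:guava</exclude>
                  <exclude>io.netty:netty-all</exclude>
                </excludes>
              </bannedDependencies>
            </rules>
          </configuration>
        </execution>
      </executions>
    </plugin>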

I guess the suggestion of the external repo containing our shaded fork of
everything we depend on allows us to continue to compile and run on Hadoop's
transitive dependency list w/o actually using any of it. Do I have that right?
How would we version this thing?

Between these two choices, I prefer the former as a "more correct"
solution, but it depends entirely on how clean of a shaded hadoop we can
reliably produce inline in our build.

On Tue, Apr 11, 2017 at 1:03 PM, Stack <st...@duboce.net> wrote:

> On Tue, Apr 11, 2017 at 10:23 AM, York, Zach <zy...@amazon.com> wrote:
>
> > Should we allow dependent projects (such as Phoenix) to weigh in on this
> > issue since they are likely going to be the ones that benefit/are
> effected?
> >
> I dumped a pointer to here into dev@phoenix.
> St.Ack
>
>
>
> > On 4/11/17, 10:17 AM, "York, Zach" <zy...@amazon.com> wrote:
> >
> >     +1 (non-binding)
> >
> >     This sounds like a good idea to me!
> >
> >     Zach
> >
> >     On 4/11/17, 9:48 AM, "saint.ack@gmail.com on behalf of Stack" <
> > saint.ack@gmail.com on behalf of stack@duboce.net> wrote:
> >
> >         Let me revive this thread.
> >
> >         Recall, we are stuck on old or particular versions of critical
> > libs. We are
> >         unable to update because our versions will clash w/ versions from
> >         upstreamer hadoop2.7/2.8/3.0/spark, etc. We have a shaded client.
> > We need
> >         to message downstreamers that they should use it going forward.
> > This will
> >         help going forward but it will not inoculate our internals nor an
> > existing
> >         context where we'd like to be a compatible drop-in.
> >
> >         We could try hackery filtering transitive includes up in poms for
> > each
> >         version of hadoop/spark that we support but in the end, it's a
> > bunch of
> >         effort, hard to test, and we are unable to dictate the CLASSPATH
> > order in
> >         all situations.
> >
> >         We could try some shading voodoo inline w/ build. Because shading
> > is a
> >         post-package step and because we are modularized and shading
> > includes the
> >         shaded classes in the artifact produced, we'd end up w/ multiple
> > copies of
> >         guava/netty/etc. classes, an instance per module that makes a
> > reference.
> >
> >         Lets do Sean's idea of a pre-build step where we package and
> > relocate
> >         ('shade') critical dependencies (Going by the thread above, Ram,
> > Anoop, and
> >         Andy seem good w/ the general idea).
> >
> >         In implementation, we (The HBase PMC) would ask for a new repo
> > [1]. In here
> >         we'd create a new mvn project. This project would produce a
> single
> > artifact
> >         (jar) called hbase-dependencies or hbase-3rdparty or
> > hbase-shaded-3rdparty
> >         libs. In it would be relocated core libs such as guava and netty
> > (and maybe
> >         protobuf). We'd publish this artifact and then have hbase depend
> > on it
> >         changing all references to point at the relocation: e.g. rather
> > than import
> >         com.google.common.collect.Maps, we'd import
> >         org.apache.hadoop.hbase.com.google.common.collect.Maps.
> >
> >         We (The HBase PMC) will have to make releases of this new
> artifact
> > and vote
> >         on them. I think it will be a relatively rare event.
> >
> >         I'd be up for doing the first cut if folks are game.
> >
> >         St.Ack
> >
> >
> >         1. URL via Sean but for committers to view only:
> > https://reporeq.apache.org/
> >
> >         On Sun, Oct 2, 2016 at 10:29 PM, ramkrishna vasudevan <
> >         ramkrishna.s.vasudevan@gmail.com> wrote:
> >
> >         > +1 for Sean's ideas. Bundling all the dependent libraries and
> > shading them
> >         > into one jar and HBase referring to it makes sense and should
> > avoid some of
> >         > the pain in terms of IDE usage. Stack's doc clearly talks about
> > the IDE
> >         > issues that we may get after this protobuf shading goes in. It
> > may be
> >         > difficult for new comers and those who don't know this
> > background of why it
> >         > has to be like that.
> >         >
> >         > Regards
> >         > Ram
> >         >
> >         > On Sun, Oct 2, 2016 at 10:51 AM, Stack <st...@duboce.net>
> wrote:
> >         >
> >         > > On Sat, Oct 1, 2016 at 6:32 PM, Jerry He <jerryjch@gmail.com
> >
> > wrote:
> >         > >
> >         > > > How is the proposed going to impact the existing
> > shaded-client and
> >         > > > shaded-server modules, making them unnecessary and go away?
> >         > > >
> >         > >
> >         > > No. We still need the blanket shading of hbase client and
> > server.
> >         > >
> >         > > This effort is about our internals. We have a mess of other
> > components
> >         > all
> >         > > up inside us such as HDFS, etc., each with their own sets of
> > dependencies
> >         > > many of which we have in common. This project t is about
> > making it so we
> >         > > can upgrade at a rate independent of when our upstreamers
> > choose to
> >         > change.
> >         > >
> >         > >
> >         > > > It doesn't seem so.  These modules are supposed to shade
> > HBase and
> >         > > upstream
> >         > > > from downstream users.
> >         > > >
> >         > >
> >         > > Agree.
> >         > >
> >         > > Thanks for drawing out the difference between these two
> > shading efforts,
> >         > >
> >         > > St.Ack
> >         > >
> >         > >
> >         > >
> >         > > > Thanks.
> >         > > >
> >         > > > Jerry
> >         > > >
> >         > > > On Sat, Oct 1, 2016 at 2:33 PM, Andrew Purtell <
> >         > andrew.purtell@gmail.com
> >         > > >
> >         > > > wrote:
> >         > > >
> >         > > > > > Sean has suggested a pre-build step where in another
> > repo we'd make
> >         > > > hbase
> >         > > > > > shaded versions of critical libs, 'release' them
> (votes,
> > etc.) and
> >         > > then
> >         > > > > > have core depend on these. It be a bunch of work but
> > would make the
> >         > > > dev's
> >         > > > > > life easier.
> >         > > > >
> >         > > > > So when we make changes that require updates to and
> > rebuild of the
> >         > > > > supporting libraries, as a developer I would make local
> > changes,
> >         > > install
> >         > > > a
> >         > > > > snapshot of that into the local maven cache, then point
> > the HBase
> >         > build
> >         > > > at
> >         > > > > the snapshot, then do the other half of the work, then
> > push up to
> >         > both?
> >         > > > >
> >         > > > > I think this could work.
> >         > > >
> >         > >
> >         >
> >
> >
> >
> >
> >
>

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
On Tue, Apr 11, 2017 at 10:23 AM, York, Zach <zy...@amazon.com> wrote:

> Should we allow dependent projects (such as Phoenix) to weigh in on this
> issue since they are likely going to be the ones that benefit/are affected?
>
I dumped a pointer to here into dev@phoenix.
St.Ack



> On 4/11/17, 10:17 AM, "York, Zach" <zy...@amazon.com> wrote:
>
>     +1 (non-binding)
>
>     This sounds like a good idea to me!
>
>     Zach
>
>     On 4/11/17, 9:48 AM, "saint.ack@gmail.com on behalf of Stack" <
> saint.ack@gmail.com on behalf of stack@duboce.net> wrote:
>
>         <snip />
>

Re: [DISCUSS] More Shading

Posted by Jesse Yates <je...@gmail.com>.
Right, hence the install/package phase (probably with -DskipTests) first.
Probably only have to do this occasionally, as dependencies change.

Agree the IDEs are really unhappy with this though.

Seems like more headache to create another repo, but I'm not too tied
either way. Just asking. Thanks Sean.

-J

On Tue, Apr 11, 2017 at 12:24 PM Sean Busbey <bu...@apache.org> wrote:

> A new module probably won't work because we need to reference
> the relocated classes in source code and maven won't produce them until the
> "package" phase.
>
> IDEs in particular will barf all over the place.
>
> On Tue, Apr 11, 2017 at 1:04 PM Jesse Yates <je...@gmail.com>
> wrote:
>
> > <snip />
>
>
-- 
Jesse Yates
Founder/CEO Fineo.io
Book a meeting: https://calendly.com/jyates

Re: [DISCUSS] More Shading

Posted by Sean Busbey <bu...@apache.org>.
A new module probably won't work because we need to reference
the relocated classes in source code and maven won't produce them until the
"package" phase.

IDEs in particular will barf all over the place.

On Tue, Apr 11, 2017 at 1:04 PM Jesse Yates <je...@gmail.com> wrote:

> > would ask for a new repo [1]. In here we'd create a new mvn project.
>
> Why get a new repo? A different (new) HBase mvn module that is depended
> upon via other modules should cover it, IIRC. That module can handle all
> the shading and not include transitive dependencies. Then in "downstream
> modules" you should be able to just use the shaded classes. Building would
> require doing a 'mvn install', but that's nothing new.
>
> If this was going to support the client I'd be concerned with size of the
> resulting jar, with all the potential dependencies, but meh - it's the
> server only!
>
> Just my $0.02,
> Jesse
>
> On Tue, Apr 11, 2017 at 10:23 AM York, Zach <zy...@amazon.com> wrote:
>
> > Should we allow dependent projects (such as Phoenix) to weigh in on this
> > issue since they are likely going to be the ones that benefit/are
> > affected?
> >
> > On 4/11/17, 10:17 AM, "York, Zach" <zy...@amazon.com> wrote:
> >
> >     <snip />
>

Re: [DISCUSS] More Shading

Posted by Jesse Yates <je...@gmail.com>.
> would ask for a new repo [1]. In here we'd create a new mvn project.

Why get a new repo? A different (new) HBase mvn module that is depended
upon by the other modules should cover it, IIRC. That module can handle all
the shading and not include transitive dependencies. Then in "downstream
modules" you should be able to just use the shaded classes. Building would
require doing a 'mvn install', but that's nothing new.
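
As a rough sketch only (the module name hbase-shaded-3rdparty and the
artifact list are invented for illustration), such a module's pom might
carry something like:

<!-- hypothetical in-tree module, e.g. hbase-shaded-3rdparty/pom.xml -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <!-- publish a dependency-reduced pom so the shaded libs do
                 not also show up as transitive dependencies downstream -->
            <createDependencyReducedPom>true</createDependencyReducedPom>
            <artifactSet>
              <includes>
                <include>com.google.guava:guava</include>
                <include>io.netty:*</include>
              </includes>
            </artifactSet>
            <!-- relocations for guava/netty/etc. would go here too -->
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

Downstream modules would then depend on this one artifact and pick up the
relocated classes only.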

If this was going to support the client I'd be concerned with the size of
the resulting jar, with all the potential dependencies, but meh - it's the
server only!

Just my $0.02,
Jesse

On Tue, Apr 11, 2017 at 10:23 AM York, Zach <zy...@amazon.com> wrote:

> Should we allow dependent projects (such as Phoenix) to weigh in on this
> issue since they are likely going to be the ones that benefit/are affected?

--
Jesse Yates
Founder/CEO Fineo.io
Book a meeting: https://calendly.com/jyates

Re: [DISCUSS] More Shading

Posted by "York, Zach" <zy...@amazon.com>.
Should we allow dependent projects (such as Phoenix) to weigh in on this issue since they are likely going to be the ones that benefit/are affected?

On 4/11/17, 10:17 AM, "York, Zach" <zy...@amazon.com> wrote:

    +1 (non-binding)
    
    This sounds like a good idea to me!
    
    Zach
    

Re: [DISCUSS] More Shading

Posted by "York, Zach" <zy...@amazon.com>.
+1 (non-binding)

This sounds like a good idea to me!

Zach

On 4/11/17, 9:48 AM, "saint.ack@gmail.com on behalf of Stack" <saint.ack@gmail.com on behalf of stack@duboce.net> wrote:

    Let me revive this thread.
    


Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
Let me revive this thread.

Recall, we are stuck on old or particular versions of critical libs. We are
unable to update because our versions will clash w/ versions from upstream
hadoop 2.7/2.8/3.0, spark, etc. We have a shaded client. We need to message
downstreamers that they should use it. This will help going forward but it
will not insulate our internals nor an existing context where we'd like to
be a compatible drop-in.

We could try hackery filtering transitive includes up in poms for each
version of hadoop/spark that we support but in the end, it's a bunch of
effort, hard to test, and we are unable to dictate the CLASSPATH order in
all situations.
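
That hackery would look something like the following sketch (illustrative
only), repeated for every clashing lib and for every hadoop/spark version
we support:

<!-- sketch: exclude the guava that hadoop-common drags in transitively -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>${hadoop.version}</version>
  <exclusions>
    <exclusion>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </exclusion>
  </exclusions>
</dependency>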

We could try some shading voodoo inline w/ build. Because shading is a
post-package step and because we are modularized and shading includes the
shaded classes in the artifact produced, we'd end up w/ multiple copies of
guava/netty/etc. classes, an instance per module that makes a reference.

Let's do Sean's idea of a pre-build step where we package and relocate
('shade') critical dependencies (going by the thread above, Ram, Anoop, and
Andy seem good w/ the general idea).

In implementation, we (The HBase PMC) would ask for a new repo [1]. In here
we'd create a new mvn project. This project would produce a single artifact
(jar) called hbase-dependencies or hbase-3rdparty or hbase-shaded-3rdparty
libs. In it would be relocated core libs such as guava and netty (and maybe
protobuf). We'd publish this artifact and then have hbase depend on it,
changing all references to point at the relocation: e.g. rather than import
com.google.common.collect.Maps, we'd import
org.apache.hadoop.hbase.com.google.common.collect.Maps.
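
For illustration, the relocation stanza in that project's shade plugin
config would run along these lines (the exact set of libs and patterns is
to be decided):

<relocations>
  <!-- rewrite the libs' classes, and all references to them,
       under our own package prefix -->
  <relocation>
    <pattern>com.google.common</pattern>
    <shadedPattern>org.apache.hadoop.hbase.com.google.common</shadedPattern>
  </relocation>
  <relocation>
    <pattern>io.netty</pattern>
    <shadedPattern>org.apache.hadoop.hbase.io.netty</shadedPattern>
  </relocation>
</relocations>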

We (The HBase PMC) will have to make releases of this new artifact and vote
on them. I think it will be a relatively rare event.

I'd be up for doing the first cut if folks are game.

St.Ack


1. URL via Sean but for committers to view only: https://reporeq.apache.org/

On Sun, Oct 2, 2016 at 10:29 PM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> +1 for Sean's ideas. Bundling all the dependent libraries and shading them
> into one jar and HBase referring to it makes sense and should avoid some of
> the pain in terms of IDE usage.

Re: [DISCUSS] More Shading

Posted by ramkrishna vasudevan <ra...@gmail.com>.
+1 for Sean's ideas. Bundling all the dependent libraries and shading them
into one jar and HBase referring to it makes sense and should avoid some of
the pain in terms of IDE usage. Stack's doc clearly talks about the IDE
issues that we may get after this protobuf shading goes in. It may be
difficult for newcomers and those who don't know the background of why it
has to be like that.

Regards
Ram

On Sun, Oct 2, 2016 at 10:51 AM, Stack <st...@duboce.net> wrote:

> On Sat, Oct 1, 2016 at 6:32 PM, Jerry He <je...@gmail.com> wrote:
>
> > How is the proposal going to impact the existing shaded-client and
> > shaded-server modules, making them unnecessary and go away?
> >
>
> No. We still need the blanket shading of hbase client and server.

Re: [DISCUSS] More Shading

Posted by Stack <st...@duboce.net>.
On Sat, Oct 1, 2016 at 6:32 PM, Jerry He <je...@gmail.com> wrote:

> How is the proposal going to impact the existing shaded-client and
> shaded-server modules, making them unnecessary and go away?
>

No. We still need the blanket shading of hbase client and server.

This effort is about our internals. We have a mess of other components all
up inside us such as HDFS, etc., each with its own set of dependencies,
many of which we have in common. This project is about making it so we can
upgrade at a rate independent of when our upstreamers choose to change.


> It doesn't seem so. These modules are supposed to shade HBase and upstream
> from downstream users.
>

Agree.

Thanks for drawing out the difference between these two shading efforts,

St.Ack



> Thanks.
>
> Jerry

Re: [DISCUSS] More Shading

Posted by Anoop John <an...@gmail.com>.
Thanks for the details, Stack. Ya, I feel Sean's idea would give the devs
the cleanest way: the shaded (and possibly patched) 3rd-party libs available
in our related repo. I like it.

The shaded client and server artifacts give a single fat jar, right? That
includes the hbase stuff plus all 3rd parties, shaded. The two can co-exist,
maybe, though their build steps may need changes. If we shade all of the
dependencies into our related repo, this might not be much different from
the shaded client/server stuff then.

Anoop


On Sunday, October 2, 2016, Jerry He <je...@gmail.com> wrote:
> How is the proposal going to impact the existing shaded-client and
> shaded-server modules, making them unnecessary and go away?
> It doesn't seem so. These modules are supposed to shade HBase and
> upstream from downstream users.
> The proposal shades and protects hbase, but upstream dependencies can
> still leak into downstream?
>
> Thanks.
>
> Jerry

Re: [DISCUSS] More Shading

Posted by Jerry He <je...@gmail.com>.
How is the proposal going to impact the existing shaded-client and
shaded-server modules, making them unnecessary and go away?
It doesn't seem so. These modules are supposed to shade HBase and upstream
from downstream users.
The proposal shades and protects hbase, but upstream dependencies can still
leak into downstream?

Thanks.

Jerry

On Sat, Oct 1, 2016 at 2:33 PM, Andrew Purtell <an...@gmail.com>
wrote:

> > Sean has suggested a pre-build step where in another repo we'd make hbase
> > shaded versions of critical libs, 'release' them (votes, etc.) and then
> > have core depend on these. It be a bunch of work but would make the dev's
> > life easier.
>
> So when we make changes that require updates to and rebuild of the
> supporting libraries, as a developer I would make local changes, install a
> snapshot of that into the local maven cache, then point the HBase build at
> the snapshot, then do the other half of the work, then push up to both?
>
> I think this could work.

Re: [DISCUSS] More Shading

Posted by Andrew Purtell <an...@gmail.com>.
> Sean has suggested a pre-build step where in another repo we'd make hbase
> shaded versions of critical libs, 'release' them (votes, etc.) and then
> have core depend on these. It be a bunch of work but would make the dev's
> life easier.

So when we make changes that require updates to and rebuild of the supporting libraries, as a developer I would make local changes, install a snapshot of that into the local maven cache, then point the HBase build at the snapshot, then do the other half of the work, then push up to both? 

I think this could work.
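
Concretely (the coordinates and snapshot version below are invented for the
example), pointing the HBase build at the snapshot would just be a temporary
dependency bump in hbase's pom:

<!-- sketch: after 'mvn install' in the shaded-libs repo, depend on the
     locally installed snapshot while developing the paired hbase change -->
<dependency>
  <groupId>org.apache.hbase.thirdparty</groupId>
  <artifactId>hbase-shaded-3rdparty</artifactId>
  <version>1.0.1-SNAPSHOT</version>
</dependency>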