You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Ismaël Mejía <ie...@gmail.com> on 2019/01/14 10:55:03 UTC

Merge of vendored Guava (Some PRs need a rebase)

We merged today the PR [1] that changes most of the code to use our
new guava vendored dependency. In practice it means that most of the
imports of the classes were changed from `com.google.common.` to
`org.apache.beam.vendor.guava.v20_0.com.google.common.`

This is a great improvement to fix a long existing problem of guava
leaking through some Beam modules. This also reduces the size of most
jars in the project because they don't need to relocate and include
guava anymore, they just use the vendored dependency.

Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others that
worked (are working) to make this possible.

Sadly as a side effect of the merge of this PR multiple PRs were
broken so please review if yours was and do a rebase and fix the
imports to use the new vendored dependency. Sorry for the
inconvenience. From now one all uses of guava should use the vendored
version. Expect some updates in the docs.

[1]  https://github.com/apache/beam/pull/6809

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Gleb Kanterov <gl...@spotify.com>.
Ismaël, I was looking into BEAM-5723, is it possible to relocate both guava
and Cassandra client instead of not relocating Guava in BEAM-6620?

On Tue, Mar 5, 2019 at 11:23 PM Gleb Kanterov <gl...@spotify.com> wrote:

> I agree with the points that Kenneth has raised, mainly:
>
> > In both of the above approaches, diamond dependency problems between IOs
> are quite possible.
>
> Option 1. With having more IO-s in Beam we would start hitting diamond
> dependency problem more often. Relocating dependencies will help, but for
> this, we should avoid exposing relocated classes to end-users of IO-s. I
> can't speak about everything, but in the case of BigtableIO, it only
> exposes proto-classes that aren't part of bigtable-client-core, and
> shouldn't be relocated.
>
> Option 2. Without relocation, every other IO can be potentially broken,
> and we can solve this problem on a case-by-case basis. In maven
> world situation becomes a little better with requireUpperBoundDeps [1] from
> maven-enforcer-plugin. I don't know if there is a similar solution for
> gradle.
>
> Option 3. There is a potential future solution for dependency conflicts
> between IO-s with Java 9 JPMS [2], however, it could take a while before we
> could use it due to compatibility issues.
>
> As a short term solution, option 2 seems the best, we could go through
> known conflicts and see if it's possible to resolve them, potentially
> looking into option 1 that would take much more time.
>
> [1]:
> https://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html
> [2]: https://en.wikipedia.org/wiki/Java_Platform_Module_System
>
>
> On Mon, Mar 4, 2019 at 4:45 PM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> That looks interesting but I am not sure if I understand correctly,
>> isn't the problem that the system API (Bigtable, Cassandra, etc)
>> exposes guava related stuff? Or in other words, wouldn't the
>> transitivie version of guava leak anyway?
>> If it does not I am pretty interested on doing this to fix the
>> Cassandra IO from leaking too.
>> https://issues.apache.org/jira/browse/BEAM-5723
>>
>> On Thu, Feb 28, 2019 at 5:17 PM Kenneth Knowles <ke...@apache.org> wrote:
>> >
>> > If someone is using BigTableIO with bigtable-client-core then having
>> BigTableIO and bigtable-client-core both depend on Guava 26.0 is fine,
>> right? Specifically, a user of BigTableIO after
>> https://github.com/apache/beam/pull/7957 will still have non-vendored
>> Guava on the classpath due to the transitive deps of bigtable-client-core.
>> >
>> > In any case it seems very wrong for the Beam root project to manage the
>> version of Guava in BigTableIO since the whole point is to be compatible
>> with bigtable-client-core. Would it work to delete our pinned Guava version
>> [1] and chase down all the places it breaks, moving Guava dependencies
>> local to places where an IO or extension must use it for interop? Then you
>> don't need adapters.
>> >
>> > In both of the above approaches, diamond dependency problems between
>> IOs are quite possible.
>> >
>> > I don't know if we can do better. For example, producing a
>> bigtable-client-core where we have relocated Guava internally and using
>> that could really be an interop nightmare as things that look like the same
>> type would not be. Less likely to be broken would be bigtable-client-core
>> entirely relocated and vendored, but generally IO connectors exchange
>> objects with users and the users would have to use the relocated versions,
>> so that's gross.
>> >
>> > Kenn
>> >
>> > [1]
>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353
>> >
>> >
>> > On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov <gl...@spotify.com> wrote:
>> >>
>> >> For the past week, two independent people have asked me if I can help
>> with guava NoSuchMethodError in BigtableIO. It turns out we still have a
>> potential problem with dependencies that don't vendor guava, in this case,
>> it was bigtable-client-core that depends on guava-26.0. However, the root
>> cause of the classpath problem was in the usage of a deprecated method from
>> non-vendored guava in BigtableServiceClientImpl in the code path where we
>> integrate with bigtable client.
>> >>
>> >> I created apache/beam#7957 [1] to fix that. There few other IO-s where
>> we use non-vendored guava that we can fix using adapters.
>> >>
>> >> And there is an unknown number of conflicts between guava versions in
>> our dependencies that don't vendor it, that as I understand it, could be
>> fixed by relocating them, in a similar way we do for Calcite [2].
>> >>
>> >> [1]: https://github.com/apache/beam/pull/7957
>> >> [2]:
>> https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39
>> >>
>> >> Gleb
>> >>
>> >> On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <gl...@spotify.com>
>> wrote:
>> >>>
>> >>> I didn't look deep into it, but it seems we can put
>> .idea/codeInsightSettings.xml into our repository where we blacklist
>> packages from auto-import. See an example in
>> JetBrains/kotlin/.idea/codeInsightSettings.xml.
>> >>>
>> >>> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com> wrote:
>> >>>>
>> >>>> Bad IDEs automatically generate the wrong import. I think we need to
>> automatically prevent this, otherwise the bad imports will inevitably slip
>> back in.
>> >>>>
>> >>>> Reuven
>> >>>>
>> >>>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <
>> lukasz.gajowy@gmail.com> wrote:
>> >>>>>
>> >>>>> Great news. Thanks all for this work!
>> >>>>>
>> >>>>> +1 to enforcing this on dependency level as Kenn suggested.
>> >>>>>
>> >>>>> Łukasz
>> >>>>>
>> >>>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org>
>> napisał(a):
>> >>>>>>
>> >>>>>> We can enforce at the dependency level, since it is a compile
>> error. I think some IDEs and build tools may allow the compile-time
>> classpath to get polluted by transitive runtime deps, so protecting against
>> bad imports is also a good idea.
>> >>>>>>
>> >>>>>> Kenn
>> >>>>>>
>> >>>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com>
>> wrote:
>> >>>>>>>
>> >>>>>>> Not yet, we need to add that too, there are still some tasks to be
>> >>>>>>> done like improve the contribution guide with this info, and
>> document
>> >>>>>>> how to  generate a src build artifact locally since I doubt we can
>> >>>>>>> publish that into Apache for copyright reasons.
>> >>>>>>> I will message in the future for awareness for awareness when
>> most of
>> >>>>>>> the pending tasks are finished.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <
>> mxm@apache.org> wrote:
>> >>>>>>> >
>> >>>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored
>> Guava version is used
>> >>>>>>> > everywhere now.
>> >>>>>>> >
>> >>>>>>> > Do we already have a Checkstyle rule that prevents people from
>> using the
>> >>>>>>> > unvendored Guava? If not, such a rule could be useful because
>> it's almost
>> >>>>>>> > inevitable that the unvedored Guava will slip back in.
>> >>>>>>> >
>> >>>>>>> > Cheers,
>> >>>>>>> > Max
>> >>>>>>> >
>> >>>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>> >>>>>>> > > We merged today the PR [1] that changes most of the code to
>> use our
>> >>>>>>> > > new guava vendored dependency. In practice it means that most
>> of the
>> >>>>>>> > > imports of the classes were changed from `com.google.common.`
>> to
>> >>>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>> >>>>>>> > >
>> >>>>>>> > > This is a great improvement to fix a long existing problem of
>> guava
>> >>>>>>> > > leaking through some Beam modules. This also reduces the size
>> of most
>> >>>>>>> > > jars in the project because they don't need to relocate and
>> include
>> >>>>>>> > > guava anymore, they just use the vendored dependency.
>> >>>>>>> > >
>> >>>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the
>> others that
>> >>>>>>> > > worked (are working) to make this possible.
>> >>>>>>> > >
>> >>>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs
>> were
>> >>>>>>> > > broken so please review if yours was and do a rebase and fix
>> the
>> >>>>>>> > > imports to use the new vendored dependency. Sorry for the
>> >>>>>>> > > inconvenience. From now one all uses of guava should use the
>> vendored
>> >>>>>>> > > version. Expect some updates in the docs.
>> >>>>>>> > >
>> >>>>>>> > > [1]  https://github.com/apache/beam/pull/6809
>> >>>>>>> > >
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Cheers,
>> >>> Gleb
>> >>
>> >>
>> >>
>> >> --
>> >> Cheers,
>> >> Gleb
>>
>
>
> --
> Cheers,
> Gleb
>


-- 
Cheers,
Gleb

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Michael Luckey <ad...@gmail.com>.
Honestly, I dod not even consider Runner and Beam FileSystem dependencies
here, which will most likely also add some complexity to the problem here.
And there might be more.

On Thu, Mar 14, 2019 at 8:53 PM Kenneth Knowles <ke...@apache.org> wrote:

> Thanks for the super-detailed explanation and illustration.
>
> I especially think you emphasized something we generally have not
> considered: the end-user pinning a different version of a client lib, and
> the fact that if we vendor the client lib they cannot change the version.
> To me, that makes vendoring client libs kind of a deal-breaker. So I think:
>
> a) yes, solved by vendored (already done)
> b) leaving the IO's version of Guava on the classpath seems the only
> solution, and end-user has to resolve
> c) good point that in the not-on-API-surface case also, we can't really
> tweak anything and we need the end-user to resolve
>
> So the main action item is to remove the top-level Guava version and to
> let IOs determine their Guava version, with resolution of diamond deps when
> needed in test configurations.
>
> Kenn
>
> On Wed, Mar 6, 2019 at 3:18 AM Michael Luckey <ad...@gmail.com> wrote:
>
>> Unfortunately, I am unable to see, how option 2 could be done from beam
>> side? Or is it meant to delay to the end user?
>>
>> Let me try to elaborate:
>>
>> Assume user implements a pipeline with 2 different IOs, both depending on
>> incompatible versions of e.g. guava. Also, user code has a (direct or
>> transitive) dependency to
>> yet another incompatible version of guava. I ll try to get into more
>> details later, but for now assume
>>
>>      AIO   ----     ParDo    ----    ParDo   ----     Pardo   ----    BIO
>>        |                                          |
>>                     |
>>     guava-X.jar                       guava-Y.jar
>>  guava-Z.jar
>>
>> What currently happens (AFAIU) is during build we do pin the versions of
>> guava to some fixed version [1], which means AIO and BIO are compiled with
>> guava-20, which might or might not
>> be ABI compatible to guava-X and/or guava-Z. (This is the problem these 2
>> users encountered, right?)
>>
>> Now the user trying to run this pipeline has to decide, which version to
>> use. Whatever she chooses the might be incompatibilities, She could do so
>> with maven, and of course also with Gradle or any other build system
>> (hopefully).
>>
>> If we now replace guava.20 with the newest on build, lets say
>> guava-LATEST which happens to be a dependency of CIO, how would that change
>> the game. So is there anything beam could do about it?
>>
>> Now I try to understand those deps in more detail. We have those IOs, and
>> the user code thingy. I assume, beam code switched to vendors guava, so
>> nothing is exposed here, we only have issues on external interfaces as IOs.
>>
>> 1. The user code.
>> Whether guava is used directly or transitive by some used library
>> dependency, there is nothing beam can do about it. The user has to solve
>> this issue by herself. She will probably encounter exactly the same
>> problems as beam within the IOs,
>>
>> 2. IOs
>>
>> a: Beam IO code uses guava, no transitive deps. This is solved by
>> vendoring, right?
>>
>>    SomeIO <--- guava-20.jar           is replaced by
>>  SomeIO <--- vendored_guava_v20.ja
>>
>> b: Beam uses some client lib (e.g. some-client-core) which has a
>> dependency on guava but does not expose any guava class to the outside
>>
>>    SomeIO <---- some-client-lib <---- guava-X.jar
>>
>>  In this case beam 'could' relocate the client lib and 'solve' that guava
>> exposing on the class path, *but* we might be tight to a specific
>> version of the backend. as the user can not replace the 'some-client-core'
>> dependency with a different version (hoping there are no interop issues
>> with beams IO)
>> I guess, we are better off, not doing that? Which unfortuantely means
>> delaying version resolution to the end user, as transitive dependency will
>> end up on the end user runtime classpath and might cause diamond
>> dependency issues.
>>
>> c: Beam uses some client lib (e.g. bigquery-client-core) which has a
>> dependency on guava which does expose guava classes in its external API
>>
>>    SomeIO <---- some-client-lib <---- guava-X.jar
>>       |                                                      |
>>       -----------------------------------------------
>>
>> We could of course also try to shade here with above drawbacks, but this
>> might not even work as the client lib communicates with the backend and
>> might also expose guava there. So the best we could do is, as Kenn stated,
>> to ensure interoperability SomeIO with the client.
>>
>> So to my current understanding I second Kenn here. We should stop forcing
>> a specific version of (of course uncensored) guava and align on IO basis to
>> the requirements of dependencies. Still, on that lower (IO) level we might
>> need to force some version as already there we might have diamond deps on
>> guava. But I think we are better of not to force a beam global version here.
>>
>> Of course, that will force the user to handle the problem, but she needs
>> to do it anyway, and, unfortunately me also fails to see anything we could
>> do better here. (At least current forcing to v20 seems to worsen the
>> problem).
>>
>> [1]
>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L351
>>
>> On Tue, Mar 5, 2019 at 11:23 PM Gleb Kanterov <gl...@spotify.com> wrote:
>>
>>> I agree with the points that Kenneth has raised, mainly:
>>>
>>> > In both of the above approaches, diamond dependency problems between
>>> IOs are quite possible.
>>>
>>> Option 1. With having more IO-s in Beam we would start hitting diamond
>>> dependency problem more often. Relocating dependencies will help, but for
>>> this, we should avoid exposing relocated classes to end-users of IO-s. I
>>> can't speak about everything, but in the case of BigtableIO, it only
>>> exposes proto-classes that aren't part of bigtable-client-core, and
>>> shouldn't be relocated.
>>>
>>> Option 2. Without relocation, every other IO can be potentially broken,
>>> and we can solve this problem on a case-by-case basis. In maven
>>> world situation becomes a little better with requireUpperBoundDeps [1] from
>>> maven-enforcer-plugin. I don't know if there is a similar solution for
>>> gradle.
>>>
>>> Option 3. There is a potential future solution for dependency conflicts
>>> between IO-s with Java 9 JPMS [2], however, it could take a while before we
>>> could use it due to compatibility issues.
>>>
>>> As a short term solution, option 2 seems the best, we could go through
>>> known conflicts and see if it's possible to resolve them, potentially
>>> looking into option 1 that would take much more time.
>>>
>>> [1]:
>>> https://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html
>>> [2]: https://en.wikipedia.org/wiki/Java_Platform_Module_System
>>>
>>>
>>> On Mon, Mar 4, 2019 at 4:45 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> That looks interesting but I am not sure if I understand correctly,
>>>> isn't the problem that the system API (Bigtable, Cassandra, etc)
>>>> exposes guava related stuff? Or in other words, wouldn't the
>>>> transitivie version of guava leak anyway?
>>>> If it does not I am pretty interested on doing this to fix the
>>>> Cassandra IO from leaking too.
>>>> https://issues.apache.org/jira/browse/BEAM-5723
>>>>
>>>> On Thu, Feb 28, 2019 at 5:17 PM Kenneth Knowles <ke...@apache.org>
>>>> wrote:
>>>> >
>>>> > If someone is using BigTableIO with bigtable-client-core then having
>>>> BigTableIO and bigtable-client-core both depend on Guava 26.0 is fine,
>>>> right? Specifically, a user of BigTableIO after
>>>> https://github.com/apache/beam/pull/7957 will still have non-vendored
>>>> Guava on the classpath due to the transitive deps of bigtable-client-core.
>>>> >
>>>> > In any case it seems very wrong for the Beam root project to manage
>>>> the version of Guava in BigTableIO since the whole point is to be
>>>> compatible with bigtable-client-core. Would it work to delete our pinned
>>>> Guava version [1] and chase down all the places it breaks, moving Guava
>>>> dependencies local to places where an IO or extension must use it for
>>>> interop? Then you don't need adapters.
>>>> >
>>>> > In both of the above approaches, diamond dependency problems between
>>>> IOs are quite possible.
>>>> >
>>>> > I don't know if we can do better. For example, producing a
>>>> bigtable-client-core where we have relocated Guava internally and using
>>>> that could really be an interop nightmare as things that look like the same
>>>> type would not be. Less likely to be broken would be bigtable-client-core
>>>> entirely relocated and vendored, but generally IO connectors exchange
>>>> objects with users and the users would have to use the relocated versions,
>>>> so that's gross.
>>>> >
>>>> > Kenn
>>>> >
>>>> > [1]
>>>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353
>>>> >
>>>> >
>>>> > On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov <gl...@spotify.com>
>>>> wrote:
>>>> >>
>>>> >> For the past week, two independent people have asked me if I can
>>>> help with guava NoSuchMethodError in BigtableIO. It turns out we still have
>>>> a potential problem with dependencies that don't vendor guava, in this
>>>> case, it was bigtable-client-core that depends on guava-26.0. However, the
>>>> root cause of the classpath problem was in the usage of a deprecated method
>>>> from non-vendored guava in BigtableServiceClientImpl in the code path where
>>>> we integrate with bigtable client.
>>>> >>
>>>> >> I created apache/beam#7957 [1] to fix that. There few other IO-s
>>>> where we use non-vendored guava that we can fix using adapters.
>>>> >>
>>>> >> And there is an unknown number of conflicts between guava versions
>>>> in our dependencies that don't vendor it, that as I understand it, could be
>>>> fixed by relocating them, in a similar way we do for Calcite [2].
>>>> >>
>>>> >> [1]: https://github.com/apache/beam/pull/7957
>>>> >> [2]:
>>>> https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39
>>>> >>
>>>> >> Gleb
>>>> >>
>>>> >> On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <gl...@spotify.com>
>>>> wrote:
>>>> >>>
>>>> >>> I didn't look deep into it, but it seems we can put
>>>> .idea/codeInsightSettings.xml into our repository where we blacklist
>>>> packages from auto-import. See an example in
>>>> JetBrains/kotlin/.idea/codeInsightSettings.xml.
>>>> >>>
>>>> >>> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com>
>>>> wrote:
>>>> >>>>
>>>> >>>> Bad IDEs automatically generate the wrong import. I think we need
>>>> to automatically prevent this, otherwise the bad imports will inevitably
>>>> slip back in.
>>>> >>>>
>>>> >>>> Reuven
>>>> >>>>
>>>> >>>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <
>>>> lukasz.gajowy@gmail.com> wrote:
>>>> >>>>>
>>>> >>>>> Great news. Thanks all for this work!
>>>> >>>>>
>>>> >>>>> +1 to enforcing this on dependency level as Kenn suggested.
>>>> >>>>>
>>>> >>>>> Łukasz
>>>> >>>>>
>>>> >>>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org>
>>>> napisał(a):
>>>> >>>>>>
>>>> >>>>>> We can enforce at the dependency level, since it is a compile
>>>> error. I think some IDEs and build tools may allow the compile-time
>>>> classpath to get polluted by transitive runtime deps, so protecting against
>>>> bad imports is also a good idea.
>>>> >>>>>>
>>>> >>>>>> Kenn
>>>> >>>>>>
>>>> >>>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com>
>>>> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Not yet, we need to add that too, there are still some tasks to
>>>> be
>>>> >>>>>>> done like improve the contribution guide with this info, and
>>>> document
>>>> >>>>>>> how to  generate a src build artifact locally since I doubt we
>>>> can
>>>> >>>>>>> publish that into Apache for copyright reasons.
>>>> >>>>>>> I will message in the future for awareness for awareness when
>>>> most of
>>>> >>>>>>> the pending tasks are finished.
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <
>>>> mxm@apache.org> wrote:
>>>> >>>>>>> >
>>>> >>>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored
>>>> Guava version is used
>>>> >>>>>>> > everywhere now.
>>>> >>>>>>> >
>>>> >>>>>>> > Do we already have a Checkstyle rule that prevents people
>>>> from using the
>>>> >>>>>>> > unvendored Guava? If not, such a rule could be useful because
>>>> it's almost
>>>> >>>>>>> > inevitable that the unvedored Guava will slip back in.
>>>> >>>>>>> >
>>>> >>>>>>> > Cheers,
>>>> >>>>>>> > Max
>>>> >>>>>>> >
>>>> >>>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>>>> >>>>>>> > > We merged today the PR [1] that changes most of the code to
>>>> use our
>>>> >>>>>>> > > new guava vendored dependency. In practice it means that
>>>> most of the
>>>> >>>>>>> > > imports of the classes were changed from
>>>> `com.google.common.` to
>>>> >>>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>>>> >>>>>>> > >
>>>> >>>>>>> > > This is a great improvement to fix a long existing problem
>>>> of guava
>>>> >>>>>>> > > leaking through some Beam modules. This also reduces the
>>>> size of most
>>>> >>>>>>> > > jars in the project because they don't need to relocate and
>>>> include
>>>> >>>>>>> > > guava anymore, they just use the vendored dependency.
>>>> >>>>>>> > >
>>>> >>>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the
>>>> others that
>>>> >>>>>>> > > worked (are working) to make this possible.
>>>> >>>>>>> > >
>>>> >>>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs
>>>> were
>>>> >>>>>>> > > broken so please review if yours was and do a rebase and
>>>> fix the
>>>> >>>>>>> > > imports to use the new vendored dependency. Sorry for the
>>>> >>>>>>> > > inconvenience. From now one all uses of guava should use
>>>> the vendored
>>>> >>>>>>> > > version. Expect some updates in the docs.
>>>> >>>>>>> > >
>>>> >>>>>>> > > [1]  https://github.com/apache/beam/pull/6809
>>>> >>>>>>> > >
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> Cheers,
>>>> >>> Gleb
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Cheers,
>>>> >> Gleb
>>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Gleb
>>>
>>

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Kenneth Knowles <ke...@apache.org>.
Thanks for the super-detailed explanation and illustration.

I especially think you emphasized something we generally have not
considered: the end-user pinning a different version of a client lib, and
the fact that if we vendor the client lib they cannot change the version.
To me, that makes vendoring client libs kind of a deal-breaker. So I think:

a) yes, solved by vendored (already done)
b) leaving the IO's version of Guava on the classpath seems the only
solution, and end-user has to resolve
c) good point that in the not-on-API-surface case also, we can't really
tweak anything and we need the end-user to resolve

So the main action item is to remove the top-level Guava version and to let
IOs determine their Guava version, with resolution of diamond deps when
needed in test configurations.

Kenn

On Wed, Mar 6, 2019 at 3:18 AM Michael Luckey <ad...@gmail.com> wrote:

> Unfortunately, I am unable to see, how option 2 could be done from beam
> side? Or is it meant to delay to the end user?
>
> Let me try to elaborate:
>
> Assume user implements a pipeline with 2 different IOs, both depending on
> incompatible versions of e.g. guava. Also, user code has a (direct or
> transitive) dependency to
> yet another incompatible version of guava. I ll try to get into more
> details later, but for now assume
>
>      AIO   ----     ParDo    ----    ParDo   ----     Pardo   ----    BIO
>        |                                          |
>                     |
>     guava-X.jar                       guava-Y.jar
>  guava-Z.jar
>
> What currently happens (AFAIU) is during build we do pin the versions of
> guava to some fixed version [1], which means AIO and BIO are compiled with
> guava-20, which might or might not
> be ABI compatible to guava-X and/or guava-Z. (This is the problem these 2
> users encountered, right?)
>
> Now the user trying to run this pipeline has to decide, which version to
> use. Whatever she chooses the might be incompatibilities, She could do so
> with maven, and of course also with Gradle or any other build system
> (hopefully).
>
> If we now replace guava.20 with the newest on build, lets say guava-LATEST
> which happens to be a dependency of CIO, how would that change the game. So
> is there anything beam could do about it?
>
> Now I try to understand those deps in more detail. We have those IOs, and
> the user code thingy. I assume, beam code switched to vendors guava, so
> nothing is exposed here, we only have issues on external interfaces as IOs.
>
> 1. The user code.
> Whether guava is used directly or transitive by some used library
> dependency, there is nothing beam can do about it. The user has to solve
> this issue by herself. She will probably encounter exactly the same
> problems as beam within the IOs,
>
> 2. IOs
>
> a: Beam IO code uses guava, no transitive deps. This is solved by
> vendoring, right?
>
>    SomeIO <--- guava-20.jar           is replaced by
>  SomeIO <--- vendored_guava_v20.ja
>
> b: Beam uses some client lib (e.g. some-client-core) which has a
> dependency on guava but does not expose any guava class to the outside
>
>    SomeIO <---- some-client-lib <---- guava-X.jar
>
>  In this case beam 'could' relocate the client lib and 'solve' that guava
> exposing on the class path, *but* we might be tight to a specific version
> of the backend. as the user can not replace the 'some-client-core'
> dependency with a different version (hoping there are no interop issues
> with beams IO)
> I guess, we are better off, not doing that? Which unfortuantely means
> delaying version resolution to the end user, as transitive dependency will
> end up on the end user runtime classpath and might cause diamond
> dependency issues.
>
> c: Beam uses some client lib (e.g. bigquery-client-core) which has a
> dependency on guava which does expose guava classes in its external API
>
>    SomeIO <---- some-client-lib <---- guava-X.jar
>       |                                                      |
>       -----------------------------------------------
>
> We could of course also try to shade here with above drawbacks, but this
> might not even work as the client lib communicates with the backend and
> might also expose guava there. So the best we could do is, as Kenn stated,
> to ensure interoperability SomeIO with the client.
>
> So to my current understanding I second Kenn here. We should stop forcing
> a specific version of (of course uncensored) guava and align on IO basis to
> the requirements of dependencies. Still, on that lower (IO) level we might
> need to force some version as already there we might have diamond deps on
> guava. But I think we are better of not to force a beam global version here.
>
> Of course, that will force the user to handle the problem, but she needs
> to do it anyway, and, unfortunately me also fails to see anything we could
> do better here. (At least current forcing to v20 seems to worsen the
> problem).
>
> [1]
> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L351
>
> On Tue, Mar 5, 2019 at 11:23 PM Gleb Kanterov <gl...@spotify.com> wrote:
>
>> I agree with the points that Kenneth has raised, mainly:
>>
>> > In both of the above approaches, diamond dependency problems between
>> IOs are quite possible.
>>
>> Option 1. With having more IO-s in Beam we would start hitting diamond
>> dependency problem more often. Relocating dependencies will help, but for
>> this, we should avoid exposing relocated classes to end-users of IO-s. I
>> can't speak about everything, but in the case of BigtableIO, it only
>> exposes proto-classes that aren't part of bigtable-client-core, and
>> shouldn't be relocated.
>>
>> Option 2. Without relocation, every other IO can be potentially broken,
>> and we can solve this problem on a case-by-case basis. In maven
>> world situation becomes a little better with requireUpperBoundDeps [1] from
>> maven-enforcer-plugin. I don't know if there is a similar solution for
>> gradle.
>>
>> Option 3. There is a potential future solution for dependency conflicts
>> between IO-s with Java 9 JPMS [2], however, it could take a while before we
>> could use it due to compatibility issues.
>>
>> As a short term solution, option 2 seems the best, we could go through
>> known conflicts and see if it's possible to resolve them, potentially
>> looking into option 1 that would take much more time.
>>
>> [1]:
>> https://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html
>> [2]: https://en.wikipedia.org/wiki/Java_Platform_Module_System
>>
>>
>> On Mon, Mar 4, 2019 at 4:45 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> That looks interesting but I am not sure if I understand correctly,
>>> isn't the problem that the system API (Bigtable, Cassandra, etc)
>>> exposes guava related stuff? Or in other words, wouldn't the
>>> transitivie version of guava leak anyway?
>>> If it does not I am pretty interested on doing this to fix the
>>> Cassandra IO from leaking too.
>>> https://issues.apache.org/jira/browse/BEAM-5723
>>>
>>> On Thu, Feb 28, 2019 at 5:17 PM Kenneth Knowles <ke...@apache.org> wrote:
>>> >
>>> > If someone is using BigTableIO with bigtable-client-core then having
>>> BigTableIO and bigtable-client-core both depend on Guava 26.0 is fine,
>>> right? Specifically, a user of BigTableIO after
>>> https://github.com/apache/beam/pull/7957 will still have non-vendored
>>> Guava on the classpath due to the transitive deps of bigtable-client-core.
>>> >
>>> > In any case it seems very wrong for the Beam root project to manage
>>> the version of Guava in BigTableIO since the whole point is to be
>>> compatible with bigtable-client-core. Would it work to delete our pinned
>>> Guava version [1] and chase down all the places it breaks, moving Guava
>>> dependencies local to places where an IO or extension must use it for
>>> interop? Then you don't need adapters.
>>> >
>>> > In both of the above approaches, diamond dependency problems between
>>> IOs are quite possible.
>>> >
>>> > I don't know if we can do better. For example, producing a
>>> bigtable-client-core where we have relocated Guava internally and using
>>> that could really be an interop nightmare as things that look like the same
>>> type would not be. Less likely to be broken would be bigtable-client-core
>>> entirely relocated and vendored, but generally IO connectors exchange
>>> objects with users and the users would have to use the relocated versions,
>>> so that's gross.
>>> >
>>> > Kenn
>>> >
>>> > [1]
>>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353
>>> >
>>> >
>>> > On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov <gl...@spotify.com>
>>> wrote:
>>> >>
>>> >> For the past week, two independent people have asked me if I can help
>>> with guava NoSuchMethodError in BigtableIO. It turns out we still have a
>>> potential problem with dependencies that don't vendor guava, in this case,
>>> it was bigtable-client-core that depends on guava-26.0. However, the root
>>> cause of the classpath problem was in the usage of a deprecated method from
>>> non-vendored guava in BigtableServiceClientImpl in the code path where we
>>> integrate with bigtable client.
>>> >>
>>> >> I created apache/beam#7957 [1] to fix that. There few other IO-s
>>> where we use non-vendored guava that we can fix using adapters.
>>> >>
>>> >> And there is an unknown number of conflicts between guava versions in
>>> our dependencies that don't vendor it, that as I understand it, could be
>>> fixed by relocating them, in a similar way we do for Calcite [2].
>>> >>
>>> >> [1]: https://github.com/apache/beam/pull/7957
>>> >> [2]:
>>> https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39
>>> >>
>>> >> Gleb
>>> >>
>>> >> On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <gl...@spotify.com>
>>> wrote:
>>> >>>
>>> >>> I didn't look deep into it, but it seems we can put
>>> .idea/codeInsightSettings.xml into our repository where we blacklist
>>> packages from auto-import. See an example in
>>> JetBrains/kotlin/.idea/codeInsightSettings.xml.
>>> >>>
>>> >>> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com> wrote:
>>> >>>>
>>> >>>> Bad IDEs automatically generate the wrong import. I think we need
>>> to automatically prevent this, otherwise the bad imports will inevitably
>>> slip back in.
>>> >>>>
>>> >>>> Reuven
>>> >>>>
>>> >>>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <
>>> lukasz.gajowy@gmail.com> wrote:
>>> >>>>>
>>> >>>>> Great news. Thanks all for this work!
>>> >>>>>
>>> >>>>> +1 to enforcing this on dependency level as Kenn suggested.
>>> >>>>>
>>> >>>>> Łukasz
>>> >>>>>
>>> >>>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org>
>>> napisał(a):
>>> >>>>>>
>>> >>>>>> We can enforce at the dependency level, since it is a compile
>>> error. I think some IDEs and build tools may allow the compile-time
>>> classpath to get polluted by transitive runtime deps, so protecting against
>>> bad imports is also a good idea.
>>> >>>>>>
>>> >>>>>> Kenn
>>> >>>>>>
>>> >>>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com>
>>> wrote:
>>> >>>>>>>
>>> >>>>>>> Not yet, we need to add that too, there are still some tasks to
>>> be
>>> >>>>>>> done like improve the contribution guide with this info, and
>>> document
>>> >>>>>>> how to  generate a src build artifact locally since I doubt we
>>> can
>>> >>>>>>> publish that into Apache for copyright reasons.
>>> >>>>>>> I will message in the future for awareness for awareness when
>>> most of
>>> >>>>>>> the pending tasks are finished.
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <
>>> mxm@apache.org> wrote:
>>> >>>>>>> >
>>> >>>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored
>>> Guava version is used
>>> >>>>>>> > everywhere now.
>>> >>>>>>> >
>>> >>>>>>> > Do we already have a Checkstyle rule that prevents people from
>>> using the
>>> >>>>>>> > unvendored Guava? If not, such a rule could be useful because
>>> it's almost
>>> >>>>>>> > inevitable that the unvedored Guava will slip back in.
>>> >>>>>>> >
>>> >>>>>>> > Cheers,
>>> >>>>>>> > Max
>>> >>>>>>> >
>>> >>>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>>> >>>>>>> > > We merged today the PR [1] that changes most of the code to
>>> use our
>>> >>>>>>> > > new guava vendored dependency. In practice it means that
>>> most of the
>>> >>>>>>> > > imports of the classes were changed from
>>> `com.google.common.` to
>>> >>>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>>> >>>>>>> > >
>>> >>>>>>> > > This is a great improvement to fix a long existing problem
>>> of guava
>>> >>>>>>> > > leaking through some Beam modules. This also reduces the
>>> size of most
>>> >>>>>>> > > jars in the project because they don't need to relocate and
>>> include
>>> >>>>>>> > > guava anymore, they just use the vendored dependency.
>>> >>>>>>> > >
>>> >>>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the
>>> others that
>>> >>>>>>> > > worked (are working) to make this possible.
>>> >>>>>>> > >
>>> >>>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs
>>> were
>>> >>>>>>> > > broken so please review if yours was and do a rebase and fix
>>> the
>>> >>>>>>> > > imports to use the new vendored dependency. Sorry for the
>>> >>>>>>> > > inconvenience. From now one all uses of guava should use the
>>> vendored
>>> >>>>>>> > > version. Expect some updates in the docs.
>>> >>>>>>> > >
>>> >>>>>>> > > [1]  https://github.com/apache/beam/pull/6809
>>> >>>>>>> > >
>>> >>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Cheers,
>>> >>> Gleb
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Cheers,
>>> >> Gleb
>>>
>>
>>
>> --
>> Cheers,
>> Gleb
>>
>

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Michael Luckey <ad...@gmail.com>.
Unfortunately, I am unable to see, how option 2 could be done from beam
side? Or is it meant to delay to the end user?

Let me try to elaborate:

Assume user implements a pipeline with 2 different IOs, both depending on
incompatible versions of e.g. guava. Also, user code has a (direct or
transitive) dependency to
yet another incompatible version of guava. I ll try to get into more
details later, but for now assume

     AIO   ----     ParDo    ----    ParDo   ----     Pardo   ----    BIO
       |                                          |
                  |
    guava-X.jar                       guava-Y.jar
 guava-Z.jar

What currently happens (AFAIU) is during build we do pin the versions of
guava to some fixed version [1], which means AIO and BIO are compiled with
guava-20, which might or might not
be ABI compatible to guava-X and/or guava-Z. (This is the problem these 2
users encountered, right?)

Now the user trying to run this pipeline has to decide, which version to
use. Whatever she chooses the might be incompatibilities, She could do so
with maven, and of course also with Gradle or any other build system
(hopefully).

If we now replace guava.20 with the newest on build, lets say guava-LATEST
which happens to be a dependency of CIO, how would that change the game. So
is there anything beam could do about it?

Now I try to understand those deps in more detail. We have those IOs, and
the user code thingy. I assume, beam code switched to vendors guava, so
nothing is exposed here, we only have issues on external interfaces as IOs.

1. The user code.
Whether guava is used directly or transitive by some used library
dependency, there is nothing beam can do about it. The user has to solve
this issue by herself. She will probably encounter exactly the same
problems as beam within the IOs,

2. IOs

a: Beam IO code uses guava, no transitive deps. This is solved by
vendoring, right?

   SomeIO <--- guava-20.jar           is replaced by
 SomeIO <--- vendored_guava_v20.ja

b: Beam uses some client lib (e.g. some-client-core) which has a dependency
on guava but does not expose any guava class to the outside

   SomeIO <---- some-client-lib <---- guava-X.jar

 In this case beam 'could' relocate the client lib and 'solve' that guava
exposing on the class path, *but* we might be tight to a specific version
of the backend. as the user can not replace the 'some-client-core'
dependency with a different version (hoping there are no interop issues
with beams IO)
I guess, we are better off, not doing that? Which unfortuantely means
delaying version resolution to the end user, as transitive dependency will
end up on the end user runtime classpath and might cause diamond dependency
issues.

c: Beam uses some client lib (e.g. bigquery-client-core) which has a
dependency on guava which does expose guava classes in its external API

   SomeIO <---- some-client-lib <---- guava-X.jar
      |                                                      |
      -----------------------------------------------

We could of course also try to shade here with above drawbacks, but this
might not even work as the client lib communicates with the backend and
might also expose guava there. So the best we could do is, as Kenn stated,
to ensure interoperability SomeIO with the client.

So to my current understanding I second Kenn here. We should stop forcing
a specific version of (of course uncensored) guava and align on IO basis to
the requirements of dependencies. Still, on that lower (IO) level we might
need to force some version as already there we might have diamond deps on
guava. But I think we are better of not to force a beam global version here.

Of course, that will force the user to handle the problem, but she needs to
do it anyway, and, unfortunately me also fails to see anything we could do
better here. (At least current forcing to v20 seems to worsen the problem).

[1]
https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L351

On Tue, Mar 5, 2019 at 11:23 PM Gleb Kanterov <gl...@spotify.com> wrote:

> I agree with the points that Kenneth has raised, mainly:
>
> > In both of the above approaches, diamond dependency problems between IOs
> are quite possible.
>
> Option 1. With having more IO-s in Beam we would start hitting diamond
> dependency problem more often. Relocating dependencies will help, but for
> this, we should avoid exposing relocated classes to end-users of IO-s. I
> can't speak about everything, but in the case of BigtableIO, it only
> exposes proto-classes that aren't part of bigtable-client-core, and
> shouldn't be relocated.
>
> Option 2. Without relocation, every other IO can be potentially broken,
> and we can solve this problem on a case-by-case basis. In maven
> world situation becomes a little better with requireUpperBoundDeps [1] from
> maven-enforcer-plugin. I don't know if there is a similar solution for
> gradle.
>
> Option 3. There is a potential future solution for dependency conflicts
> between IO-s with Java 9 JPMS [2], however, it could take a while before we
> could use it due to compatibility issues.
>
> As a short term solution, option 2 seems the best, we could go through
> known conflicts and see if it's possible to resolve them, potentially
> looking into option 1 that would take much more time.
>
> [1]:
> https://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html
> [2]: https://en.wikipedia.org/wiki/Java_Platform_Module_System
>
>
> On Mon, Mar 4, 2019 at 4:45 PM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> That looks interesting but I am not sure if I understand correctly,
>> isn't the problem that the system API (Bigtable, Cassandra, etc)
>> exposes guava related stuff? Or in other words, wouldn't the
>> transitivie version of guava leak anyway?
>> If it does not I am pretty interested on doing this to fix the
>> Cassandra IO from leaking too.
>> https://issues.apache.org/jira/browse/BEAM-5723
>>
>> On Thu, Feb 28, 2019 at 5:17 PM Kenneth Knowles <ke...@apache.org> wrote:
>> >
>> > If someone is using BigTableIO with bigtable-client-core then having
>> BigTableIO and bigtable-client-core both depend on Guava 26.0 is fine,
>> right? Specifically, a user of BigTableIO after
>> https://github.com/apache/beam/pull/7957 will still have non-vendored
>> Guava on the classpath due to the transitive deps of bigtable-client-core.
>> >
>> > In any case it seems very wrong for the Beam root project to manage the
>> version of Guava in BigTableIO since the whole point is to be compatible
>> with bigtable-client-core. Would it work to delete our pinned Guava version
>> [1] and chase down all the places it breaks, moving Guava dependencies
>> local to places where an IO or extension must use it for interop? Then you
>> don't need adapters.
>> >
>> > In both of the above approaches, diamond dependency problems between
>> IOs are quite possible.
>> >
>> > I don't know if we can do better. For example, producing a
>> bigtable-client-core where we have relocated Guava internally and using
>> that could really be an interop nightmare as things that look like the same
>> type would not be. Less likely to be broken would be bigtable-client-core
>> entirely relocated and vendored, but generally IO connectors exchange
>> objects with users and the users would have to use the relocated versions,
>> so that's gross.
>> >
>> > Kenn
>> >
>> > [1]
>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353
>> >
>> >
>> > On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov <gl...@spotify.com> wrote:
>> >>
>> >> For the past week, two independent people have asked me if I can help
>> with guava NoSuchMethodError in BigtableIO. It turns out we still have a
>> potential problem with dependencies that don't vendor guava, in this case,
>> it was bigtable-client-core that depends on guava-26.0. However, the root
>> cause of the classpath problem was in the usage of a deprecated method from
>> non-vendored guava in BigtableServiceClientImpl in the code path where we
>> integrate with bigtable client.
>> >>
>> >> I created apache/beam#7957 [1] to fix that. There few other IO-s where
>> we use non-vendored guava that we can fix using adapters.
>> >>
>> >> And there is an unknown number of conflicts between guava versions in
>> our dependencies that don't vendor it, that as I understand it, could be
>> fixed by relocating them, in a similar way we do for Calcite [2].
>> >>
>> >> [1]: https://github.com/apache/beam/pull/7957
>> >> [2]:
>> https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39
>> >>
>> >> Gleb
>> >>
>> >> On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <gl...@spotify.com>
>> wrote:
>> >>>
>> >>> I didn't look deep into it, but it seems we can put
>> .idea/codeInsightSettings.xml into our repository where we blacklist
>> packages from auto-import. See an example in
>> JetBrains/kotlin/.idea/codeInsightSettings.xml.
>> >>>
>> >>> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com> wrote:
>> >>>>
>> >>>> Bad IDEs automatically generate the wrong import. I think we need to
>> automatically prevent this, otherwise the bad imports will inevitably slip
>> back in.
>> >>>>
>> >>>> Reuven
>> >>>>
>> >>>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <
>> lukasz.gajowy@gmail.com> wrote:
>> >>>>>
>> >>>>> Great news. Thanks all for this work!
>> >>>>>
>> >>>>> +1 to enforcing this on dependency level as Kenn suggested.
>> >>>>>
>> >>>>> Łukasz
>> >>>>>
>> >>>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org>
>> napisał(a):
>> >>>>>>
>> >>>>>> We can enforce at the dependency level, since it is a compile
>> error. I think some IDEs and build tools may allow the compile-time
>> classpath to get polluted by transitive runtime deps, so protecting against
>> bad imports is also a good idea.
>> >>>>>>
>> >>>>>> Kenn
>> >>>>>>
>> >>>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com>
>> wrote:
>> >>>>>>>
>> >>>>>>> Not yet, we need to add that too, there are still some tasks to be
>> >>>>>>> done like improve the contribution guide with this info, and
>> document
>> >>>>>>> how to  generate a src build artifact locally since I doubt we can
>> >>>>>>> publish that into Apache for copyright reasons.
>> >>>>>>> I will message in the future for awareness for awareness when
>> most of
>> >>>>>>> the pending tasks are finished.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <
>> mxm@apache.org> wrote:
>> >>>>>>> >
>> >>>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored
>> Guava version is used
>> >>>>>>> > everywhere now.
>> >>>>>>> >
>> >>>>>>> > Do we already have a Checkstyle rule that prevents people from
>> using the
>> >>>>>>> > unvendored Guava? If not, such a rule could be useful because
>> it's almost
>> >>>>>>> > inevitable that the unvedored Guava will slip back in.
>> >>>>>>> >
>> >>>>>>> > Cheers,
>> >>>>>>> > Max
>> >>>>>>> >
>> >>>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>> >>>>>>> > > We merged today the PR [1] that changes most of the code to
>> use our
>> >>>>>>> > > new guava vendored dependency. In practice it means that most
>> of the
>> >>>>>>> > > imports of the classes were changed from `com.google.common.`
>> to
>> >>>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>> >>>>>>> > >
>> >>>>>>> > > This is a great improvement to fix a long existing problem of
>> guava
>> >>>>>>> > > leaking through some Beam modules. This also reduces the size
>> of most
>> >>>>>>> > > jars in the project because they don't need to relocate and
>> include
>> >>>>>>> > > guava anymore, they just use the vendored dependency.
>> >>>>>>> > >
>> >>>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the
>> others that
>> >>>>>>> > > worked (are working) to make this possible.
>> >>>>>>> > >
>> >>>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs
>> were
>> >>>>>>> > > broken so please review if yours was and do a rebase and fix
>> the
>> >>>>>>> > > imports to use the new vendored dependency. Sorry for the
>> >>>>>>> > > inconvenience. From now one all uses of guava should use the
>> vendored
>> >>>>>>> > > version. Expect some updates in the docs.
>> >>>>>>> > >
>> >>>>>>> > > [1]  https://github.com/apache/beam/pull/6809
>> >>>>>>> > >
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Cheers,
>> >>> Gleb
>> >>
>> >>
>> >>
>> >> --
>> >> Cheers,
>> >> Gleb
>>
>
>
> --
> Cheers,
> Gleb
>

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Gleb Kanterov <gl...@spotify.com>.
I agree with the points that Kenneth has raised, mainly:

> In both of the above approaches, diamond dependency problems between IOs
are quite possible.

Option 1. With having more IO-s in Beam we would start hitting diamond
dependency problem more often. Relocating dependencies will help, but for
this, we should avoid exposing relocated classes to end-users of IO-s. I
can't speak about everything, but in the case of BigtableIO, it only
exposes proto-classes that aren't part of bigtable-client-core, and
shouldn't be relocated.

Option 2. Without relocation, every other IO can be potentially broken, and
we can solve this problem on a case-by-case basis. In maven world situation
becomes a little better with requireUpperBoundDeps [1] from
maven-enforcer-plugin. I don't know if there is a similar solution for
gradle.

Option 3. There is a potential future solution for dependency conflicts
between IO-s with Java 9 JPMS [2], however, it could take a while before we
could use it due to compatibility issues.

As a short term solution, option 2 seems the best, we could go through
known conflicts and see if it's possible to resolve them, potentially
looking into option 1 that would take much more time.

[1]:
https://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html
[2]: https://en.wikipedia.org/wiki/Java_Platform_Module_System


On Mon, Mar 4, 2019 at 4:45 PM Ismaël Mejía <ie...@gmail.com> wrote:

> That looks interesting but I am not sure if I understand correctly,
> isn't the problem that the system API (Bigtable, Cassandra, etc)
> exposes guava related stuff? Or in other words, wouldn't the
> transitivie version of guava leak anyway?
> If it does not I am pretty interested on doing this to fix the
> Cassandra IO from leaking too.
> https://issues.apache.org/jira/browse/BEAM-5723
>
> On Thu, Feb 28, 2019 at 5:17 PM Kenneth Knowles <ke...@apache.org> wrote:
> >
> > If someone is using BigTableIO with bigtable-client-core then having
> BigTableIO and bigtable-client-core both depend on Guava 26.0 is fine,
> right? Specifically, a user of BigTableIO after
> https://github.com/apache/beam/pull/7957 will still have non-vendored
> Guava on the classpath due to the transitive deps of bigtable-client-core.
> >
> > In any case it seems very wrong for the Beam root project to manage the
> version of Guava in BigTableIO since the whole point is to be compatible
> with bigtable-client-core. Would it work to delete our pinned Guava version
> [1] and chase down all the places it breaks, moving Guava dependencies
> local to places where an IO or extension must use it for interop? Then you
> don't need adapters.
> >
> > In both of the above approaches, diamond dependency problems between IOs
> are quite possible.
> >
> > I don't know if we can do better. For example, producing a
> bigtable-client-core where we have relocated Guava internally and using
> that could really be an interop nightmare as things that look like the same
> type would not be. Less likely to be broken would be bigtable-client-core
> entirely relocated and vendored, but generally IO connectors exchange
> objects with users and the users would have to use the relocated versions,
> so that's gross.
> >
> > Kenn
> >
> > [1]
> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353
> >
> >
> > On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov <gl...@spotify.com> wrote:
> >>
> >> For the past week, two independent people have asked me if I can help
> with guava NoSuchMethodError in BigtableIO. It turns out we still have a
> potential problem with dependencies that don't vendor guava, in this case,
> it was bigtable-client-core that depends on guava-26.0. However, the root
> cause of the classpath problem was in the usage of a deprecated method from
> non-vendored guava in BigtableServiceClientImpl in the code path where we
> integrate with bigtable client.
> >>
> >> I created apache/beam#7957 [1] to fix that. There few other IO-s where
> we use non-vendored guava that we can fix using adapters.
> >>
> >> And there is an unknown number of conflicts between guava versions in
> our dependencies that don't vendor it, that as I understand it, could be
> fixed by relocating them, in a similar way we do for Calcite [2].
> >>
> >> [1]: https://github.com/apache/beam/pull/7957
> >> [2]:
> https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39
> >>
> >> Gleb
> >>
> >> On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <gl...@spotify.com>
> wrote:
> >>>
> >>> I didn't look deep into it, but it seems we can put
> .idea/codeInsightSettings.xml into our repository where we blacklist
> packages from auto-import. See an example in
> JetBrains/kotlin/.idea/codeInsightSettings.xml.
> >>>
> >>> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com> wrote:
> >>>>
> >>>> Bad IDEs automatically generate the wrong import. I think we need to
> automatically prevent this, otherwise the bad imports will inevitably slip
> back in.
> >>>>
> >>>> Reuven
> >>>>
> >>>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <
> lukasz.gajowy@gmail.com> wrote:
> >>>>>
> >>>>> Great news. Thanks all for this work!
> >>>>>
> >>>>> +1 to enforcing this on dependency level as Kenn suggested.
> >>>>>
> >>>>> Łukasz
> >>>>>
> >>>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org>
> napisał(a):
> >>>>>>
> >>>>>> We can enforce at the dependency level, since it is a compile
> error. I think some IDEs and build tools may allow the compile-time
> classpath to get polluted by transitive runtime deps, so protecting against
> bad imports is also a good idea.
> >>>>>>
> >>>>>> Kenn
> >>>>>>
> >>>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com>
> wrote:
> >>>>>>>
> >>>>>>> Not yet, we need to add that too, there are still some tasks to be
> >>>>>>> done like improve the contribution guide with this info, and
> document
> >>>>>>> how to  generate a src build artifact locally since I doubt we can
> >>>>>>> publish that into Apache for copyright reasons.
> >>>>>>> I will message in the future for awareness for awareness when most
> of
> >>>>>>> the pending tasks are finished.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org>
> wrote:
> >>>>>>> >
> >>>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored Guava
> version is used
> >>>>>>> > everywhere now.
> >>>>>>> >
> >>>>>>> > Do we already have a Checkstyle rule that prevents people from
> using the
> >>>>>>> > unvendored Guava? If not, such a rule could be useful because
> it's almost
> >>>>>>> > inevitable that the unvedored Guava will slip back in.
> >>>>>>> >
> >>>>>>> > Cheers,
> >>>>>>> > Max
> >>>>>>> >
> >>>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
> >>>>>>> > > We merged today the PR [1] that changes most of the code to
> use our
> >>>>>>> > > new guava vendored dependency. In practice it means that most
> of the
> >>>>>>> > > imports of the classes were changed from `com.google.common.`
> to
> >>>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
> >>>>>>> > >
> >>>>>>> > > This is a great improvement to fix a long existing problem of
> guava
> >>>>>>> > > leaking through some Beam modules. This also reduces the size
> of most
> >>>>>>> > > jars in the project because they don't need to relocate and
> include
> >>>>>>> > > guava anymore, they just use the vendored dependency.
> >>>>>>> > >
> >>>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the
> others that
> >>>>>>> > > worked (are working) to make this possible.
> >>>>>>> > >
> >>>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs
> were
> >>>>>>> > > broken so please review if yours was and do a rebase and fix
> the
> >>>>>>> > > imports to use the new vendored dependency. Sorry for the
> >>>>>>> > > inconvenience. From now one all uses of guava should use the
> vendored
> >>>>>>> > > version. Expect some updates in the docs.
> >>>>>>> > >
> >>>>>>> > > [1]  https://github.com/apache/beam/pull/6809
> >>>>>>> > >
> >>>
> >>>
> >>>
> >>> --
> >>> Cheers,
> >>> Gleb
> >>
> >>
> >>
> >> --
> >> Cheers,
> >> Gleb
>


-- 
Cheers,
Gleb

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Ismaël Mejía <ie...@gmail.com>.
That looks interesting but I am not sure if I understand correctly,
isn't the problem that the system API (Bigtable, Cassandra, etc)
exposes guava related stuff? Or in other words, wouldn't the
transitivie version of guava leak anyway?
If it does not I am pretty interested on doing this to fix the
Cassandra IO from leaking too.
https://issues.apache.org/jira/browse/BEAM-5723

On Thu, Feb 28, 2019 at 5:17 PM Kenneth Knowles <ke...@apache.org> wrote:
>
> If someone is using BigTableIO with bigtable-client-core then having BigTableIO and bigtable-client-core both depend on Guava 26.0 is fine, right? Specifically, a user of BigTableIO after https://github.com/apache/beam/pull/7957 will still have non-vendored Guava on the classpath due to the transitive deps of bigtable-client-core.
>
> In any case it seems very wrong for the Beam root project to manage the version of Guava in BigTableIO since the whole point is to be compatible with bigtable-client-core. Would it work to delete our pinned Guava version [1] and chase down all the places it breaks, moving Guava dependencies local to places where an IO or extension must use it for interop? Then you don't need adapters.
>
> In both of the above approaches, diamond dependency problems between IOs are quite possible.
>
> I don't know if we can do better. For example, producing a bigtable-client-core where we have relocated Guava internally and using that could really be an interop nightmare as things that look like the same type would not be. Less likely to be broken would be bigtable-client-core entirely relocated and vendored, but generally IO connectors exchange objects with users and the users would have to use the relocated versions, so that's gross.
>
> Kenn
>
> [1] https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353
>
>
> On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov <gl...@spotify.com> wrote:
>>
>> For the past week, two independent people have asked me if I can help with guava NoSuchMethodError in BigtableIO. It turns out we still have a potential problem with dependencies that don't vendor guava, in this case, it was bigtable-client-core that depends on guava-26.0. However, the root cause of the classpath problem was in the usage of a deprecated method from non-vendored guava in BigtableServiceClientImpl in the code path where we integrate with bigtable client.
>>
>> I created apache/beam#7957 [1] to fix that. There few other IO-s where we use non-vendored guava that we can fix using adapters.
>>
>> And there is an unknown number of conflicts between guava versions in our dependencies that don't vendor it, that as I understand it, could be fixed by relocating them, in a similar way we do for Calcite [2].
>>
>> [1]: https://github.com/apache/beam/pull/7957
>> [2]: https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39
>>
>> Gleb
>>
>> On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <gl...@spotify.com> wrote:
>>>
>>> I didn't look deep into it, but it seems we can put .idea/codeInsightSettings.xml into our repository where we blacklist packages from auto-import. See an example in JetBrains/kotlin/.idea/codeInsightSettings.xml.
>>>
>>> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com> wrote:
>>>>
>>>> Bad IDEs automatically generate the wrong import. I think we need to automatically prevent this, otherwise the bad imports will inevitably slip back in.
>>>>
>>>> Reuven
>>>>
>>>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <lu...@gmail.com> wrote:
>>>>>
>>>>> Great news. Thanks all for this work!
>>>>>
>>>>> +1 to enforcing this on dependency level as Kenn suggested.
>>>>>
>>>>> Łukasz
>>>>>
>>>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org> napisał(a):
>>>>>>
>>>>>> We can enforce at the dependency level, since it is a compile error. I think some IDEs and build tools may allow the compile-time classpath to get polluted by transitive runtime deps, so protecting against bad imports is also a good idea.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>>>>>>
>>>>>>> Not yet, we need to add that too, there are still some tasks to be
>>>>>>> done like improve the contribution guide with this info, and document
>>>>>>> how to  generate a src build artifact locally since I doubt we can
>>>>>>> publish that into Apache for copyright reasons.
>>>>>>> I will message in the future for awareness for awareness when most of
>>>>>>> the pending tasks are finished.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org> wrote:
>>>>>>> >
>>>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored Guava version is used
>>>>>>> > everywhere now.
>>>>>>> >
>>>>>>> > Do we already have a Checkstyle rule that prevents people from using the
>>>>>>> > unvendored Guava? If not, such a rule could be useful because it's almost
>>>>>>> > inevitable that the unvedored Guava will slip back in.
>>>>>>> >
>>>>>>> > Cheers,
>>>>>>> > Max
>>>>>>> >
>>>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>>>>>>> > > We merged today the PR [1] that changes most of the code to use our
>>>>>>> > > new guava vendored dependency. In practice it means that most of the
>>>>>>> > > imports of the classes were changed from `com.google.common.` to
>>>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>>>>>>> > >
>>>>>>> > > This is a great improvement to fix a long existing problem of guava
>>>>>>> > > leaking through some Beam modules. This also reduces the size of most
>>>>>>> > > jars in the project because they don't need to relocate and include
>>>>>>> > > guava anymore, they just use the vendored dependency.
>>>>>>> > >
>>>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others that
>>>>>>> > > worked (are working) to make this possible.
>>>>>>> > >
>>>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs were
>>>>>>> > > broken so please review if yours was and do a rebase and fix the
>>>>>>> > > imports to use the new vendored dependency. Sorry for the
>>>>>>> > > inconvenience. From now one all uses of guava should use the vendored
>>>>>>> > > version. Expect some updates in the docs.
>>>>>>> > >
>>>>>>> > > [1]  https://github.com/apache/beam/pull/6809
>>>>>>> > >
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Gleb
>>
>>
>>
>> --
>> Cheers,
>> Gleb

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Kenneth Knowles <ke...@apache.org>.
If someone is using BigTableIO with bigtable-client-core then having
BigTableIO and bigtable-client-core both depend on Guava 26.0 is fine,
right? Specifically, a user of BigTableIO after
https://github.com/apache/beam/pull/7957 will still have non-vendored Guava
on the classpath due to the transitive deps of bigtable-client-core.

In any case it seems very wrong for the Beam root project to manage the
version of Guava in BigTableIO since the whole point is to be compatible
with bigtable-client-core. Would it work to delete our pinned Guava version
[1] and chase down all the places it breaks, moving Guava dependencies
local to places where an IO or extension must use it for interop? Then you
don't need adapters.

In both of the above approaches, diamond dependency problems between IOs
are quite possible.

I don't know if we can do better. For example, producing a
bigtable-client-core where we have relocated Guava internally and using
that could really be an interop nightmare as things that look like the same
type would not be. Less likely to be broken would be bigtable-client-core
entirely relocated and vendored, but generally IO connectors exchange
objects with users and the users would have to use the relocated versions,
so that's gross.

Kenn

[1]
https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353


On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov <gl...@spotify.com> wrote:

> For the past week, two independent people have asked me if I can help with
> guava NoSuchMethodError in BigtableIO. It turns out we still have a
> potential problem with dependencies that don't vendor guava, in this case,
> it was bigtable-client-core that depends on guava-26.0. However, the root
> cause of the classpath problem was in the usage of a deprecated method from
> non-vendored guava in BigtableServiceClientImpl in the code path where we
> integrate with bigtable client.
>
> I created apache/beam#7957 <https://github.com/apache/beam/pull/7957> [1]
> to fix that. There few other IO-s where we use non-vendored guava that we
> can fix using adapters.
>
> And there is an unknown number of conflicts between guava versions in our
> dependencies that don't vendor it, that as I understand it, could be fixed
> by relocating them, in a similar way we do for Calcite [2].
>
> [1]: https://github.com/apache/beam/pull/7957
> [2]:
> https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39
>
> Gleb
>
> On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <gl...@spotify.com> wrote:
>
>> I didn't look deep into it, but it seems we can put
>> .idea/codeInsightSettings.xml into our repository where we blacklist
>> packages from auto-import. See an example in
>> JetBrains/kotlin/.idea/codeInsightSettings.xml
>> <https://github.com/JetBrains/kotlin/blob/master/.idea/codeInsightSettings.xml>
>> .
>>
>> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com> wrote:
>>
>>> Bad IDEs automatically generate the wrong import. I think we need to
>>> automatically prevent this, otherwise the bad imports will inevitably slip
>>> back in.
>>>
>>> Reuven
>>>
>>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <lu...@gmail.com>
>>> wrote:
>>>
>>>> Great news. Thanks all for this work!
>>>>
>>>> +1 to enforcing this on dependency level as Kenn suggested.
>>>>
>>>> Łukasz
>>>>
>>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org> napisał(a):
>>>>
>>>>> We can enforce at the dependency level, since it is a compile error. I
>>>>> think some IDEs and build tools may allow the compile-time classpath to get
>>>>> polluted by transitive runtime deps, so protecting against bad imports is
>>>>> also a good idea.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Not yet, we need to add that too, there are still some tasks to be
>>>>>> done like improve the contribution guide with this info, and document
>>>>>> how to  generate a src build artifact locally since I doubt we can
>>>>>> publish that into Apache for copyright reasons.
>>>>>> I will message in the future for awareness for awareness when most of
>>>>>> the pending tasks are finished.
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org>
>>>>>> wrote:
>>>>>> >
>>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored Guava
>>>>>> version is used
>>>>>> > everywhere now.
>>>>>> >
>>>>>> > Do we already have a Checkstyle rule that prevents people from
>>>>>> using the
>>>>>> > unvendored Guava? If not, such a rule could be useful because it's
>>>>>> almost
>>>>>> > inevitable that the unvedored Guava will slip back in.
>>>>>> >
>>>>>> > Cheers,
>>>>>> > Max
>>>>>> >
>>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>>>>>> > > We merged today the PR [1] that changes most of the code to use
>>>>>> our
>>>>>> > > new guava vendored dependency. In practice it means that most of
>>>>>> the
>>>>>> > > imports of the classes were changed from `com.google.common.` to
>>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>>>>>> > >
>>>>>> > > This is a great improvement to fix a long existing problem of
>>>>>> guava
>>>>>> > > leaking through some Beam modules. This also reduces the size of
>>>>>> most
>>>>>> > > jars in the project because they don't need to relocate and
>>>>>> include
>>>>>> > > guava anymore, they just use the vendored dependency.
>>>>>> > >
>>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others
>>>>>> that
>>>>>> > > worked (are working) to make this possible.
>>>>>> > >
>>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs were
>>>>>> > > broken so please review if yours was and do a rebase and fix the
>>>>>> > > imports to use the new vendored dependency. Sorry for the
>>>>>> > > inconvenience. From now one all uses of guava should use the
>>>>>> vendored
>>>>>> > > version. Expect some updates in the docs.
>>>>>> > >
>>>>>> > > [1]  https://github.com/apache/beam/pull/6809
>>>>>> > >
>>>>>>
>>>>>
>>
>> --
>> Cheers,
>> Gleb
>>
>
>
> --
> Cheers,
> Gleb
>

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Gleb Kanterov <gl...@spotify.com>.
For the past week, two independent people have asked me if I can help with
guava NoSuchMethodError in BigtableIO. It turns out we still have a
potential problem with dependencies that don't vendor guava, in this case,
it was bigtable-client-core that depends on guava-26.0. However, the root
cause of the classpath problem was in the usage of a deprecated method from
non-vendored guava in BigtableServiceClientImpl in the code path where we
integrate with bigtable client.

I created apache/beam#7957 <https://github.com/apache/beam/pull/7957> [1]
to fix that. There few other IO-s where we use non-vendored guava that we
can fix using adapters.

And there is an unknown number of conflicts between guava versions in our
dependencies that don't vendor it, that as I understand it, could be fixed
by relocating them, in a similar way we do for Calcite [2].

[1]: https://github.com/apache/beam/pull/7957
[2]:
https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39

Gleb

On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <gl...@spotify.com> wrote:

> I didn't look deep into it, but it seems we can put
> .idea/codeInsightSettings.xml into our repository where we blacklist
> packages from auto-import. See an example in
> JetBrains/kotlin/.idea/codeInsightSettings.xml
> <https://github.com/JetBrains/kotlin/blob/master/.idea/codeInsightSettings.xml>
> .
>
> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com> wrote:
>
>> Bad IDEs automatically generate the wrong import. I think we need to
>> automatically prevent this, otherwise the bad imports will inevitably slip
>> back in.
>>
>> Reuven
>>
>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <lu...@gmail.com>
>> wrote:
>>
>>> Great news. Thanks all for this work!
>>>
>>> +1 to enforcing this on dependency level as Kenn suggested.
>>>
>>> Łukasz
>>>
>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org> napisał(a):
>>>
>>>> We can enforce at the dependency level, since it is a compile error. I
>>>> think some IDEs and build tools may allow the compile-time classpath to get
>>>> polluted by transitive runtime deps, so protecting against bad imports is
>>>> also a good idea.
>>>>
>>>> Kenn
>>>>
>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>>>
>>>>> Not yet, we need to add that too, there are still some tasks to be
>>>>> done like improve the contribution guide with this info, and document
>>>>> how to  generate a src build artifact locally since I doubt we can
>>>>> publish that into Apache for copyright reasons.
>>>>> I will message in the future for awareness for awareness when most of
>>>>> the pending tasks are finished.
>>>>>
>>>>>
>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org>
>>>>> wrote:
>>>>> >
>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored Guava
>>>>> version is used
>>>>> > everywhere now.
>>>>> >
>>>>> > Do we already have a Checkstyle rule that prevents people from using
>>>>> the
>>>>> > unvendored Guava? If not, such a rule could be useful because it's
>>>>> almost
>>>>> > inevitable that the unvedored Guava will slip back in.
>>>>> >
>>>>> > Cheers,
>>>>> > Max
>>>>> >
>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>>>>> > > We merged today the PR [1] that changes most of the code to use our
>>>>> > > new guava vendored dependency. In practice it means that most of
>>>>> the
>>>>> > > imports of the classes were changed from `com.google.common.` to
>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>>>>> > >
>>>>> > > This is a great improvement to fix a long existing problem of guava
>>>>> > > leaking through some Beam modules. This also reduces the size of
>>>>> most
>>>>> > > jars in the project because they don't need to relocate and include
>>>>> > > guava anymore, they just use the vendored dependency.
>>>>> > >
>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others
>>>>> that
>>>>> > > worked (are working) to make this possible.
>>>>> > >
>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs were
>>>>> > > broken so please review if yours was and do a rebase and fix the
>>>>> > > imports to use the new vendored dependency. Sorry for the
>>>>> > > inconvenience. From now one all uses of guava should use the
>>>>> vendored
>>>>> > > version. Expect some updates in the docs.
>>>>> > >
>>>>> > > [1]  https://github.com/apache/beam/pull/6809
>>>>> > >
>>>>>
>>>>
>
> --
> Cheers,
> Gleb
>


-- 
Cheers,
Gleb

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Gleb Kanterov <gl...@spotify.com>.
I didn't look deep into it, but it seems we can put
.idea/codeInsightSettings.xml into our repository where we blacklist
packages from auto-import. See an example in
JetBrains/kotlin/.idea/codeInsightSettings.xml
<https://github.com/JetBrains/kotlin/blob/master/.idea/codeInsightSettings.xml>
.

On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com> wrote:

> Bad IDEs automatically generate the wrong import. I think we need to
> automatically prevent this, otherwise the bad imports will inevitably slip
> back in.
>
> Reuven
>
> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <lu...@gmail.com>
> wrote:
>
>> Great news. Thanks all for this work!
>>
>> +1 to enforcing this on dependency level as Kenn suggested.
>>
>> Łukasz
>>
>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org> napisał(a):
>>
>>> We can enforce at the dependency level, since it is a compile error. I
>>> think some IDEs and build tools may allow the compile-time classpath to get
>>> polluted by transitive runtime deps, so protecting against bad imports is
>>> also a good idea.
>>>
>>> Kenn
>>>
>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> Not yet, we need to add that too, there are still some tasks to be
>>>> done like improve the contribution guide with this info, and document
>>>> how to  generate a src build artifact locally since I doubt we can
>>>> publish that into Apache for copyright reasons.
>>>> I will message in the future for awareness for awareness when most of
>>>> the pending tasks are finished.
>>>>
>>>>
>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org>
>>>> wrote:
>>>> >
>>>> > Thanks for the heads up, Ismaël! Great to see the vendored Guava
>>>> version is used
>>>> > everywhere now.
>>>> >
>>>> > Do we already have a Checkstyle rule that prevents people from using
>>>> the
>>>> > unvendored Guava? If not, such a rule could be useful because it's
>>>> almost
>>>> > inevitable that the unvedored Guava will slip back in.
>>>> >
>>>> > Cheers,
>>>> > Max
>>>> >
>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>>>> > > We merged today the PR [1] that changes most of the code to use our
>>>> > > new guava vendored dependency. In practice it means that most of the
>>>> > > imports of the classes were changed from `com.google.common.` to
>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>>>> > >
>>>> > > This is a great improvement to fix a long existing problem of guava
>>>> > > leaking through some Beam modules. This also reduces the size of
>>>> most
>>>> > > jars in the project because they don't need to relocate and include
>>>> > > guava anymore, they just use the vendored dependency.
>>>> > >
>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others that
>>>> > > worked (are working) to make this possible.
>>>> > >
>>>> > > Sadly as a side effect of the merge of this PR multiple PRs were
>>>> > > broken so please review if yours was and do a rebase and fix the
>>>> > > imports to use the new vendored dependency. Sorry for the
>>>> > > inconvenience. From now one all uses of guava should use the
>>>> vendored
>>>> > > version. Expect some updates in the docs.
>>>> > >
>>>> > > [1]  https://github.com/apache/beam/pull/6809
>>>> > >
>>>>
>>>

-- 
Cheers,
Gleb

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Reuven Lax <re...@google.com>.
Bad IDEs automatically generate the wrong import. I think we need to
automatically prevent this, otherwise the bad imports will inevitably slip
back in.

Reuven

On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <lu...@gmail.com>
wrote:

> Great news. Thanks all for this work!
>
> +1 to enforcing this on dependency level as Kenn suggested.
>
> Łukasz
>
> wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org> napisał(a):
>
>> We can enforce at the dependency level, since it is a compile error. I
>> think some IDEs and build tools may allow the compile-time classpath to get
>> polluted by transitive runtime deps, so protecting against bad imports is
>> also a good idea.
>>
>> Kenn
>>
>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> Not yet, we need to add that too, there are still some tasks to be
>>> done like improve the contribution guide with this info, and document
>>> how to  generate a src build artifact locally since I doubt we can
>>> publish that into Apache for copyright reasons.
>>> I will message in the future for awareness for awareness when most of
>>> the pending tasks are finished.
>>>
>>>
>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org>
>>> wrote:
>>> >
>>> > Thanks for the heads up, Ismaël! Great to see the vendored Guava
>>> version is used
>>> > everywhere now.
>>> >
>>> > Do we already have a Checkstyle rule that prevents people from using
>>> the
>>> > unvendored Guava? If not, such a rule could be useful because it's
>>> almost
>>> > inevitable that the unvedored Guava will slip back in.
>>> >
>>> > Cheers,
>>> > Max
>>> >
>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>>> > > We merged today the PR [1] that changes most of the code to use our
>>> > > new guava vendored dependency. In practice it means that most of the
>>> > > imports of the classes were changed from `com.google.common.` to
>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>>> > >
>>> > > This is a great improvement to fix a long existing problem of guava
>>> > > leaking through some Beam modules. This also reduces the size of most
>>> > > jars in the project because they don't need to relocate and include
>>> > > guava anymore, they just use the vendored dependency.
>>> > >
>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others that
>>> > > worked (are working) to make this possible.
>>> > >
>>> > > Sadly as a side effect of the merge of this PR multiple PRs were
>>> > > broken so please review if yours was and do a rebase and fix the
>>> > > imports to use the new vendored dependency. Sorry for the
>>> > > inconvenience. From now one all uses of guava should use the vendored
>>> > > version. Expect some updates in the docs.
>>> > >
>>> > > [1]  https://github.com/apache/beam/pull/6809
>>> > >
>>>
>>

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Łukasz Gajowy <lu...@gmail.com>.
Great news. Thanks all for this work!

+1 to enforcing this on dependency level as Kenn suggested.

Łukasz

wt., 15 sty 2019 o 01:18 Kenneth Knowles <ke...@apache.org> napisał(a):

> We can enforce at the dependency level, since it is a compile error. I
> think some IDEs and build tools may allow the compile-time classpath to get
> polluted by transitive runtime deps, so protecting against bad imports is
> also a good idea.
>
> Kenn
>
> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> Not yet, we need to add that too, there are still some tasks to be
>> done like improve the contribution guide with this info, and document
>> how to  generate a src build artifact locally since I doubt we can
>> publish that into Apache for copyright reasons.
>> I will message in the future for awareness for awareness when most of
>> the pending tasks are finished.
>>
>>
>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org>
>> wrote:
>> >
>> > Thanks for the heads up, Ismaël! Great to see the vendored Guava
>> version is used
>> > everywhere now.
>> >
>> > Do we already have a Checkstyle rule that prevents people from using the
>> > unvendored Guava? If not, such a rule could be useful because it's
>> almost
>> > inevitable that the unvedored Guava will slip back in.
>> >
>> > Cheers,
>> > Max
>> >
>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>> > > We merged today the PR [1] that changes most of the code to use our
>> > > new guava vendored dependency. In practice it means that most of the
>> > > imports of the classes were changed from `com.google.common.` to
>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>> > >
>> > > This is a great improvement to fix a long existing problem of guava
>> > > leaking through some Beam modules. This also reduces the size of most
>> > > jars in the project because they don't need to relocate and include
>> > > guava anymore, they just use the vendored dependency.
>> > >
>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others that
>> > > worked (are working) to make this possible.
>> > >
>> > > Sadly as a side effect of the merge of this PR multiple PRs were
>> > > broken so please review if yours was and do a rebase and fix the
>> > > imports to use the new vendored dependency. Sorry for the
>> > > inconvenience. From now one all uses of guava should use the vendored
>> > > version. Expect some updates in the docs.
>> > >
>> > > [1]  https://github.com/apache/beam/pull/6809
>> > >
>>
>

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Kenneth Knowles <ke...@apache.org>.
We can enforce at the dependency level, since it is a compile error. I
think some IDEs and build tools may allow the compile-time classpath to get
polluted by transitive runtime deps, so protecting against bad imports is
also a good idea.

Kenn

On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ie...@gmail.com> wrote:

> Not yet, we need to add that too, there are still some tasks to be
> done like improve the contribution guide with this info, and document
> how to  generate a src build artifact locally since I doubt we can
> publish that into Apache for copyright reasons.
> I will message in the future for awareness for awareness when most of
> the pending tasks are finished.
>
>
> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org> wrote:
> >
> > Thanks for the heads up, Ismaël! Great to see the vendored Guava version
> is used
> > everywhere now.
> >
> > Do we already have a Checkstyle rule that prevents people from using the
> > unvendored Guava? If not, such a rule could be useful because it's almost
> > inevitable that the unvedored Guava will slip back in.
> >
> > Cheers,
> > Max
> >
> > On 14.01.19 05:55, Ismaël Mejía wrote:
> > > We merged today the PR [1] that changes most of the code to use our
> > > new guava vendored dependency. In practice it means that most of the
> > > imports of the classes were changed from `com.google.common.` to
> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
> > >
> > > This is a great improvement to fix a long existing problem of guava
> > > leaking through some Beam modules. This also reduces the size of most
> > > jars in the project because they don't need to relocate and include
> > > guava anymore, they just use the vendored dependency.
> > >
> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others that
> > > worked (are working) to make this possible.
> > >
> > > Sadly as a side effect of the merge of this PR multiple PRs were
> > > broken so please review if yours was and do a rebase and fix the
> > > imports to use the new vendored dependency. Sorry for the
> > > inconvenience. From now one all uses of guava should use the vendored
> > > version. Expect some updates in the docs.
> > >
> > > [1]  https://github.com/apache/beam/pull/6809
> > >
>

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Ismaël Mejía <ie...@gmail.com>.
Not yet, we need to add that too, there are still some tasks to be
done like improve the contribution guide with this info, and document
how to  generate a src build artifact locally since I doubt we can
publish that into Apache for copyright reasons.
I will message in the future for awareness for awareness when most of
the pending tasks are finished.


On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <mx...@apache.org> wrote:
>
> Thanks for the heads up, Ismaël! Great to see the vendored Guava version is used
> everywhere now.
>
> Do we already have a Checkstyle rule that prevents people from using the
> unvendored Guava? If not, such a rule could be useful because it's almost
> inevitable that the unvedored Guava will slip back in.
>
> Cheers,
> Max
>
> On 14.01.19 05:55, Ismaël Mejía wrote:
> > We merged today the PR [1] that changes most of the code to use our
> > new guava vendored dependency. In practice it means that most of the
> > imports of the classes were changed from `com.google.common.` to
> > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
> >
> > This is a great improvement to fix a long existing problem of guava
> > leaking through some Beam modules. This also reduces the size of most
> > jars in the project because they don't need to relocate and include
> > guava anymore, they just use the vendored dependency.
> >
> > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others that
> > worked (are working) to make this possible.
> >
> > Sadly as a side effect of the merge of this PR multiple PRs were
> > broken so please review if yours was and do a rebase and fix the
> > imports to use the new vendored dependency. Sorry for the
> > inconvenience. From now one all uses of guava should use the vendored
> > version. Expect some updates in the docs.
> >
> > [1]  https://github.com/apache/beam/pull/6809
> >

Re: Merge of vendored Guava (Some PRs need a rebase)

Posted by Maximilian Michels <mx...@apache.org>.
Thanks for the heads up, Ismaël! Great to see the vendored Guava version is used 
everywhere now.

Do we already have a Checkstyle rule that prevents people from using the 
unvendored Guava? If not, such a rule could be useful because it's almost 
inevitable that the unvedored Guava will slip back in.

Cheers,
Max

On 14.01.19 05:55, Ismaël Mejía wrote:
> We merged today the PR [1] that changes most of the code to use our
> new guava vendored dependency. In practice it means that most of the
> imports of the classes were changed from `com.google.common.` to
> `org.apache.beam.vendor.guava.v20_0.com.google.common.`
> 
> This is a great improvement to fix a long existing problem of guava
> leaking through some Beam modules. This also reduces the size of most
> jars in the project because they don't need to relocate and include
> guava anymore, they just use the vendored dependency.
> 
> Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the others that
> worked (are working) to make this possible.
> 
> Sadly as a side effect of the merge of this PR multiple PRs were
> broken so please review if yours was and do a rebase and fix the
> imports to use the new vendored dependency. Sorry for the
> inconvenience. From now one all uses of guava should use the vendored
> version. Expect some updates in the docs.
> 
> [1]  https://github.com/apache/beam/pull/6809
>