You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Kenneth Knowles <ke...@apache.org> on 2022/01/11 19:54:10 UTC

Re: [DISCUSS] propdeps removal and what to do going forward

To clarify: "provided" should have been in the test runtime configuration,
but not in the shipped runtime configuration (otherwise dep resolution for
users would pull in provided deps, which should not happen)

On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com> wrote:

> During the migration to Gradle 7[1] the propdeps plugin was removed[2]
> since there wasn't a newer version that was compatible with Gradle 7 and a
> replacement couldn't be found. All existing usages of "provided" were moved
> to "compileOnly" and "compileOnly" is being mapped to the "provided" maven
> scope in the generated pom files. This has lead to two issues:
> 1) provided was also part of the runtime configuration, so we are getting
> a few class not found exceptions when running tests [3]
> 2) the generated pom.xml will have a bunch of compile time only
> annotations added as a provided dependency in the generated pom files[4]
>
> #1 can be fixed by adding the dependency to both the "compileOnly" and
> "runtimeOnly" configurations or by adding dependency to the
> "implementation" configuration
> #2 will make the pom files messier which can lead to confusion for users
> but shouldn't impact existing uses.
>
> There was a suggestion[4] to completely remove the usage of provided from
> the generated pom.xml and have all our previously "provided" dependencies
> declared as "implementation" allowing us to solve both #1 and #2 above.
>
> The largest usage of "provided" in the past was to packages related to the
> hadoop ecosystem and afterwards it was for packages such as
> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
> the module but can provide additional features if the dependency exists.
>
> What should we migrate if anything to the "implementation" configuration
> or should we try to recreate what we were doing with the "provided"
> configuration in the past?
>
> 1: https://issues.apache.org/jira/browse/BEAM-13430
> 2: https://github.com/apache/beam/pull/16308
> 3: https://issues.apache.org/jira/browse/BEAM-13569
> 4:
> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
> 5: https://issues.apache.org/jira/browse/BEAM-13504
>
>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Emily Ye <em...@google.com>.
It would be helpful for me then to establish what we consider a release
blocker for 2.36.0, and what/how the "long-term" solution will be
implemented. It seems to me that the release blocker is actually the
compile-time annotations marked as provided ([4] from initial message) and
not really the work related to these libraries in the sheet.

---
4:
https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964

On Fri, Jan 14, 2022 at 9:34 AM Daniel Collins <dp...@google.com> wrote:

> > In particular the Hadoop/Spark and Kafka dependencies must be
> **provided** as they were. I am not sure of others but those three matter.
>
> I think there's a bit of a difference here between what should be the
> state in the short term versus the long term.
>
> In the short term, I agree that we should avoid changes to how these
> dependencies are reflected in the POM.
>
> In the long term, I don't think it makes sense for these to continue to be
> "provided" dependencies- if users wish to use a different version of
> hadoop, spark or kafka, they can explicitly override the dependencies with
> the version they want when building their JAR, even if there is a version
> listed as "compile" in the POM file on maven central. The only difference
> is that if they don't have a version preference, the one listed in the POM
> (that we tested with) will be used, which seems like an unambiguous win to
> me.
>
> -Daniel
>
> On Thu, Jan 13, 2022 at 4:19 PM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> Optional dependencies should not be a major issue.
>>
>> What matters to validate that we are not breaking users is to compare
>> the generated POM files with the previous (pre gradle 7 / 2.35.0)
>> version and see that what was provided is still provided.
>>
>> In particular the Hadoop/Spark and Kafka dependencies must be
>> **provided** as they were. I am not sure of others but those three
>> matter.
>>
>> Ismaël
>>
>> On Wed, Jan 12, 2022 at 10:55 PM Emily Ye <em...@google.com> wrote:
>> >
>> > We've chatted offline and have a tentative plan for what to do with
>> these dependencies that are currently marked as compileOnly (instead of
>> provided). Please review the list if possible [1].
>> >
>> > Two projects we aren't sure about:
>> >
>> > :sdks:java:io:hcatalog
>> >
>> > library.java.jackson_annotations
>> > library.java.jackson_core
>> > library.java.jackson_databind
>> > library.java.hadoop_common
>> > org.apache.hive:hive-exec
>> > org.apache.hive.hcatalog:hive-hcatalog-core
>> >
>> > :sdks:java:io:parquet
>> >
>> > library.java.hadoop_client
>> >
>> >
>> > Does anyone have experience with either of these IOs? ccing Chamikara
>> >
>> > Thank you,
>> > Emily
>> >
>> >
>> > [1]
>> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>> >
>> > On Tue, Jan 11, 2022 at 6:38 PM Emily Ye <em...@google.com> wrote:
>> >>
>> >> As the person volunteering to do fixes for this to unblock Beam
>> 2.36.0, I created a spreadsheet of the projects with dependencies changed
>> from provided to compile only [1]. I pre-filled with what I think things
>> should be, but I don't have very much background in java/maven/gradle
>> configurations so please give input!
>> >>
>> >> Some (mainly hadoop/kafka) I left blank, since I'm not sure - do we
>> keep them provided because it depends on the user's version?
>> >>
>> >> [1]
>> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>> >>
>> >> On Tue, Jan 11, 2022 at 1:17 PM Luke Cwik <lc...@google.com> wrote:
>> >>>
>> >>> I'm not following what you're trying to say Kenn since provided in
>> maven requires the user to explicitly add the dependency themselves to have
>> it part of their runtime.
>> >>>
>> >>> As per
>> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
>> >>> "
>> >>> * provided
>> >>> This is much like compile, but indicates you expect the JDK or a
>> container to provide the dependency at runtime. For example, when building
>> a web application for the Java Enterprise Edition, you would set the
>> dependency on the Servlet API and related Java EE APIs to scope provided
>> because the web container provides those classes. A dependency with this
>> scope is added to the classpath used for compilation and test, but not the
>> runtime classpath. It is not transitive."
>> >>>
>> >>> On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org>
>> wrote:
>> >>>>
>> >>>> To clarify: "provided" should have been in the test runtime
>> configuration, but not in the shipped runtime configuration (otherwise dep
>> resolution for users would pull in provided deps, which should not happen)
>> >>>>
>> >>>> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com> wrote:
>> >>>>>
>> >>>>> During the migration to Gradle 7[1] the propdeps plugin was
>> removed[2] since there wasn't a newer version that was compatible with
>> Gradle 7 and a replacement couldn't be found. All existing usages of
>> "provided" were moved to "compileOnly" and "compileOnly" is being mapped to
>> the "provided" maven scope in the generated pom files. This has lead to two
>> issues:
>> >>>>> 1) provided was also part of the runtime configuration, so we are
>> getting a few class not found exceptions when running tests [3]
>> >>>>> 2) the generated pom.xml will have a bunch of compile time only
>> annotations added as a provided dependency in the generated pom files[4]
>> >>>>>
>> >>>>> #1 can be fixed by adding the dependency to both the "compileOnly"
>> and "runtimeOnly" configurations or by adding dependency to the
>> "implementation" configuration
>> >>>>> #2 will make the pom files messier which can lead to confusion for
>> users but shouldn't impact existing uses.
>> >>>>>
>> >>>>> There was a suggestion[4] to completely remove the usage of
>> provided from the generated pom.xml and have all our previously "provided"
>> dependencies declared as "implementation" allowing us to solve both #1 and
>> #2 above.
>> >>>>>
>> >>>>> The largest usage of "provided" in the past was to packages related
>> to the hadoop ecosystem and afterwards it was for packages such as
>> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
>> the module but can provide additional features if the dependency exists.
>> >>>>>
>> >>>>> What should we migrate if anything to the "implementation"
>> configuration or should we try to recreate what we were doing with the
>> "provided" configuration in the past?
>> >>>>>
>> >>>>> 1: https://issues.apache.org/jira/browse/BEAM-13430
>> >>>>> 2: https://github.com/apache/beam/pull/16308
>> >>>>> 3: https://issues.apache.org/jira/browse/BEAM-13569
>> >>>>> 4:
>> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
>> >>>>> 5: https://issues.apache.org/jira/browse/BEAM-13504
>> >>>>>
>>
>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Kenneth Knowles <ke...@apache.org>.
On Wed, Jan 19, 2022 at 10:59 AM Brian Hulette <bh...@google.com> wrote:

> > But I believe the issue is that some tooling will bundle up the
> "compile" dependencies and submit with the job, which will then have a
> conflict with the libraries on the cluster.
>
> Do you know specifically what tooling does this? Also I'm not clear what
> you mean by "cluster", is this referring to remote workers?
>

No I don't, actually. I was being deliberately vague. I think "cluster"
could be Spark (or Flink) workers, etc. There are some flags to tweak whose
dependencies win. Dataflow workers also could have this issue. I guess
something building an uber jar would be the typical example.

To be super clear: I'm pretty much in agreement with Daniel, but I also
think there must be some good reason we should account for that it isn't
done that way already.

To unblock the release, Emily has opened a PR to match the prior pom
exactly, which I also think is smart. We can fix up / eliminate `provided`
deps one or a few at a time moving on.

Kenn


>
> On Tue, Jan 18, 2022 at 6:11 AM Kenneth Knowles <ke...@apache.org> wrote:
>
>>
>>
>> On Fri, Jan 14, 2022 at 9:34 AM Daniel Collins <dp...@google.com>
>> wrote:
>>
>>> > In particular the Hadoop/Spark and Kafka dependencies must be
>>> **provided** as they were. I am not sure of others but those three matter.
>>>
>>> I think there's a bit of a difference here between what should be the
>>> state in the short term versus the long term.
>>>
>>> In the short term, I agree that we should avoid changes to how these
>>> dependencies are reflected in the POM.
>>>
>>> In the long term, I don't think it makes sense for these to continue to
>>> be "provided" dependencies- if users wish to use a different version of
>>> hadoop, spark or kafka, they can explicitly override the dependencies with
>>> the version they want when building their JAR, even if there is a version
>>> listed as "compile" in the POM file on maven central. The only difference
>>> is that if they don't have a version preference, the one listed in the POM
>>> (that we tested with) will be used, which seems like an unambiguous win to
>>> me.
>>>
>>
>> Agree with the sentiment. But I believe the issue is that some tooling
>> will bundle up the "compile" dependencies and submit with the job, which
>> will then have a conflict with the libraries on the cluster. On the other
>> hand, the user will always want to override the "provided" version to match
>> the cluster, in which case it will just be harmless duplicates on the
>> classpath, no? I guess huge file size, but it isn't the 90s any more. Since
>> Ismaël commented, maybe he can help to clarify. I also knew about this
>> reasoning for Spark & Hadoop but I don't know exactly what is required to
>> make it work right.
>>
>> This could become a bothersome issue long term - Gradle dev community has
>> lots of posts that indicate they don't agree with the existence of
>> "provided" or "optional" dependencies. (I happen to agree with them, but
>> philosophy is not the point). We should have a very clear solution for the
>> cases that require one, and document at least on the wiki.
>>
>> Kenn
>>
>>
>>>
>>> -Daniel
>>>
>>> On Thu, Jan 13, 2022 at 4:19 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> Optional dependencies should not be a major issue.
>>>>
>>>> What matters to validate that we are not breaking users is to compare
>>>> the generated POM files with the previous (pre gradle 7 / 2.35.0)
>>>> version and see that what was provided is still provided.
>>>>
>>>> In particular the Hadoop/Spark and Kafka dependencies must be
>>>> **provided** as they were. I am not sure of others but those three
>>>> matter.
>>>>
>>>> Ismaël
>>>>
>>>> On Wed, Jan 12, 2022 at 10:55 PM Emily Ye <em...@google.com> wrote:
>>>> >
>>>> > We've chatted offline and have a tentative plan for what to do with
>>>> these dependencies that are currently marked as compileOnly (instead of
>>>> provided). Please review the list if possible [1].
>>>> >
>>>> > Two projects we aren't sure about:
>>>> >
>>>> > :sdks:java:io:hcatalog
>>>> >
>>>> > library.java.jackson_annotations
>>>> > library.java.jackson_core
>>>> > library.java.jackson_databind
>>>> > library.java.hadoop_common
>>>> > org.apache.hive:hive-exec
>>>> > org.apache.hive.hcatalog:hive-hcatalog-core
>>>> >
>>>> > :sdks:java:io:parquet
>>>> >
>>>> > library.java.hadoop_client
>>>> >
>>>> >
>>>> > Does anyone have experience with either of these IOs? ccing Chamikara
>>>> >
>>>> > Thank you,
>>>> > Emily
>>>> >
>>>> >
>>>> > [1]
>>>> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>>>> >
>>>> > On Tue, Jan 11, 2022 at 6:38 PM Emily Ye <em...@google.com> wrote:
>>>> >>
>>>> >> As the person volunteering to do fixes for this to unblock Beam
>>>> 2.36.0, I created a spreadsheet of the projects with dependencies changed
>>>> from provided to compile only [1]. I pre-filled with what I think things
>>>> should be, but I don't have very much background in java/maven/gradle
>>>> configurations so please give input!
>>>> >>
>>>> >> Some (mainly hadoop/kafka) I left blank, since I'm not sure - do we
>>>> keep them provided because it depends on the user's version?
>>>> >>
>>>> >> [1]
>>>> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>>>> >>
>>>> >> On Tue, Jan 11, 2022 at 1:17 PM Luke Cwik <lc...@google.com> wrote:
>>>> >>>
>>>> >>> I'm not following what you're trying to say Kenn since provided in
>>>> maven requires the user to explicitly add the dependency themselves to have
>>>> it part of their runtime.
>>>> >>>
>>>> >>> As per
>>>> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
>>>> >>> "
>>>> >>> * provided
>>>> >>> This is much like compile, but indicates you expect the JDK or a
>>>> container to provide the dependency at runtime. For example, when building
>>>> a web application for the Java Enterprise Edition, you would set the
>>>> dependency on the Servlet API and related Java EE APIs to scope provided
>>>> because the web container provides those classes. A dependency with this
>>>> scope is added to the classpath used for compilation and test, but not the
>>>> runtime classpath. It is not transitive."
>>>> >>>
>>>> >>> On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org>
>>>> wrote:
>>>> >>>>
>>>> >>>> To clarify: "provided" should have been in the test runtime
>>>> configuration, but not in the shipped runtime configuration (otherwise dep
>>>> resolution for users would pull in provided deps, which should not happen)
>>>> >>>>
>>>> >>>> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com>
>>>> wrote:
>>>> >>>>>
>>>> >>>>> During the migration to Gradle 7[1] the propdeps plugin was
>>>> removed[2] since there wasn't a newer version that was compatible with
>>>> Gradle 7 and a replacement couldn't be found. All existing usages of
>>>> "provided" were moved to "compileOnly" and "compileOnly" is being mapped to
>>>> the "provided" maven scope in the generated pom files. This has lead to two
>>>> issues:
>>>> >>>>> 1) provided was also part of the runtime configuration, so we are
>>>> getting a few class not found exceptions when running tests [3]
>>>> >>>>> 2) the generated pom.xml will have a bunch of compile time only
>>>> annotations added as a provided dependency in the generated pom files[4]
>>>> >>>>>
>>>> >>>>> #1 can be fixed by adding the dependency to both the
>>>> "compileOnly" and "runtimeOnly" configurations or by adding dependency to
>>>> the "implementation" configuration
>>>> >>>>> #2 will make the pom files messier which can lead to confusion
>>>> for users but shouldn't impact existing uses.
>>>> >>>>>
>>>> >>>>> There was a suggestion[4] to completely remove the usage of
>>>> provided from the generated pom.xml and have all our previously "provided"
>>>> dependencies declared as "implementation" allowing us to solve both #1 and
>>>> #2 above.
>>>> >>>>>
>>>> >>>>> The largest usage of "provided" in the past was to packages
>>>> related to the hadoop ecosystem and afterwards it was for packages such as
>>>> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
>>>> the module but can provide additional features if the dependency exists.
>>>> >>>>>
>>>> >>>>> What should we migrate if anything to the "implementation"
>>>> configuration or should we try to recreate what we were doing with the
>>>> "provided" configuration in the past?
>>>> >>>>>
>>>> >>>>> 1: https://issues.apache.org/jira/browse/BEAM-13430
>>>> >>>>> 2: https://github.com/apache/beam/pull/16308
>>>> >>>>> 3: https://issues.apache.org/jira/browse/BEAM-13569
>>>> >>>>> 4:
>>>> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
>>>> >>>>> 5: https://issues.apache.org/jira/browse/BEAM-13504
>>>> >>>>>
>>>>
>>>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Brian Hulette <bh...@google.com>.
> But I believe the issue is that some tooling will bundle up the "compile"
dependencies and submit with the job, which will then have a conflict with
the libraries on the cluster.

Do you know specifically what tooling does this? Also I'm not clear what
you mean by "cluster", is this referring to remote workers?

On Tue, Jan 18, 2022 at 6:11 AM Kenneth Knowles <ke...@apache.org> wrote:

>
>
> On Fri, Jan 14, 2022 at 9:34 AM Daniel Collins <dp...@google.com>
> wrote:
>
>> > In particular the Hadoop/Spark and Kafka dependencies must be
>> **provided** as they were. I am not sure of others but those three matter.
>>
>> I think there's a bit of a difference here between what should be the
>> state in the short term versus the long term.
>>
>> In the short term, I agree that we should avoid changes to how these
>> dependencies are reflected in the POM.
>>
>> In the long term, I don't think it makes sense for these to continue to
>> be "provided" dependencies- if users wish to use a different version of
>> hadoop, spark or kafka, they can explicitly override the dependencies with
>> the version they want when building their JAR, even if there is a version
>> listed as "compile" in the POM file on maven central. The only difference
>> is that if they don't have a version preference, the one listed in the POM
>> (that we tested with) will be used, which seems like an unambiguous win to
>> me.
>>
>
> Agree with the sentiment. But I believe the issue is that some tooling
> will bundle up the "compile" dependencies and submit with the job, which
> will then have a conflict with the libraries on the cluster. On the other
> hand, the user will always want to override the "provided" version to match
> the cluster, in which case it will just be harmless duplicates on the
> classpath, no? I guess huge file size, but it isn't the 90s any more. Since
> Ismaël commented, maybe he can help to clarify. I also knew about this
> reasoning for Spark & Hadoop but I don't know exactly what is required to
> make it work right.
>
> This could become a bothersome issue long term - Gradle dev community has
> lots of posts that indicate they don't agree with the existence of
> "provided" or "optional" dependencies. (I happen to agree with them, but
> philosophy is not the point). We should have a very clear solution for the
> cases that require one, and document at least on the wiki.
>
> Kenn
>
>
>>
>> -Daniel
>>
>> On Thu, Jan 13, 2022 at 4:19 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> Optional dependencies should not be a major issue.
>>>
>>> What matters to validate that we are not breaking users is to compare
>>> the generated POM files with the previous (pre gradle 7 / 2.35.0)
>>> version and see that what was provided is still provided.
>>>
>>> In particular the Hadoop/Spark and Kafka dependencies must be
>>> **provided** as they were. I am not sure of others but those three
>>> matter.
>>>
>>> Ismaël
>>>
>>> On Wed, Jan 12, 2022 at 10:55 PM Emily Ye <em...@google.com> wrote:
>>> >
>>> > We've chatted offline and have a tentative plan for what to do with
>>> these dependencies that are currently marked as compileOnly (instead of
>>> provided). Please review the list if possible [1].
>>> >
>>> > Two projects we aren't sure about:
>>> >
>>> > :sdks:java:io:hcatalog
>>> >
>>> > library.java.jackson_annotations
>>> > library.java.jackson_core
>>> > library.java.jackson_databind
>>> > library.java.hadoop_common
>>> > org.apache.hive:hive-exec
>>> > org.apache.hive.hcatalog:hive-hcatalog-core
>>> >
>>> > :sdks:java:io:parquet
>>> >
>>> > library.java.hadoop_client
>>> >
>>> >
>>> > Does anyone have experience with either of these IOs? ccing Chamikara
>>> >
>>> > Thank you,
>>> > Emily
>>> >
>>> >
>>> > [1]
>>> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>>> >
>>> > On Tue, Jan 11, 2022 at 6:38 PM Emily Ye <em...@google.com> wrote:
>>> >>
>>> >> As the person volunteering to do fixes for this to unblock Beam
>>> 2.36.0, I created a spreadsheet of the projects with dependencies changed
>>> from provided to compile only [1]. I pre-filled with what I think things
>>> should be, but I don't have very much background in java/maven/gradle
>>> configurations so please give input!
>>> >>
>>> >> Some (mainly hadoop/kafka) I left blank, since I'm not sure - do we
>>> keep them provided because it depends on the user's version?
>>> >>
>>> >> [1]
>>> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>>> >>
>>> >> On Tue, Jan 11, 2022 at 1:17 PM Luke Cwik <lc...@google.com> wrote:
>>> >>>
>>> >>> I'm not following what you're trying to say Kenn since provided in
>>> maven requires the user to explicitly add the dependency themselves to have
>>> it part of their runtime.
>>> >>>
>>> >>> As per
>>> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
>>> >>> "
>>> >>> * provided
>>> >>> This is much like compile, but indicates you expect the JDK or a
>>> container to provide the dependency at runtime. For example, when building
>>> a web application for the Java Enterprise Edition, you would set the
>>> dependency on the Servlet API and related Java EE APIs to scope provided
>>> because the web container provides those classes. A dependency with this
>>> scope is added to the classpath used for compilation and test, but not the
>>> runtime classpath. It is not transitive."
>>> >>>
>>> >>> On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org>
>>> wrote:
>>> >>>>
>>> >>>> To clarify: "provided" should have been in the test runtime
>>> configuration, but not in the shipped runtime configuration (otherwise dep
>>> resolution for users would pull in provided deps, which should not happen)
>>> >>>>
>>> >>>> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com>
>>> wrote:
>>> >>>>>
>>> >>>>> During the migration to Gradle 7[1] the propdeps plugin was
>>> removed[2] since there wasn't a newer version that was compatible with
>>> Gradle 7 and a replacement couldn't be found. All existing usages of
>>> "provided" were moved to "compileOnly" and "compileOnly" is being mapped to
>>> the "provided" maven scope in the generated pom files. This has lead to two
>>> issues:
>>> >>>>> 1) provided was also part of the runtime configuration, so we are
>>> getting a few class not found exceptions when running tests [3]
>>> >>>>> 2) the generated pom.xml will have a bunch of compile time only
>>> annotations added as a provided dependency in the generated pom files[4]
>>> >>>>>
>>> >>>>> #1 can be fixed by adding the dependency to both the "compileOnly"
>>> and "runtimeOnly" configurations or by adding dependency to the
>>> "implementation" configuration
>>> >>>>> #2 will make the pom files messier which can lead to confusion for
>>> users but shouldn't impact existing uses.
>>> >>>>>
>>> >>>>> There was a suggestion[4] to completely remove the usage of
>>> provided from the generated pom.xml and have all our previously "provided"
>>> dependencies declared as "implementation" allowing us to solve both #1 and
>>> #2 above.
>>> >>>>>
>>> >>>>> The largest usage of "provided" in the past was to packages
>>> related to the hadoop ecosystem and afterwards it was for packages such as
>>> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
>>> the module but can provide additional features if the dependency exists.
>>> >>>>>
>>> >>>>> What should we migrate if anything to the "implementation"
>>> configuration or should we try to recreate what we were doing with the
>>> "provided" configuration in the past?
>>> >>>>>
>>> >>>>> 1: https://issues.apache.org/jira/browse/BEAM-13430
>>> >>>>> 2: https://github.com/apache/beam/pull/16308
>>> >>>>> 3: https://issues.apache.org/jira/browse/BEAM-13569
>>> >>>>> 4:
>>> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
>>> >>>>> 5: https://issues.apache.org/jira/browse/BEAM-13504
>>> >>>>>
>>>
>>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Kenneth Knowles <ke...@apache.org>.
On Fri, Jan 14, 2022 at 9:34 AM Daniel Collins <dp...@google.com> wrote:

> > In particular the Hadoop/Spark and Kafka dependencies must be
> **provided** as they were. I am not sure of others but those three matter.
>
> I think there's a bit of a difference here between what should be the
> state in the short term versus the long term.
>
> In the short term, I agree that we should avoid changes to how these
> dependencies are reflected in the POM.
>
> In the long term, I don't think it makes sense for these to continue to be
> "provided" dependencies- if users wish to use a different version of
> hadoop, spark or kafka, they can explicitly override the dependencies with
> the version they want when building their JAR, even if there is a version
> listed as "compile" in the POM file on maven central. The only difference
> is that if they don't have a version preference, the one listed in the POM
> (that we tested with) will be used, which seems like an unambiguous win to
> me.
>

Agree with the sentiment. But I believe the issue is that some tooling will
bundle up the "compile" dependencies and submit with the job, which will
then have a conflict with the libraries on the cluster. On the other hand,
the user will always want to override the "provided" version to match the
cluster, in which case it will just be harmless duplicates on the
classpath, no? I guess huge file size, but it isn't the 90s any more. Since
Ismaël commented, maybe he can help to clarify. I also knew about this
reasoning for Spark & Hadoop but I don't know exactly what is required to
make it work right.

This could become a bothersome issue long term - Gradle dev community has
lots of posts that indicate they don't agree with the existence of
"provided" or "optional" dependencies. (I happen to agree with them, but
philosophy is not the point). We should have a very clear solution for the
cases that require one, and document at least on the wiki.

Kenn


>
> -Daniel
>
> On Thu, Jan 13, 2022 at 4:19 PM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> Optional dependencies should not be a major issue.
>>
>> What matters to validate that we are not breaking users is to compare
>> the generated POM files with the previous (pre gradle 7 / 2.35.0)
>> version and see that what was provided is still provided.
>>
>> In particular the Hadoop/Spark and Kafka dependencies must be
>> **provided** as they were. I am not sure of others but those three
>> matter.
>>
>> Ismaël
>>
>> On Wed, Jan 12, 2022 at 10:55 PM Emily Ye <em...@google.com> wrote:
>> >
>> > We've chatted offline and have a tentative plan for what to do with
>> these dependencies that are currently marked as compileOnly (instead of
>> provided). Please review the list if possible [1].
>> >
>> > Two projects we aren't sure about:
>> >
>> > :sdks:java:io:hcatalog
>> >
>> > library.java.jackson_annotations
>> > library.java.jackson_core
>> > library.java.jackson_databind
>> > library.java.hadoop_common
>> > org.apache.hive:hive-exec
>> > org.apache.hive.hcatalog:hive-hcatalog-core
>> >
>> > :sdks:java:io:parquet
>> >
>> > library.java.hadoop_client
>> >
>> >
>> > Does anyone have experience with either of these IOs? ccing Chamikara
>> >
>> > Thank you,
>> > Emily
>> >
>> >
>> > [1]
>> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>> >
>> > On Tue, Jan 11, 2022 at 6:38 PM Emily Ye <em...@google.com> wrote:
>> >>
>> >> As the person volunteering to do fixes for this to unblock Beam
>> 2.36.0, I created a spreadsheet of the projects with dependencies changed
>> from provided to compile only [1]. I pre-filled with what I think things
>> should be, but I don't have very much background in java/maven/gradle
>> configurations so please give input!
>> >>
>> >> Some (mainly hadoop/kafka) I left blank, since I'm not sure - do we
>> keep them provided because it depends on the user's version?
>> >>
>> >> [1]
>> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>> >>
>> >> On Tue, Jan 11, 2022 at 1:17 PM Luke Cwik <lc...@google.com> wrote:
>> >>>
>> >>> I'm not following what you're trying to say Kenn since provided in
>> maven requires the user to explicitly add the dependency themselves to have
>> it part of their runtime.
>> >>>
>> >>> As per
>> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
>> >>> "
>> >>> * provided
>> >>> This is much like compile, but indicates you expect the JDK or a
>> container to provide the dependency at runtime. For example, when building
>> a web application for the Java Enterprise Edition, you would set the
>> dependency on the Servlet API and related Java EE APIs to scope provided
>> because the web container provides those classes. A dependency with this
>> scope is added to the classpath used for compilation and test, but not the
>> runtime classpath. It is not transitive."
>> >>>
>> >>> On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org>
>> wrote:
>> >>>>
>> >>>> To clarify: "provided" should have been in the test runtime
>> configuration, but not in the shipped runtime configuration (otherwise dep
>> resolution for users would pull in provided deps, which should not happen)
>> >>>>
>> >>>> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com> wrote:
>> >>>>>
>> >>>>> During the migration to Gradle 7[1] the propdeps plugin was
>> removed[2] since there wasn't a newer version that was compatible with
>> Gradle 7 and a replacement couldn't be found. All existing usages of
>> "provided" were moved to "compileOnly" and "compileOnly" is being mapped to
>> the "provided" maven scope in the generated pom files. This has lead to two
>> issues:
>> >>>>> 1) provided was also part of the runtime configuration, so we are
>> getting a few class not found exceptions when running tests [3]
>> >>>>> 2) the generated pom.xml will have a bunch of compile time only
>> annotations added as a provided dependency in the generated pom files[4]
>> >>>>>
>> >>>>> #1 can be fixed by adding the dependency to both the "compileOnly"
>> and "runtimeOnly" configurations or by adding dependency to the
>> "implementation" configuration
>> >>>>> #2 will make the pom files messier which can lead to confusion for
>> users but shouldn't impact existing uses.
>> >>>>>
>> >>>>> There was a suggestion[4] to completely remove the usage of
>> provided from the generated pom.xml and have all our previously "provided"
>> dependencies declared as "implementation" allowing us to solve both #1 and
>> #2 above.
>> >>>>>
>> >>>>> The largest usage of "provided" in the past was to packages related
>> to the hadoop ecosystem and afterwards it was for packages such as
>> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
>> the module but can provide additional features if the dependency exists.
>> >>>>>
>> >>>>> What should we migrate if anything to the "implementation"
>> configuration or should we try to recreate what we were doing with the
>> "provided" configuration in the past?
>> >>>>>
>> >>>>> 1: https://issues.apache.org/jira/browse/BEAM-13430
>> >>>>> 2: https://github.com/apache/beam/pull/16308
>> >>>>> 3: https://issues.apache.org/jira/browse/BEAM-13569
>> >>>>> 4:
>> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
>> >>>>> 5: https://issues.apache.org/jira/browse/BEAM-13504
>> >>>>>
>>
>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Daniel Collins <dp...@google.com>.
> In particular the Hadoop/Spark and Kafka dependencies must be
**provided** as they were. I am not sure of others but those three matter.

I think there's a bit of a difference here between what should be the state
in the short term versus the long term.

In the short term, I agree that we should avoid changes to how these
dependencies are reflected in the POM.

In the long term, I don't think it makes sense for these to continue to be
"provided" dependencies- if users wish to use a different version of
hadoop, spark or kafka, they can explicitly override the dependencies with
the version they want when building their JAR, even if there is a version
listed as "compile" in the POM file on maven central. The only difference
is that if they don't have a version preference, the one listed in the POM
(that we tested with) will be used, which seems like an unambiguous win to
me.

-Daniel

On Thu, Jan 13, 2022 at 4:19 PM Ismaël Mejía <ie...@gmail.com> wrote:

> Optional dependencies should not be a major issue.
>
> What matters to validate that we are not breaking users is to compare
> the generated POM files with the previous (pre gradle 7 / 2.35.0)
> version and see that what was provided is still provided.
>
> In particular the Hadoop/Spark and Kafka dependencies must be
> **provided** as they were. I am not sure of others but those three
> matter.
>
> Ismaël
>
> On Wed, Jan 12, 2022 at 10:55 PM Emily Ye <em...@google.com> wrote:
> >
> > We've chatted offline and have a tentative plan for what to do with
> these dependencies that are currently marked as compileOnly (instead of
> provided). Please review the list if possible [1].
> >
> > Two projects we aren't sure about:
> >
> > :sdks:java:io:hcatalog
> >
> > library.java.jackson_annotations
> > library.java.jackson_core
> > library.java.jackson_databind
> > library.java.hadoop_common
> > org.apache.hive:hive-exec
> > org.apache.hive.hcatalog:hive-hcatalog-core
> >
> > :sdks:java:io:parquet
> >
> > library.java.hadoop_client
> >
> >
> > Does anyone have experience with either of these IOs? ccing Chamikara
> >
> > Thank you,
> > Emily
> >
> >
> > [1]
> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
> >
> > On Tue, Jan 11, 2022 at 6:38 PM Emily Ye <em...@google.com> wrote:
> >>
> >> As the person volunteering to do fixes for this to unblock Beam 2.36.0,
> I created a spreadsheet of the projects with dependencies changed from
> provided to compile only [1]. I pre-filled with what I think things should
> be, but I don't have very much background in java/maven/gradle
> configurations so please give input!
> >>
> >> Some (mainly hadoop/kafka) I left blank, since I'm not sure - do we
> keep them provided because it depends on the user's version?
> >>
> >> [1]
> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
> >>
> >> On Tue, Jan 11, 2022 at 1:17 PM Luke Cwik <lc...@google.com> wrote:
> >>>
> >>> I'm not following what you're trying to say Kenn since provided in
> maven requires the user to explicitly add the dependency themselves to have
> it part of their runtime.
> >>>
> >>> As per
> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
> >>> "
> >>> * provided
> >>> This is much like compile, but indicates you expect the JDK or a
> container to provide the dependency at runtime. For example, when building
> a web application for the Java Enterprise Edition, you would set the
> dependency on the Servlet API and related Java EE APIs to scope provided
> because the web container provides those classes. A dependency with this
> scope is added to the classpath used for compilation and test, but not the
> runtime classpath. It is not transitive."
> >>>
> >>> On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org>
> wrote:
> >>>>
> >>>> To clarify: "provided" should have been in the test runtime
> configuration, but not in the shipped runtime configuration (otherwise dep
> resolution for users would pull in provided deps, which should not happen)
> >>>>
> >>>> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com> wrote:
> >>>>>
> >>>>> During the migration to Gradle 7[1] the propdeps plugin was
> removed[2] since there wasn't a newer version that was compatible with
> Gradle 7 and a replacement couldn't be found. All existing usages of
> "provided" were moved to "compileOnly" and "compileOnly" is being mapped to
> the "provided" maven scope in the generated pom files. This has lead to two
> issues:
> >>>>> 1) provided was also part of the runtime configuration, so we are
> getting a few class not found exceptions when running tests [3]
> >>>>> 2) the generated pom.xml will have a bunch of compile time only
> annotations added as a provided dependency in the generated pom files[4]
> >>>>>
> >>>>> #1 can be fixed by adding the dependency to both the "compileOnly"
> and "runtimeOnly" configurations or by adding dependency to the
> "implementation" configuration
> >>>>> #2 will make the pom files messier which can lead to confusion for
> users but shouldn't impact existing uses.
> >>>>>
> >>>>> There was a suggestion[4] to completely remove the usage of provided
> from the generated pom.xml and have all our previously "provided"
> dependencies declared as "implementation" allowing us to solve both #1 and
> #2 above.
> >>>>>
> >>>>> The largest usage of "provided" in the past was to packages related
> to the hadoop ecosystem and afterwards it was for packages such as
> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
> the module but can provide additional features if the dependency exists.
> >>>>>
> >>>>> What should we migrate if anything to the "implementation"
> configuration or should we try to recreate what we were doing with the
> "provided" configuration in the past?
> >>>>>
> >>>>> 1: https://issues.apache.org/jira/browse/BEAM-13430
> >>>>> 2: https://github.com/apache/beam/pull/16308
> >>>>> 3: https://issues.apache.org/jira/browse/BEAM-13569
> >>>>> 4:
> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
> >>>>> 5: https://issues.apache.org/jira/browse/BEAM-13504
> >>>>>
>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Ismaël Mejía <ie...@gmail.com>.
Optional dependencies should not be a major issue.

What matters to validate that we are not breaking users is to compare
the generated POM files with the previous (pre gradle 7 / 2.35.0)
version and see that what was provided is still provided.

In particular the Hadoop/Spark and Kafka dependencies must be
**provided** as they were. I am not sure of others but those three
matter.

Ismaël

On Wed, Jan 12, 2022 at 10:55 PM Emily Ye <em...@google.com> wrote:
>
> We've chatted offline and have a tentative plan for what to do with these dependencies that are currently marked as compileOnly (instead of provided). Please review the list if possible [1].
>
> Two projects we aren't sure about:
>
> :sdks:java:io:hcatalog
>
> library.java.jackson_annotations
> library.java.jackson_core
> library.java.jackson_databind
> library.java.hadoop_common
> org.apache.hive:hive-exec
> org.apache.hive.hcatalog:hive-hcatalog-core
>
> :sdks:java:io:parquet
>
> library.java.hadoop_client
>
>
> Does anyone have experience with either of these IOs? ccing Chamikara
>
> Thank you,
> Emily
>
>
> [1] https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>
> On Tue, Jan 11, 2022 at 6:38 PM Emily Ye <em...@google.com> wrote:
>>
>> As the person volunteering to do fixes for this to unblock Beam 2.36.0, I created a spreadsheet of the projects with dependencies changed from provided to compile only [1]. I pre-filled with what I think things should be, but I don't have very much background in java/maven/gradle configurations so please give input!
>>
>> Some (mainly hadoop/kafka) I left blank, since I'm not sure - do we keep them provided because it depends on the user's version?
>>
>> [1] https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>>
>> On Tue, Jan 11, 2022 at 1:17 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>> I'm not following what you're trying to say Kenn since provided in maven requires the user to explicitly add the dependency themselves to have it part of their runtime.
>>>
>>> As per https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
>>> "
>>> * provided
>>> This is much like compile, but indicates you expect the JDK or a container to provide the dependency at runtime. For example, when building a web application for the Java Enterprise Edition, you would set the dependency on the Servlet API and related Java EE APIs to scope provided because the web container provides those classes. A dependency with this scope is added to the classpath used for compilation and test, but not the runtime classpath. It is not transitive."
>>>
>>> On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org> wrote:
>>>>
>>>> To clarify: "provided" should have been in the test runtime configuration, but not in the shipped runtime configuration (otherwise dep resolution for users would pull in provided deps, which should not happen)
>>>>
>>>> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com> wrote:
>>>>>
>>>>> During the migration to Gradle 7[1] the propdeps plugin was removed[2] since there wasn't a newer version that was compatible with Gradle 7 and a replacement couldn't be found. All existing usages of "provided" were moved to "compileOnly" and "compileOnly" is being mapped to the "provided" maven scope in the generated pom files. This has lead to two issues:
>>>>> 1) provided was also part of the runtime configuration, so we are getting a few class not found exceptions when running tests [3]
>>>>> 2) the generated pom.xml will have a bunch of compile time only annotations added as a provided dependency in the generated pom files[4]
>>>>>
>>>>> #1 can be fixed by adding the dependency to both the "compileOnly" and "runtimeOnly" configurations or by adding dependency to the "implementation" configuration
>>>>> #2 will make the pom files messier which can lead to confusion for users but shouldn't impact existing uses.
>>>>>
>>>>> There was a suggestion[4] to completely remove the usage of provided from the generated pom.xml and have all our previously "provided" dependencies declared as "implementation" allowing us to solve both #1 and #2 above.
>>>>>
>>>>> The largest usage of "provided" in the past was to packages related to the hadoop ecosystem and afterwards it was for packages such as junit/hamcrest/aircompressor in sdks/java/core which aren't required to use the module but can provide additional features if the dependency exists.
>>>>>
>>>>> What should we migrate if anything to the "implementation" configuration or should we try to recreate what we were doing with the "provided" configuration in the past?
>>>>>
>>>>> 1: https://issues.apache.org/jira/browse/BEAM-13430
>>>>> 2: https://github.com/apache/beam/pull/16308
>>>>> 3: https://issues.apache.org/jira/browse/BEAM-13569
>>>>> 4: https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
>>>>> 5: https://issues.apache.org/jira/browse/BEAM-13504
>>>>>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Emily Ye <em...@google.com>.
We've chatted offline and have a tentative plan for what to do with these
dependencies that are currently marked as compileOnly (instead of
provided). Please review the list if possible [1].

Two projects we aren't sure about:

   - :sdks:java:io:hcatalog
      - library.java.jackson_annotations
      - library.java.jackson_core
      - library.java.jackson_databind
      - library.java.hadoop_common
      - org.apache.hive:hive-exec
      - org.apache.hive.hcatalog:hive-hcatalog-core
      - :sdks:java:io:parquet
      - library.java.hadoop_client


Does anyone have experience with either of these IOs? ccing Chamikara

Thank you,
Emily


[1]
https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing

On Tue, Jan 11, 2022 at 6:38 PM Emily Ye <em...@google.com> wrote:

> As the person volunteering to do fixes for this to unblock Beam 2.36.0, I
> created a spreadsheet of the projects with dependencies changed from
> provided to compile only [1]. I pre-filled with what I think things should
> be, but I don't have very much background in java/maven/gradle
> configurations so please give input!
>
> Some (mainly hadoop/kafka) I left blank, since I'm not sure - do we keep
> them provided because it depends on the user's version?
>
> [1]
> https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing
>
> On Tue, Jan 11, 2022 at 1:17 PM Luke Cwik <lc...@google.com> wrote:
>
>> I'm not following what you're trying to say Kenn since provided in maven
>> requires the user to explicitly add the dependency themselves to have it
>> part of their runtime.
>>
>> As per
>> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
>> "
>> * provided
>> This is much like compile, but indicates you expect the JDK or a
>> container to provide the dependency at runtime. For example, when building
>> a web application for the Java Enterprise Edition, you would set the
>> dependency on the Servlet API and related Java EE APIs to scope provided
>> because the web container provides those classes. A dependency with this
>> scope is added to the classpath used for compilation and test, but not the
>> runtime classpath. It is not transitive."
>>
>> On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> To clarify: "provided" should have been in the test runtime
>>> configuration, but not in the shipped runtime configuration (otherwise dep
>>> resolution for users would pull in provided deps, which should not happen)
>>>
>>> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> During the migration to Gradle 7[1] the propdeps plugin was removed[2]
>>>> since there wasn't a newer version that was compatible with Gradle 7 and a
>>>> replacement couldn't be found. All existing usages of "provided" were moved
>>>> to "compileOnly" and "compileOnly" is being mapped to the "provided" maven
>>>> scope in the generated pom files. This has lead to two issues:
>>>> 1) provided was also part of the runtime configuration, so we are
>>>> getting a few class not found exceptions when running tests [3]
>>>> 2) the generated pom.xml will have a bunch of compile time only
>>>> annotations added as a provided dependency in the generated pom files[4]
>>>>
>>>> #1 can be fixed by adding the dependency to both the "compileOnly" and
>>>> "runtimeOnly" configurations or by adding dependency to the
>>>> "implementation" configuration
>>>> #2 will make the pom files messier which can lead to confusion for
>>>> users but shouldn't impact existing uses.
>>>>
>>>> There was a suggestion[4] to completely remove the usage of provided
>>>> from the generated pom.xml and have all our previously "provided"
>>>> dependencies declared as "implementation" allowing us to solve both #1 and
>>>> #2 above.
>>>>
>>>> The largest usage of "provided" in the past was to packages related to
>>>> the hadoop ecosystem and afterwards it was for packages such as
>>>> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
>>>> the module but can provide additional features if the dependency exists.
>>>>
>>>> What should we migrate if anything to the "implementation"
>>>> configuration or should we try to recreate what we were doing with the
>>>> "provided" configuration in the past?
>>>>
>>>> 1: https://issues.apache.org/jira/browse/BEAM-13430
>>>> 2: https://github.com/apache/beam/pull/16308
>>>> 3: https://issues.apache.org/jira/browse/BEAM-13569
>>>> 4:
>>>> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
>>>> 5: https://issues.apache.org/jira/browse/BEAM-13504
>>>>
>>>>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Emily Ye <em...@google.com>.
As the person volunteering to do fixes for this to unblock Beam 2.36.0, I
created a spreadsheet of the projects with dependencies changed from
provided to compile only [1]. I pre-filled with what I think things should
be, but I don't have very much background in java/maven/gradle
configurations so please give input!

Some (mainly hadoop/kafka) I left blank, since I'm not sure - do we keep
them provided because it depends on the user's version?

[1]
https://docs.google.com/spreadsheets/d/1UpeQtx1PoAgeSmpKxZC9lv3B9G1c7cryW3iICfRtG1o/edit?usp=sharing

On Tue, Jan 11, 2022 at 1:17 PM Luke Cwik <lc...@google.com> wrote:

> I'm not following what you're trying to say Kenn since provided in maven
> requires the user to explicitly add the dependency themselves to have it
> part of their runtime.
>
> As per
> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
> "
> * provided
> This is much like compile, but indicates you expect the JDK or a container
> to provide the dependency at runtime. For example, when building a web
> application for the Java Enterprise Edition, you would set the dependency
> on the Servlet API and related Java EE APIs to scope provided because the
> web container provides those classes. A dependency with this scope is added
> to the classpath used for compilation and test, but not the runtime
> classpath. It is not transitive."
>
> On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org> wrote:
>
>> To clarify: "provided" should have been in the test runtime
>> configuration, but not in the shipped runtime configuration (otherwise dep
>> resolution for users would pull in provided deps, which should not happen)
>>
>> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com> wrote:
>>
>>> During the migration to Gradle 7[1] the propdeps plugin was removed[2]
>>> since there wasn't a newer version that was compatible with Gradle 7 and a
>>> replacement couldn't be found. All existing usages of "provided" were moved
>>> to "compileOnly" and "compileOnly" is being mapped to the "provided" maven
>>> scope in the generated pom files. This has lead to two issues:
>>> 1) provided was also part of the runtime configuration, so we are
>>> getting a few class not found exceptions when running tests [3]
>>> 2) the generated pom.xml will have a bunch of compile time only
>>> annotations added as a provided dependency in the generated pom files[4]
>>>
>>> #1 can be fixed by adding the dependency to both the "compileOnly" and
>>> "runtimeOnly" configurations or by adding dependency to the
>>> "implementation" configuration
>>> #2 will make the pom files messier which can lead to confusion for users
>>> but shouldn't impact existing uses.
>>>
>>> There was a suggestion[4] to completely remove the usage of provided
>>> from the generated pom.xml and have all our previously "provided"
>>> dependencies declared as "implementation" allowing us to solve both #1 and
>>> #2 above.
>>>
>>> The largest usage of "provided" in the past was to packages related to
>>> the hadoop ecosystem and afterwards it was for packages such as
>>> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
>>> the module but can provide additional features if the dependency exists.
>>>
>>> What should we migrate if anything to the "implementation" configuration
>>> or should we try to recreate what we were doing with the "provided"
>>> configuration in the past?
>>>
>>> 1: https://issues.apache.org/jira/browse/BEAM-13430
>>> 2: https://github.com/apache/beam/pull/16308
>>> 3: https://issues.apache.org/jira/browse/BEAM-13569
>>> 4:
>>> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
>>> 5: https://issues.apache.org/jira/browse/BEAM-13504
>>>
>>>

Re: [DISCUSS] propdeps removal and what to do going forward

Posted by Luke Cwik <lc...@google.com>.
I'm not following what you're trying to say Kenn since provided in maven
requires the user to explicitly add the dependency themselves to have it
part of their runtime.

As per
https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#dependency-scope
"
* provided
This is much like compile, but indicates you expect the JDK or a container
to provide the dependency at runtime. For example, when building a web
application for the Java Enterprise Edition, you would set the dependency
on the Servlet API and related Java EE APIs to scope provided because the
web container provides those classes. A dependency with this scope is added
to the classpath used for compilation and test, but not the runtime
classpath. It is not transitive."

On Tue, Jan 11, 2022 at 11:54 AM Kenneth Knowles <ke...@apache.org> wrote:

> To clarify: "provided" should have been in the test runtime configuration,
> but not in the shipped runtime configuration (otherwise dep resolution for
> users would pull in provided deps, which should not happen)
>
> On Thu, Dec 30, 2021 at 10:05 AM Luke Cwik <lc...@google.com> wrote:
>
>> During the migration to Gradle 7[1] the propdeps plugin was removed[2]
>> since there wasn't a newer version that was compatible with Gradle 7 and a
>> replacement couldn't be found. All existing usages of "provided" were moved
>> to "compileOnly" and "compileOnly" is being mapped to the "provided" maven
>> scope in the generated pom files. This has lead to two issues:
>> 1) provided was also part of the runtime configuration, so we are getting
>> a few class not found exceptions when running tests [3]
>> 2) the generated pom.xml will have a bunch of compile time only
>> annotations added as a provided dependency in the generated pom files[4]
>>
>> #1 can be fixed by adding the dependency to both the "compileOnly" and
>> "runtimeOnly" configurations or by adding dependency to the
>> "implementation" configuration
>> #2 will make the pom files messier which can lead to confusion for users
>> but shouldn't impact existing uses.
>>
>> There was a suggestion[4] to completely remove the usage of provided from
>> the generated pom.xml and have all our previously "provided" dependencies
>> declared as "implementation" allowing us to solve both #1 and #2 above.
>>
>> The largest usage of "provided" in the past was to packages related to
>> the hadoop ecosystem and afterwards it was for packages such as
>> junit/hamcrest/aircompressor in sdks/java/core which aren't required to use
>> the module but can provide additional features if the dependency exists.
>>
>> What should we migrate if anything to the "implementation" configuration
>> or should we try to recreate what we were doing with the "provided"
>> configuration in the past?
>>
>> 1: https://issues.apache.org/jira/browse/BEAM-13430
>> 2: https://github.com/apache/beam/pull/16308
>> 3: https://issues.apache.org/jira/browse/BEAM-13569
>> 4:
>> https://github.com/apache/beam/blob/fe456b79419d1a67ebf13d7d4b6695fa1aa6204d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L964
>> 5: https://issues.apache.org/jira/browse/BEAM-13504
>>
>>