Posted to github@beam.apache.org by "Abacn (via GitHub)" <gi...@apache.org> on 2024/02/08 22:27:08 UTC

[PR] Add an automatic GCP-BOM dependency upgrader [beam]

Abacn opened a new pull request, #30262:
URL: https://github.com/apache/beam/pull/30262

   Command: `python scripts/tools/gcpbomupgrader.py 26.31.0`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "shunping (via GitHub)" <gi...@apache.org>.
shunping commented on PR #30262:
URL: https://github.com/apache/beam/pull/30262#issuecomment-1944419755

   LGTM!




Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #30262:
URL: https://github.com/apache/beam/pull/30262#issuecomment-1939431015

   > The code looks good. I am also wondering if we can use your script to add a presubmit test, so if the version is not right, the update cannot be submitted.
   > 
   > This may require changing your script to support two modes: in-place updating (which is already implemented) and report-only. A new test could then call the script in report-only mode and check whether there is any version mismatch.
   > 
   > WDYT?
   
   This sounds good, and is similar to .github/workflows/update_python_dependencies.yml. We could set up a "test" that either generates a PR or fails, like that workflow.
   
   In practice, however, we already have many tests, and some infra-related workflows have been red for months without anyone looking at them, so I am not sure of the actual benefit. Within the scope of this PR, I do not intend to set up a test.




Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "shunping (via GitHub)" <gi...@apache.org>.
shunping commented on code in PR #30262:
URL: https://github.com/apache/beam/pull/30262#discussion_r1484421760


##########
scripts/tools/gcpbomupgrader.py:
##########
@@ -0,0 +1,212 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import errno
+import logging
+import os
+import re
+import subprocess
+import sys
+"""
+This Python script is used for upgrading the GCP-BOM in BeamModulePlugin.
+Specifically, it
+
+1. preprocesses BeamModulePlugin.groovy to decide which dependencies need to be synced
+2. generates an empty Gradle project to fetch the exact target versions
+3. writes the new versions back to BeamModulePlugin.groovy
+
+There are a few reasons we need to declare the version numbers:
+1. To sync dependencies that are not included in GCP-BOM with those that are.
+  For example, "com.google.cloud:google-cloud-spanner" is included while "com.google.cloud:google-cloud-spanner:():test" is not
+2. Some Beam artifacts do not depend on GCP-BOM but use dependencies managed

Review Comment:
   Could you also document what we should change in BeamModulePlugin if we add a new dep that we want to pin and later manage with this tool (or at least without breaking the standardization)?
   
   There are two cases here:
   1) adding a new dep from a project that is already tracked in BeamModulePlugin, and 
   2) adding a new dep from a new project.  





Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #30262:
URL: https://github.com/apache/beam/pull/30262#issuecomment-1938941088

   R: @shunping ready for review now




Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #30262:
URL: https://github.com/apache/beam/pull/30262#discussion_r1484466292


##########
scripts/tools/gcpbomupgrader.py:
##########

Review Comment:
   Good point. I will definitely clean up this script by the time the PR is marked as ready for review. Currently it just serves as an "unofficial" tool.





Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn merged PR #30262:
URL: https://github.com/apache/beam/pull/30262




Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #30262:
URL: https://github.com/apache/beam/pull/30262#issuecomment-1935038311

   CC: @dhruvdua @shunping 




Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #30262:
URL: https://github.com/apache/beam/pull/30262#discussion_r1486375220


##########
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy:
##########
@@ -601,17 +604,17 @@ class BeamModulePlugin implements Plugin<Project> {
     def classgraph_version = "4.8.162"
     def dbcp2_version = "2.9.0"
     def errorprone_version = "2.10.0"
-    // Try to keep gax_version consistent with gax-grpc version in google_cloud_platform_libraries_bom
+    // [bomupgrader] determined by: com.google.api:gax, consistent with: google_cloud_platform_libraries_bom
     def gax_version = "2.41.0"
     def google_ads_version = "26.0.0"
     def google_clients_version = "2.0.0"
     def google_cloud_bigdataoss_version = "2.2.16"
-    // Try to keep google_cloud_spanner_version consistent with google_cloud_spanner_bom in google_cloud_platform_libraries_bom
+    // [bomupgrader] determined by: com.google.cloud:google-cloud-spanner, consistent with: google_cloud_platform_libraries_bom
     def google_cloud_spanner_version = "6.57.0"
     def google_code_gson_version = "2.10.1"
     def google_oauth_clients_version = "1.34.1"
-    // Try to keep grpc_version consistent with gRPC version in google_cloud_platform_libraries_bom
-    def grpc_version = "1.60.0"
+    // [bomupgrader] determined by: io.grpc:grpc-netty, consistent with: google_cloud_platform_libraries_bom
+    def grpc_version = "1.61.0"

Review Comment:
   We again had a grpc version mismatch after #30181 (my bad); running this script fixed it.
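
The `[bomupgrader]` tag comments in the diff above pair an artifact with the `def ..._version` line that follows. A minimal sketch of how such tags could be read is shown here. The `TAG` pattern and the `parse_tagged_versions` name are assumptions made for illustration, not the regex or function the PR actually uses; `VERSION` mirrors the `VERSION_STRING` pattern quoted elsewhere in this thread.

```python
import re

# Hypothetical pattern for the tag comment format shown in the diff.
TAG = re.compile(
    r'^\s*// \[bomupgrader\] determined by: (\S+), consistent with: (\w+)\s*$')
# Same shape as VERSION_STRING in the quoted script.
VERSION = re.compile(r'^\s*def (\w+)_version\s*=\s*[\'"](\S+)[\'"]')


def parse_tagged_versions(lines):
  """Map each tagged variable name to (artifact, current_version)."""
  found = {}
  for idx, line in enumerate(lines):
    tag = TAG.match(line)
    if not tag:
      continue
    if idx + 1 >= len(lines):
      raise RuntimeError("Tag comment is the last line of the file.")
    version = VERSION.search(lines[idx + 1])
    if not version:
      raise RuntimeError("Version definition not found after tag comment.")
    found[version.group(1)] = (tag.group(1), version.group(2))
  return found
```

Untagged definitions such as `google_ads_version` are skipped, so only versions that carry a tag are candidates for rewriting.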





Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #30262:
URL: https://github.com/apache/beam/pull/30262#issuecomment-1938944289

   Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control




Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "shunping (via GitHub)" <gi...@apache.org>.
shunping commented on code in PR #30262:
URL: https://github.com/apache/beam/pull/30262#discussion_r1484399792


##########
scripts/tools/gcpbomupgrader.py:
##########
@@ -0,0 +1,212 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import errno
+import logging
+import os
+import re
+import subprocess
+import sys
+"""
+This Python script is used for upgrading the GCP-BOM in BeamModulePlugin.
+Specifically, it
+
+1. preprocesses BeamModulePlugin.groovy to decide which dependencies need to be synced
+2. generates an empty Gradle project to fetch the exact target versions
+3. writes the new versions back to BeamModulePlugin.groovy
+
+There are a few reasons we need to declare the version numbers:
+1. To sync dependencies that are not included in GCP-BOM with those that are.
+  For example, "com.google.cloud:google-cloud-spanner" is included while "com.google.cloud:google-cloud-spanner:():test" is not
+2. Some Beam artifacts do not depend on GCP-BOM but use dependencies managed
+  by GCP-BOM.
+"""
+
+
+class BeamModulePluginProcessor:
+  # Known dependencies managed by GCP BOM and also used by Beam.
+  # We only need to have one dependency for each project to figure out the target version
+  KNOWN_DEPS = {
+      "arrow": "org.apache.arrow:arrow-memory-core",
+      "gax": "com.google.api:gax",
+      "google_cloud_spanner": "com.google.cloud:google-cloud-spanner",
+      "grpc":
+          "io.grpc:grpc-netty",  # use "grpc-netty" to pick up proper netty version
+      "netty": "io.netty:netty-transport",
+      "protobuf": "com.google.protobuf:protobuf-java"
+  }
+  # dependencies managed by GCP-BOM that used the dependencies in KNOWN_DEPS
+  # So we need to add it to the example project to get the version to sync
+  OTHER_CONSTRANTS = [
+    "com.google.cloud:google-cloud-bigquery"  # uses arrow
+  ]
+
+  # e.g. // Try to keep grpc_version consistent with gRPC version in google_cloud_platform_libraries_bom
+  ANCHOR = re.compile(
+      r'^\s*// Try to keep .+ consistent .+ google_cloud_platform_libraries_bom\s*$'
+  )
+  # e.g.  def grpc_version = "1.61.0"
+  VERSION_STRING = re.compile(
+      r'^\s*def (\w+)_version\s*=\s*[\'"](\S+)[\'"]')
+  BOM_VERSION_STRING = re.compile(
+      r'\s*google_cloud_platform_libraries_bom\s*:\s*[\'"]com\.google\.cloud:libraries-bom:([0-9\.]+)[\'"],?'
+  )
+  BUILD_DIR = 'build/dependencyResolver'
+  GRADLE_TEMPLATE = """
+plugins { id 'java' }
+repositories { mavenCentral() }
+dependencies {
+implementation platform('com.google.cloud:libraries-bom:%s')
+%s
+}
+configurations.implementation.canBeResolved = true
+"""
+
+  def __init__(
+      self,
+      bom_version,
+      filepath='buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy',
+      runnable=None):
+    self.bom_version = bom_version
+    self.filepath = filepath
+    self.runnable = runnable or os.path.abspath('gradlew')
+    logging.info('-----Read BeamModulePlugin-----')
+    with open(filepath, 'r') as fin:
+      self.original_lines = fin.readlines()
+    # e.g. {"io.grpc:grpc-netty": "1.61.0"}
+    self.dep_versions = {}
+    self.dep_versions_current = {}
+
+  def check_dependencies(self):
+    """Check dependencies in KNOWN_DEPS are found in BeamModulePlugin, and vice versa."""
+    logging.info("-----check dependency defs in BeamModulePlugin-----")
+    found_deps = {}
+    for idx, line in enumerate(self.original_lines):
+      m = self.ANCHOR.match(line)
+      if m:
+        n = self.VERSION_STRING.search(self.original_lines[idx + 1])
+        if not n:
+          raise RuntimeError(
+              "Version definition not found after anchor comment. Try standardize it."
+          )
+        found_deps[n.group(1)] = n.group(2)
+    assert sorted(self.KNOWN_DEPS.keys()) == sorted(found_deps.keys())
+    self.dep_versions_current = {
+        self.KNOWN_DEPS[k]: v for k, v in found_deps.items()
+    }
+
+  def prepare_gradle(self, bom_version):
+    logging.info("-----prepare build.gradle for example project-----")
+    try:
+      os.makedirs(self.BUILD_DIR)
+    except OSError as e:
+      if e.errno != errno.EEXIST:
+        raise
+
+    deps = []
+    for dep in list(self.KNOWN_DEPS.values()) + self.OTHER_CONSTRANTS:
+      deps.append(f"implementation '{dep}'")
+    gradle_file = self.GRADLE_TEMPLATE % (bom_version, "\n".join(deps))
+    with open(os.path.join(self.BUILD_DIR, 'build.gradle'), 'w') as fout:
+      fout.write(gradle_file)
+    # we need a settings.gradle
+    with open(os.path.join(self.BUILD_DIR, 'settings.gradle'), 'w') as fout:
+      fout.write('\n')
+
+  def resolve(self):
+    logging.info("-----resolve dependency-----")
+    subp = subprocess.run([
+        self.runnable,
+        *('-q dependencies --configuration implementation --console=plain'
+          .split())
+    ],
+                          cwd=self.BUILD_DIR,

Review Comment:
   nit: the formatting looks strange here





Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #30262:
URL: https://github.com/apache/beam/pull/30262#discussion_r1484464994


##########
scripts/tools/gcpbomupgrader.py:
##########

Review Comment:
   That is the result of running yapf on the script.
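
One way to avoid the awkward layout without fighting the formatter is to build the argument list before the call, so `subprocess.run` receives a short expression. The sketch below is only an illustration: the function names are hypothetical, and the `capture_output`, `text` and `check` arguments are assumptions, since the quoted hunk is cut off after `cwd=self.BUILD_DIR`.

```python
import subprocess


def build_resolve_command(runnable):
  """Return the argument list for the Gradle dependency report."""
  return [
      runnable,
      '-q',
      'dependencies',
      '--configuration',
      'implementation',
      '--console=plain',
  ]


def resolve(runnable, build_dir):
  """Run the report in build_dir and return its stdout.

  The keyword arguments below are assumptions, not taken from the PR.
  check=True raises CalledProcessError if Gradle exits non-zero.
  """
  result = subprocess.run(
      build_resolve_command(runnable),
      cwd=build_dir,
      capture_output=True,
      text=True,
      check=True)
  return result.stdout
```

Listing the flags one per line also avoids splitting a string at runtime, at the cost of a few more lines.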





Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #30262:
URL: https://github.com/apache/beam/pull/30262#discussion_r1486372962


##########
scripts/tools/gcpbomupgrader.py:
##########

Review Comment:
   Added comments about the tags that the script relies on: https://github.com/apache/beam/pull/30262/files#diff-0435a83a413ec063bf7e682cadcd56776cd18fc878f197cc99a65fc231ef2047R595
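
In the script revision quoted in this thread, `check_dependencies` asserts that the keys of `KNOWN_DEPS` match the tagged version definitions found in BeamModulePlugin.groovy. That suggests a newly pinned project needs both an entry in `KNOWN_DEPS` and a tag comment. The sketch below performs the same comparison but reports which side is missing. The name `check_known_deps` and the error wording are hypothetical and not part of the PR.

```python
def check_known_deps(known_deps, found_deps):
  """Raise if the tracked projects and the tagged definitions differ.

  known_deps: {project_name: artifact}, e.g. KNOWN_DEPS in the script.
  found_deps: {project_name: version} parsed from tagged definitions.
  Returns {artifact: current_version} when both sides agree.
  """
  missing = sorted(set(known_deps) - set(found_deps))
  untracked = sorted(set(found_deps) - set(known_deps))
  if missing or untracked:
    raise RuntimeError(
        f"Tracked but not tagged: {missing}; "
        f"tagged but not tracked: {untracked}")
  return {known_deps[name]: version for name, version in found_deps.items()}
```

An explicit error like this tells a contributor whether to add the tag comment or the `KNOWN_DEPS` entry, which a bare `assert` does not.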





Re: [PR] Add an automatic GCP-BOM dependency upgrader [beam]

Posted by "shunping (via GitHub)" <gi...@apache.org>.
shunping commented on PR #30262:
URL: https://github.com/apache/beam/pull/30262#issuecomment-1938995115

   The code looks good. I am also wondering if we can use your script to add a presubmit test, so if the version is not right, the update cannot be submitted.
   
   This may require changing your script to support two modes: in-place updating (which is already implemented) and report-only. A new test could then call the script in report-only mode and check whether there is any version mismatch.
   
   WDYT?
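
A minimal sketch of the two modes proposed above, assuming the script already holds the current versions (parsed from BeamModulePlugin.groovy) and the target versions (resolved from the BOM) as dictionaries keyed by artifact. The names `find_mismatches` and `run` and the `report_only` flag are hypothetical; none of them exist in the PR.

```python
def find_mismatches(current, target):
  """Return {artifact: (current_version, target_version)} where they differ."""
  return {
      dep: (current.get(dep), version)
      for dep, version in sorted(target.items())
      if current.get(dep) != version
  }


def run(current, target, report_only=False):
  """Hypothetical driver returning a process exit code.

  In report-only mode the exit code is 1 when any version has drifted,
  so a presubmit check could fail on it.
  """
  mismatches = find_mismatches(current, target)
  for dep, (old, new) in mismatches.items():
    print(f"{dep}: {old} -> {new}")
  if report_only:
    return 1 if mismatches else 0
  # In-place mode would write the target versions back to the file here.
  return 0
```

With the versions from this PR, a report-only run would flag `io.grpc:grpc-netty` moving from 1.60.0 to 1.61.0 and return a non-zero code.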

