You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/10/14 21:02:48 UTC

[GitHub] [beam] alanmyrvold opened a new pull request #13117: [BEAM-11067] Update java license script to include source for GNU and Mozilla dependencies

alanmyrvold opened a new pull request #13117:
URL: https://github.com/apache/beam/pull/13117


   License script was not pulling source for license names that were spelled out. LGPL/MPL worked, but "GNU Lesser Public License" and "Mozzilla Public License" were not matched.
   
   This uncovered an issue with module names with upper case characters; fixed by not forcing lower case.
   
   Added logging.
   
   R: @emilymye
   R: @Hannah-Jiang
   R: @robinyqiu
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | ---
   Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i
 con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](htt
 ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/)
   Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://ci-beam
 .apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/) | ---
   XLang | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/) | ---
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website | Whitespace | Typescript
   --- | --- | --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/lastCompletedBuild/) <br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/be
 am_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/)
   Portable | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | --- | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] Hannah-Jiang commented on a change in pull request #13117: [BEAM-11067] Update java license script to include source for GNU and Mozilla dependencies

Posted by GitBox <gi...@apache.org>.
Hannah-Jiang commented on a change in pull request #13117:
URL: https://github.com/apache/beam/pull/13117#discussion_r504979231



##########
File path: sdks/java/container/license_scripts/pull_licenses_java.py
##########
@@ -101,13 +101,22 @@ def pull_from_url(file_name, url, dep, no_list):
 
 def pull_source_code(base_url, dir_name, dep):
     # base_url example: https://repo1.maven.org/maven2/org/mortbay/jetty/jsp-2.1/6.1.14/
-    soup = BeautifulSoup(urlopen(base_url).read(), "html.parser")
+    try:
+      soup = BeautifulSoup(urlopen(base_url).read(), "html.parser")
+    except:
+      logging.error('Error reading source base from {base_url}'.format(base_url=base_url))
+      raise
+    source_count = 0
     for href in (a["href"] for a in soup.select("a[href]")):
         if href.endswith(
-                '.jar') and not 'javadoc' in href:  # download jar file only
+                '.jar') and 'sources.jar' in href:  # download sources jar file only

Review comment:
       Do all source jars follow this pattern?

##########
File path: sdks/java/container/license_scripts/pull_licenses_java.py
##########
@@ -101,13 +101,22 @@ def pull_from_url(file_name, url, dep, no_list):
 
 def pull_source_code(base_url, dir_name, dep):
     # base_url example: https://repo1.maven.org/maven2/org/mortbay/jetty/jsp-2.1/6.1.14/
-    soup = BeautifulSoup(urlopen(base_url).read(), "html.parser")
+    try:
+      soup = BeautifulSoup(urlopen(base_url).read(), "html.parser")
+    except:
+      logging.error('Error reading source base from {base_url}'.format(base_url=base_url))
+      raise
+    source_count = 0
     for href in (a["href"] for a in soup.select("a[href]")):
         if href.endswith(
-                '.jar') and not 'javadoc' in href:  # download jar file only
+                '.jar') and 'sources.jar' in href:  # download sources jar file only
             file_name = dir_name + '/' + href
             url = base_url + '/' + href
+            logging.info('Pulling source from {url}'.format(url=url))

Review comment:
       How about changing the level to `debug`? If there are many packages need to pull source, there would be a lot of log printed out. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] emilymye commented on a change in pull request #13117: [BEAM-11067] Update java license script to include source for GNU and Mozilla dependencies

Posted by GitBox <gi...@apache.org>.
emilymye commented on a change in pull request #13117:
URL: https://github.com/apache/beam/pull/13117#discussion_r505091993



##########
File path: sdks/java/container/license_scripts/pull_licenses_java.py
##########
@@ -140,17 +149,22 @@ def execute(dep):
     }
     '''
 
-    name = dep['moduleName'].split(':')[1].lower()
+    name = dep['moduleName'].split(':')[1]
     version = dep['moduleVersion']
     name_version = name + '-' + version
+    # javac is not a runtime dependency
+    if name == 'javac':
+      logging.debug('Skipping', name_version)
+      return
     dir_name = '{license_dir}/{name_version}.jar'.format(
         license_dir=license_dir, name_version=name_version)
 
     # if auto pulled, directory is existing at {license_dir}
     if not os.path.isdir(dir_name):
         # skip self dependencies
         if dep['moduleName'].startswith('beam'):
-            logging.debug('Skippig', name_version)
+            logging.debug('Skipping', name_version)

Review comment:
       Do we only skip self-dependencies for non-auto-pulled libraries? Mostly this function seems very long and complicated to start with, and having the skipping returns at the top would make me feel a little better. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] alanmyrvold merged pull request #13117: [BEAM-11067] Update java license script to include source for GNU and Mozilla dependencies

Posted by GitBox <gi...@apache.org>.
alanmyrvold merged pull request #13117:
URL: https://github.com/apache/beam/pull/13117


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] Hannah-Jiang commented on a change in pull request #13117: [BEAM-11067] Update java license script to include source for GNU and Mozilla dependencies

Posted by GitBox <gi...@apache.org>.
Hannah-Jiang commented on a change in pull request #13117:
URL: https://github.com/apache/beam/pull/13117#discussion_r505703292



##########
File path: sdks/java/container/license_scripts/pull_licenses_java.py
##########
@@ -140,17 +149,22 @@ def execute(dep):
     }
     '''
 
-    name = dep['moduleName'].split(':')[1].lower()
+    name = dep['moduleName'].split(':')[1]
     version = dep['moduleVersion']
     name_version = name + '-' + version
+    # javac is not a runtime dependency
+    if name == 'javac':
+      logging.debug('Skipping', name_version)
+      return
+    # skip self dependencies
+    if name.startswith('beam'):

Review comment:
       If I compare to the original code, it should be `dep['moduleName']` instead of `name`. If a beam package is 'beam:xxx' and the `xxx` does not has `beam` in it, it wouldn't skipped. And I also recommend applying `lower()` when checking if startswith 'beam'.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] alanmyrvold commented on a change in pull request #13117: [BEAM-11067] Update java license script to include source for GNU and Mozilla dependencies

Posted by GitBox <gi...@apache.org>.
alanmyrvold commented on a change in pull request #13117:
URL: https://github.com/apache/beam/pull/13117#discussion_r505680858



##########
File path: sdks/java/container/license_scripts/pull_licenses_java.py
##########
@@ -140,17 +149,22 @@ def execute(dep):
     }
     '''
 
-    name = dep['moduleName'].split(':')[1].lower()
+    name = dep['moduleName'].split(':')[1]
     version = dep['moduleVersion']
     name_version = name + '-' + version
+    # javac is not a runtime dependency
+    if name == 'javac':
+      logging.debug('Skipping', name_version)
+      return
     dir_name = '{license_dir}/{name_version}.jar'.format(
         license_dir=license_dir, name_version=name_version)
 
     # if auto pulled, directory is existing at {license_dir}
     if not os.path.isdir(dir_name):
         # skip self dependencies
         if dep['moduleName'].startswith('beam'):
-            logging.debug('Skippig', name_version)
+            logging.debug('Skipping', name_version)

Review comment:
       Moved to the top.

##########
File path: sdks/java/container/license_scripts/pull_licenses_java.py
##########
@@ -101,13 +101,22 @@ def pull_from_url(file_name, url, dep, no_list):
 
 def pull_source_code(base_url, dir_name, dep):
     # base_url example: https://repo1.maven.org/maven2/org/mortbay/jetty/jsp-2.1/6.1.14/
-    soup = BeautifulSoup(urlopen(base_url).read(), "html.parser")
+    try:
+      soup = BeautifulSoup(urlopen(base_url).read(), "html.parser")
+    except:
+      logging.error('Error reading source base from {base_url}'.format(base_url=base_url))
+      raise
+    source_count = 0
     for href in (a["href"] for a in soup.select("a[href]")):
         if href.endswith(
-                '.jar') and not 'javadoc' in href:  # download jar file only
+                '.jar') and 'sources.jar' in href:  # download sources jar file only
             file_name = dir_name + '/' + href
             url = base_url + '/' + href
+            logging.info('Pulling source from {url}'.format(url=url))

Review comment:
       done

##########
File path: sdks/java/container/license_scripts/pull_licenses_java.py
##########
@@ -101,13 +101,22 @@ def pull_from_url(file_name, url, dep, no_list):
 
 def pull_source_code(base_url, dir_name, dep):
     # base_url example: https://repo1.maven.org/maven2/org/mortbay/jetty/jsp-2.1/6.1.14/
-    soup = BeautifulSoup(urlopen(base_url).read(), "html.parser")
+    try:
+      soup = BeautifulSoup(urlopen(base_url).read(), "html.parser")
+    except:
+      logging.error('Error reading source base from {base_url}'.format(base_url=base_url))
+      raise
+    source_count = 0
     for href in (a["href"] for a in soup.select("a[href]")):
         if href.endswith(
-                '.jar') and not 'javadoc' in href:  # download jar file only
+                '.jar') and 'sources.jar' in href:  # download sources jar file only

Review comment:
       Yes, and this is guarded by the check that at least one file is found matching.
   
       if source_count == 0:
         raise RuntimeError('No source found at {base_url}'.format(base_url=base_url))




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] alanmyrvold commented on a change in pull request #13117: [BEAM-11067] Update java license script to include source for GNU and Mozilla dependencies

Posted by GitBox <gi...@apache.org>.
alanmyrvold commented on a change in pull request #13117:
URL: https://github.com/apache/beam/pull/13117#discussion_r505710543



##########
File path: sdks/java/container/license_scripts/pull_licenses_java.py
##########
@@ -140,17 +149,22 @@ def execute(dep):
     }
     '''
 
-    name = dep['moduleName'].split(':')[1].lower()
+    name = dep['moduleName'].split(':')[1]
     version = dep['moduleVersion']
     name_version = name + '-' + version
+    # javac is not a runtime dependency
+    if name == 'javac':
+      logging.debug('Skipping', name_version)
+      return
+    # skip self dependencies
+    if name.startswith('beam'):

Review comment:
       Good catch. Changed it to be:
   
       if dep['moduleName'].lower().startswith('beam'):
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org