Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/09/02 21:21:03 UTC

[GitHub] [beam] pabloem opened a new pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

pabloem opened a new pull request #12762:
URL: https://github.com/apache/beam/pull/12762


   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] pabloem commented on a change in pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on a change in pull request #12762:
URL: https://github.com/apache/beam/pull/12762#discussion_r483276194



##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -625,6 +625,7 @@ def to_type_hint(self):
 
 
 class _CustomBigQuerySource(BoundedSource):
+
   def __init__(

Review comment:
       Yes, I'll make the change on the Java side as well.
   
   This is part of a series of improvements to BQ. Part of the requirements is to include the launching step info along with the Dataflow job info.







[GitHub] [beam] pabloem commented on a change in pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on a change in pull request #12762:
URL: https://github.com/apache/beam/pull/12762#discussion_r492332731



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_io_metadata.py
##########
@@ -28,6 +28,11 @@
 _VALID_CLOUD_LABEL_PATTERN = re.compile(r'^[a-z0-9\_\-]{1,63}$')
 
 
+def _sanitize_value(value):
+  """Sanitizes a value into a valid BigQuery label value."""
+  return re.sub('[^\w-]+', '', value.lower().replace('/', '-'))[0:63]

Review comment:
       This is in bigquery_io_metadata. Thanks!
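
To make the behaviour of the sanitizer in the hunk above concrete, here is a minimal standalone sketch. It follows the same steps (lowercase, map `/` to `-`, drop anything outside `[\w-]`, truncate to 63 characters) but uses a raw-string pattern, which is an adjustment made for this example and not taken from the PR. The name `sanitize_label_value` and the sample step name are illustrative assumptions.

```python
import re

_MAX_LABEL_LENGTH = 63  # Cloud label values are capped at 63 characters.


def sanitize_label_value(value):
  """Lowercases, maps '/' to '-', drops other disallowed chars, truncates."""
  lowered = value.lower().replace('/', '-')
  # Raw string so that \w reaches the regex engine unchanged.
  return re.sub(r'[^\w-]+', '', lowered)[:_MAX_LABEL_LENGTH]


# Hypothetical step name, modelled on the examples quoted in this thread.
step = 'UserTransform/ReadFromBigQuery/_CustomBigQuerySource'
print(sanitize_label_value(step))
```

Truncating after the substitution means a long step name keeps its leading components, which is usually the user-visible part of the transform path.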

##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -1875,14 +1895,22 @@ def process(self, unused_element, signal):
     temp_location = pcoll.pipeline.options.view_as(
         GoogleCloudOptions).temp_location
     gcs_location = self._get_destination_uri(temp_location)
+    job_name = pcoll.pipeline.options.view_as(GoogleCloudOptions).job_name
 
+    try:
+      step_name = self.label

Review comment:
       This is set by the Python SDK as it expands transforms. Relying on it is definitely unorthodox.







[GitHub] [beam] pabloem commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-686731633


   r: @ajamato 





[GitHub] [beam] pabloem commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696413272


   Run Python 3.8 PostCommit





[GitHub] [beam] pabloem commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696456903


   Run Python 3.8 PostCommit





[GitHub] [beam] pabloem commented on a change in pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on a change in pull request #12762:
URL: https://github.com/apache/beam/pull/12762#discussion_r483275160



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_io_metadata.py
##########
@@ -64,6 +64,8 @@ def create_bigquery_io_metadata():
     # As we do not want a bad label to fail the BQ job.
     if _is_valid_cloud_label_value(dataflow_job_id):
       kwargs['beam_job_id'] = dataflow_job_id
+  if step_name:
+    kwargs['step_name'] = step_name

Review comment:
       I've added a sanitizer and a conditional append.
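
The conditional-append half of that change could look roughly like the sketch below. This variant validates each value and skips it when invalid, rather than sanitizing it first. `build_job_labels` and its arguments are hypothetical stand-ins, not the actual `create_bigquery_io_metadata` signature; only the label pattern is copied from the diff in this thread.

```python
import re

# Pattern copied from the bigquery_io_metadata.py hunk in this thread.
_VALID_CLOUD_LABEL_PATTERN = re.compile(r'^[a-z0-9\_\-]{1,63}$')


def build_job_labels(dataflow_job_id=None, step_name=None):
  """Returns a dict holding only the labels that pass validation."""
  labels = {}
  candidates = {'beam_job_id': dataflow_job_id, 'step_name': step_name}
  for key, value in candidates.items():
    # Skip missing or invalid values so a bad label cannot fail the BQ job.
    if value and _VALID_CLOUD_LABEL_PATTERN.match(value):
      labels[key] = value
  return labels


# An unsanitized step name containing '/' is dropped, the job id is kept.
print(build_job_labels('2020-09-02_14_21_03-123', 'Read/From'))
```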







[GitHub] [beam] pabloem commented on a change in pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on a change in pull request #12762:
URL: https://github.com/apache/beam/pull/12762#discussion_r483249377



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_io_metadata.py
##########
@@ -64,6 +64,8 @@ def create_bigquery_io_metadata():
     # As we do not want a bad label to fail the BQ job.
     if _is_valid_cloud_label_value(dataflow_job_id):
       kwargs['beam_job_id'] = dataflow_job_id
+  if step_name:
+    kwargs['step_name'] = step_name

Review comment:
       You're right that we need to sanitize. The step name would be something like `CustomerTransform/ReadFromBigQuery/Read/_CustomBigQuerySource`. I can add sanitization to it.







[GitHub] [beam] codecov[bot] edited a comment on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696459985


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=h1) Report
   > Merging [#12762](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/958e445ae49da6cf5b67f769e520d90fd8aed60d?el=desc) will **increase** coverage by `41.69%`.
   > The diff coverage is `44.18%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12762/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #12762       +/-   ##
   ===========================================
   + Coverage   40.30%   81.99%   +41.69%     
   ===========================================
     Files         451      459        +8     
     Lines       53168    54387     +1219     
   ===========================================
   + Hits        21429    44597    +23168     
   + Misses      31739     9790    -21949     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/azure/blobstorageio.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvYmxvYnN0b3JhZ2Vpby5weQ==) | `26.95% <26.95%> (ø)` | |
   | [sdks/python/apache\_beam/io/filesystems.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZmlsZXN5c3RlbXMucHk=) | `87.50% <50.00%> (+29.74%)` | :arrow_up: |
   | [sdks/python/apache\_beam/io/gcp/bigquery.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5LnB5) | `79.78% <56.52%> (+53.01%)` | :arrow_up: |
   | [...thon/apache\_beam/io/azure/blobstoragefilesystem.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvYmxvYnN0b3JhZ2VmaWxlc3lzdGVtLnB5) | `77.31% <77.31%> (ø)` | |
   | [sdks/python/apache\_beam/io/azure/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvX19pbml0X18ucHk=) | `100.00% <100.00%> (ø)` | |
   | [...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=) | `89.97% <100.00%> (+66.61%)` | :arrow_up: |
   | [.../python/apache\_beam/io/gcp/bigquery\_io\_metadata.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2lvX21ldGFkYXRhLnB5) | `90.62% <100.00%> (+47.14%)` | :arrow_up: |
   | [setup.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2V0dXAucHk=) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/portability/python\_urns.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvcHl0aG9uX3VybnMucHk=) | `100.00% <0.00%> (ø)` | |
   | [...apache\_beam/portability/api/beam\_runner\_api\_pb2.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcnVubmVyX2FwaV9wYjIucHk=) | `100.00% <0.00%> (ø)` | |
   | ... and [290 more](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=footer). Last update [bae1e7b...a5d8020](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] pabloem commented on a change in pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on a change in pull request #12762:
URL: https://github.com/apache/beam/pull/12762#discussion_r483275319



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py
##########
@@ -826,7 +836,8 @@ def _load_data(
             TriggerCopyJobs(
                 create_disposition=self.create_disposition,
                 write_disposition=self.write_disposition,
-                test_client=self.test_client),
+                test_client=self.test_client,
+                step_name=step_name),

Review comment:
       In this case I believe it's pretty useful. The step would be `UserTransform/ReadFromBigQuery/_CustomBigQuerySource` or whatever.







[GitHub] [beam] ajamato commented on a change in pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
ajamato commented on a change in pull request #12762:
URL: https://github.com/apache/beam/pull/12762#discussion_r483239088



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_io_metadata.py
##########
@@ -64,6 +64,8 @@ def create_bigquery_io_metadata():
     # As we do not want a bad label to fail the BQ job.
     if _is_valid_cloud_label_value(dataflow_job_id):
       kwargs['beam_job_id'] = dataflow_job_id
+  if step_name:
+    kwargs['step_name'] = step_name

Review comment:
       This is likely to cause the BQ job to fail. The step_name would need to be sanitized to match the cloud label format:
   https://cloud.google.com/resource-manager/docs/creating-managing-labels#requirements
   
   
   - Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
   - Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
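
As a rough illustration of why an unsanitized step name would be rejected, the sketch below checks a candidate value against the two requirements quoted above. The helper names `_is_allowed` and `label_value_problems` are made up for this example, and the character check is a simplified reading of the rules, not the service's own validation logic.

```python
_MAX_LABEL_LENGTH = 63


def _is_allowed(ch):
  """Digits, '_', '-', and letters that are not uppercase are allowed."""
  return ch in '_-' or ch.isdigit() or (ch.isalpha() and not ch.isupper())


def label_value_problems(value):
  """Lists the ways `value` breaks the quoted label-value rules."""
  problems = []
  if len(value) > _MAX_LABEL_LENGTH:
    problems.append('longer than %d characters' % _MAX_LABEL_LENGTH)
  bad_chars = sorted({ch for ch in value if not _is_allowed(ch)})
  if bad_chars:
    problems.append('disallowed characters: %s' % ''.join(bad_chars))
  return problems


# Hypothetical step name, modelled on the examples quoted in this thread.
print(label_value_problems(
    'UserTransform/ReadFromBigQuery/_CustomBigQuerySource'))
```

An empty list means the value passes both checks; a transform path fails on its uppercase letters and on the `/` separators.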
   
   

##########
File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py
##########
@@ -826,7 +836,8 @@ def _load_data(
             TriggerCopyJobs(
                 create_disposition=self.create_disposition,
                 write_disposition=self.write_disposition,
-                test_client=self.test_client),
+                test_client=self.test_client,
+                step_name=step_name),

Review comment:
       Do we know the format of the step_name? Is this just the name the SDK harness uses to refer to the step?
   
   IIRC these did not reflect anything meaningful to the user in the original graph, so we probably don't want to show them.
   
   I know the PCollection names in Python, at least, were very opaque: "pcollection1", "pcollection2", etc.
   
   Normally we rely on the DF RunnerHarness to translate those before sending them to the UI (e.g., for metrics).

##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -625,6 +625,7 @@ def to_type_hint(self):
 
 
 class _CustomBigQuerySource(BoundedSource):
+
   def __init__(

Review comment:
       Would you mind sharing the motivation for this with me? It's not entirely clear to me why you are making this change.
   
   Can you make equivalent changes in the Java implementation as well, for a consistent experience?

##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -1875,14 +1895,22 @@ def process(self, unused_element, signal):
     temp_location = pcoll.pipeline.options.view_as(
         GoogleCloudOptions).temp_location
     gcs_location = self._get_destination_uri(temp_location)
+    job_name = pcoll.pipeline.options.view_as(GoogleCloudOptions).job_name
 
+    try:
+      step_name = self.label

Review comment:
       I looked at the file, and it seems that self.label is not referenced anywhere else in the class, and I don't see a call to the super constructor that would set it.
   
   Is this using a weird pattern, where something external takes the object and attaches parameters to it?
   
   Is it possible to fix up that style, so there is a clear method or something where we can see it being added externally?
   
   Or at least define a default, empty value with a comment saying how it gets set.
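
The last suggestion could look roughly like the sketch below. It is only an illustration under stated assumptions: `ExportStep` and `assign_label` are hypothetical stand-ins, not Beam classes or functions, and the sketch does not reproduce how the SDK actually attaches the label.

```python
class ExportStep(object):
  """Hypothetical stand-in for a transform whose label is set externally."""

  # Class-level default so that reading self.label never raises
  # AttributeError. In this sketch, external code is assumed to overwrite
  # it on the instance while the pipeline is being expanded.
  label = None

  def step_name(self):
    # Fall back to an empty string when no label has been attached.
    return self.label or ''


def assign_label(transform, label):
  """Hypothetical stand-in for the external code that attaches the label."""
  transform.label = label
  return transform
```

With a documented default, the try/except around `self.label` in the diff would no longer be needed to guard against a missing attribute.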







[GitHub] [beam] pabloem merged pull request #12762: [BEAM-10948] Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem merged pull request #12762:
URL: https://github.com/apache/beam/pull/12762


   





[GitHub] [beam] pabloem commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696374290









[GitHub] [beam] pabloem commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-686017584


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696459985


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=h1) Report
   > Merging [#12762](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/958e445ae49da6cf5b67f769e520d90fd8aed60d?el=desc) will **increase** coverage by `41.69%`.
   > The diff coverage is `44.18%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12762/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #12762       +/-   ##
   ===========================================
   + Coverage   40.30%   81.99%   +41.69%     
   ===========================================
     Files         451      459        +8     
     Lines       53168    54387     +1219     
   ===========================================
   + Hits        21429    44597    +23168     
   + Misses      31739     9790    -21949     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/azure/blobstorageio.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvYmxvYnN0b3JhZ2Vpby5weQ==) | `26.95% <26.95%> (ø)` | |
   | [sdks/python/apache\_beam/io/filesystems.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZmlsZXN5c3RlbXMucHk=) | `87.50% <50.00%> (+29.74%)` | :arrow_up: |
   | [sdks/python/apache\_beam/io/gcp/bigquery.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5LnB5) | `79.78% <56.52%> (+53.01%)` | :arrow_up: |
   | [...thon/apache\_beam/io/azure/blobstoragefilesystem.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvYmxvYnN0b3JhZ2VmaWxlc3lzdGVtLnB5) | `77.31% <77.31%> (ø)` | |
   | [sdks/python/apache\_beam/io/azure/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvX19pbml0X18ucHk=) | `100.00% <100.00%> (ø)` | |
   | [...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=) | `89.97% <100.00%> (+66.61%)` | :arrow_up: |
   | [.../python/apache\_beam/io/gcp/bigquery\_io\_metadata.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2lvX21ldGFkYXRhLnB5) | `90.62% <100.00%> (+47.14%)` | :arrow_up: |
   | [setup.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2V0dXAucHk=) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/portability/python\_urns.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvcHl0aG9uX3VybnMucHk=) | `100.00% <0.00%> (ø)` | |
   | [...apache\_beam/portability/api/beam\_runner\_api\_pb2.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcnVubmVyX2FwaV9wYjIucHk=) | `100.00% <0.00%> (ø)` | |
   | ... and [290 more](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=footer). Last update [bae1e7b...a5d8020](https://codecov.io/gh/apache/beam/pull/12762?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] pabloem commented on a change in pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on a change in pull request #12762:
URL: https://github.com/apache/beam/pull/12762#discussion_r492332731



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_io_metadata.py
##########
@@ -28,6 +28,11 @@
 _VALID_CLOUD_LABEL_PATTERN = re.compile(r'^[a-z0-9\_\-]{1,63}$')
 
 
+def _sanitize_value(value):
+  """Sanitizes a value into a valid BigQuery label value."""
+  return re.sub('[^\w-]+', '', value.lower().replace('/', '-'))[0:63]

Review comment:
       This is in bigquery_io_metadata. Thanks!
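
For reference, the behavior of such a sanitizer can be sketched as follows. This is a variant written for illustration, not the code merged in the PR: `sanitize_label_value` is a hypothetical name, and it uses an explicit ASCII character class in a raw string instead of `\w`, so that only characters accepted by `_VALID_CLOUD_LABEL_PATTERN` survive.

```python
import re


def sanitize_label_value(value):
  """Hypothetical sanitizer: coerce a step name into a label-safe string."""
  # Lowercase first, and map the '/' separators of nested step names to '-'.
  lowered = value.lower().replace('/', '-')
  # Drop every character outside the allowed set, then cap at 63 characters.
  return re.sub(r'[^a-z0-9_-]+', '', lowered)[:63]
```

For example, `sanitize_label_value('Read From BQ/Export')` returns `'readfrombq-export'`. An input with no allowed characters yields an empty string, which the `{1,63}` pattern would still reject, so a validity check after sanitizing remains useful.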

##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -1875,14 +1895,22 @@ def process(self, unused_element, signal):
     temp_location = pcoll.pipeline.options.view_as(
         GoogleCloudOptions).temp_location
     gcs_location = self._get_destination_uri(temp_location)
+    job_name = pcoll.pipeline.options.view_as(GoogleCloudOptions).job_name
 
+    try:
+      step_name = self.label

Review comment:
       The Python SDK sets this attribute as it expands transforms. Relying on it is admittedly unorthodox.







[GitHub] [beam] pabloem commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
pabloem commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696374290


   Run Python 3.8 PostCommit





[GitHub] [beam] ajamato commented on a change in pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
ajamato commented on a change in pull request #12762:
URL: https://github.com/apache/beam/pull/12762#discussion_r489745304



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_io_metadata.py
##########
@@ -28,6 +28,11 @@
 _VALID_CLOUD_LABEL_PATTERN = re.compile(r'^[a-z0-9\_\-]{1,63}$')
 
 
+def _sanitize_value(value):
+  """Sanitizes a value into a valid BigQuery label value."""
+  return re.sub('[^\w-]+', '', value.lower().replace('/', '-'))[0:63]

Review comment:
       Ah, just remembered that I checked in helpers in both Python and Java, at python/apache_beam/io/gcp/bigquery_io_metadata.py
   
   You may want to move the sanitize function to those files (see _is_valid_cloud_label_value).







[GitHub] [beam] codecov[bot] edited a comment on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696459985





