Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2021/09/01 17:24:00 UTC
[jira] [Updated] (BEAM-12514) BigQueryIO - ReadFromBigQuery can not get table reference from RuntimeValueProvider
[ https://issues.apache.org/jira/browse/BEAM-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Beam JIRA Bot updated BEAM-12514:
---------------------------------
Labels: GCP pull-request-available (was: GCP pull-request-available stale-P2)
> BigQueryIO - ReadFromBigQuery can not get table reference from RuntimeValueProvider
> -----------------------------------------------------------------------------------
>
> Key: BEAM-12514
> URL: https://issues.apache.org/jira/browse/BEAM-12514
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp
> Affects Versions: 2.26.0, 2.27.0, 2.28.0, 2.29.0, 2.30.0
> Reporter: Teng Qiu
> Priority: P3
> Labels: GCP, pull-request-available
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> After this [change|https://github.com/apache/beam/commit/1a08c01ab1cadec16cc13866d5e6c64f6b447b03#diff-0c77ebbb0792b80a76f4ec1f2b89b9d319f5e33f1ee0857ed6fe414f75c48cbbR746], the table reference can no longer be parsed correctly when {{ReadFromBigQuery}} receives a {{RuntimeValueProvider}} as the value of {{table}}.
> This bug has been present since v2.26.0.
>
> Code to reproduce:
> {code:python}
> import apache_beam as beam
> from apache_beam.options.pipeline_options import PipelineOptions
>
>
> class MyOptions(PipelineOptions):
>     @classmethod
>     def _add_argparse_args(cls, parser):
>         parser.add_value_provider_argument(
>             "--input",
>             type=str,
>             default='test-project:test_dataset.test_table',
>         )
>
>
> pipeline_options = PipelineOptions()
> with beam.Pipeline(options=pipeline_options) as p:
>     my_options = pipeline_options.view_as(MyOptions)
>     lines = p | "read" >> beam.io.ReadFromBigQuery(table=my_options.input)
>     row_count = lines | "count" >> beam.combiners.Count.Globally()
>     row_count | beam.Map(print)
>     p.run().wait_until_finish()
> {code}
> The resulting error message is:
> {noformat}
> Traceback (most recent call last):
> ..
> ..
>   File "....../venv/lib/python3.8/site-packages/apache_beam/io/gcp/bigquery.py", line 722, in estimate_size
>     if not table_ref.projectId:
> AttributeError: 'RuntimeValueProvider' object has no attribute 'projectId' [while running 'read/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/SplitAndSizeRestriction']
> {noformat}
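
The traceback suggests that the code reads {{projectId}} from the {{RuntimeValueProvider}} wrapper itself rather than from a parsed table reference. The following is a minimal, Beam-free sketch of that failure mode and of one possible way around it: unwrap the provider with {{get()}} at run time, then parse the string. {{TableRef}}, {{FakeRuntimeValueProvider}}, {{parse_table}} and {{resolve_table}} are hypothetical stand-ins written for illustration only. They are not Beam's classes and not necessarily how the linked pull request fixes the issue. The only Beam behaviour assumed is that value providers expose their value through {{get()}}.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TableRef:
    """Hypothetical stand-in for a BigQuery table reference."""
    projectId: Optional[str]
    datasetId: str
    tableId: str


class FakeRuntimeValueProvider:
    """Hypothetical stand-in for a runtime value provider.

    The wrapped value is reachable only through get(); the wrapper itself
    has no projectId attribute, which mirrors the AttributeError above.
    """

    def __init__(self, value: str) -> None:
        self._value = value

    def get(self) -> str:
        return self._value


# Accepts "project:dataset.table" or "dataset.table" (simplified on purpose).
_TABLE_SPEC = re.compile(
    r"(?:(?P<project>[^:]+):)?(?P<dataset>[^.:]+)\.(?P<table>[^.:]+)"
)


def parse_table(spec: str) -> TableRef:
    """Parse a table spec string into a TableRef."""
    match = _TABLE_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"Unrecognized table spec: {spec!r}")
    return TableRef(match["project"], match["dataset"], match["table"])


def resolve_table(table: Union[str, FakeRuntimeValueProvider]) -> TableRef:
    """Unwrap a provider (if given one) before parsing the table spec."""
    if not isinstance(table, str):
        table = table.get()  # Only meaningful once the pipeline is running.
    return parse_table(table)
```

With this sketch, {{resolve_table(FakeRuntimeValueProvider('test-project:test_dataset.test_table'))}} yields a reference whose {{projectId}} can be read safely, whereas reading {{projectId}} directly from the provider raises {{AttributeError}}.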
--
This message was sent by Atlassian Jira
(v8.3.4#803005)