You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/08/06 21:31:27 UTC

[GitHub] [beam] galuszkak commented on a change in pull request #12480: [BEAM-10647] Fixes get_query_location bug in BigQueryWrapper

galuszkak commented on a change in pull request #12480:
URL: https://github.com/apache/beam/pull/12480#discussion_r466696088



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_tools.py
##########
@@ -318,15 +319,20 @@ def get_query_location(self, project_id, query, use_legacy_sql):
 
     referenced_tables = response.statistics.query.referencedTables
     if referenced_tables:  # Guards against both non-empty and non-None

Review comment:
       @apilloud comment suggest that is also guarding against None value. If that's true, dropping this if, would break code `for loop` in case referenced_tables is actually None. 
   
   Therefore my question is, are you sure we want to remove that if?

##########
File path: sdks/python/apache_beam/io/gcp/bigquery_tools.py
##########
@@ -318,15 +319,20 @@ def get_query_location(self, project_id, query, use_legacy_sql):
 
     referenced_tables = response.statistics.query.referencedTables
     if referenced_tables:  # Guards against both non-empty and non-None
-      table = referenced_tables[0]
-      location = self.get_table_location(
-          table.projectId, table.datasetId, table.tableId)
-      _LOGGER.info(
-          "Using location %r from table %r referenced by query %s",
-          location,
-          table,
-          query)
-      return location
+      for table in referenced_tables:
+        try:
+          location = self.get_table_location(
+              table.projectId, table.datasetId, table.tableId)
+        except HttpForbiddenError:
+          # Permission access for table (i.e. from authorized_view),
+          # try next one
+          continue
+        _LOGGER.info(
+            "Using location %r from table %r referenced by query %s",
+            location,
+            table,
+            query)
+        return location
 
     _LOGGER.debug("Query %s does not reference any tables.", query)

Review comment:
       @apilloud will change it. Thanks for pointing this out.

##########
File path: sdks/python/apache_beam/io/gcp/bigquery_tools_test.py
##########
@@ -50,6 +49,15 @@
 from apache_beam.io.gcp.internal.clients import bigquery
 from apache_beam.options.pipeline_options import PipelineOptions
 
+# Protect against environments where bigquery library is not available.

Review comment:
       @apilloud this is copy from bigquery_test.py I assumed we want this consistent between files.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org