Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/12/13 23:54:47 UTC

[GitHub] [beam] chunyang commented on a change in pull request #16156: [BEAM-13391] Fix temporary file format in WriteToBigQuery

chunyang commented on a change in pull request #16156:
URL: https://github.com/apache/beam/pull/16156#discussion_r768201379



##########
File path: sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py
##########
@@ -412,11 +420,103 @@ def test_big_query_write_temp_table_append_schema_update(self):
           | 'write' >> beam.io.WriteToBigQuery(
               table_id,
               schema=table_schema,
-              write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
+              write_disposition=BigQueryDisposition.WRITE_APPEND,

Review comment:
       Style nit: the rest of the file uses `beam.io.BigQueryDisposition` rather than `BigQueryDisposition`.

##########
File path: sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py
##########
@@ -368,9 +376,9 @@ def test_big_query_write_temp_table_append_schema_update(self):
     table_id = '{}.{}'.format(self.dataset_id, table_name)
 
     input_data = [{
-        "int64": num, "bool": True, "nested_field": {
+        "int64": num, "bool": True, "nested_field": [{
             "fruit": "Apple"
-        }
+        }]

Review comment:
       😮 Wow, how did this test pass before?
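
A minimal sketch of the mismatch this hunk appears to fix, assuming `nested_field` is declared as a REPEATED RECORD (the schema is not shown in this hunk, so `ASSUMED_SCHEMA` and the helper `shape_errors` are hypothetical and not part of Beam). Under that assumption the old row supplies a single dict where a list of dicts is expected:

```python
# Hypothetical schema fragment: the real table_schema is not visible in this
# hunk, so the REPEATED mode on nested_field is an assumption.
ASSUMED_SCHEMA = {
    "fields": [
        {"name": "int64", "type": "INT64", "mode": "NULLABLE"},
        {"name": "bool", "type": "BOOL", "mode": "NULLABLE"},
        {
            "name": "nested_field",
            "type": "RECORD",
            "mode": "REPEATED",
            "fields": [
                {"name": "fruit", "type": "STRING", "mode": "NULLABLE"},
            ],
        },
    ]
}


def shape_errors(row, schema):
    """Report fields whose Python container type disagrees with the schema mode.

    Illustrative helper only; it checks top-level fields and nothing else.
    """
    errors = []
    for field in schema["fields"]:
        name = field["name"]
        if name not in row:
            continue
        is_list = isinstance(row[name], list)
        is_repeated = field.get("mode") == "REPEATED"
        if is_repeated and not is_list:
            errors.append(name + ": expected a list for a REPEATED field")
        elif is_list and not is_repeated:
            errors.append(name + ": got a list for a non-REPEATED field")
    return errors


# Row shape before and after the change in this hunk.
old_row = {"int64": 1, "bool": True, "nested_field": {"fruit": "Apple"}}
new_row = {"int64": 1, "bool": True, "nested_field": [{"fruit": "Apple"}]}

print(shape_errors(old_row, ASSUMED_SCHEMA))
print(shape_errors(new_row, ASSUMED_SCHEMA))
```

If the assumption holds, whether the old shape was accepted would depend on how strictly the temporary file format validates rows, which may be why the test passed before.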

##########
File path: sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py
##########
@@ -412,11 +420,103 @@ def test_big_query_write_temp_table_append_schema_update(self):
           | 'write' >> beam.io.WriteToBigQuery(
               table_id,
               schema=table_schema,
-              write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
+              write_disposition=BigQueryDisposition.WRITE_APPEND,
               max_file_size=1,  # bytes
               method=beam.io.WriteToBigQuery.Method.FILE_LOADS,
               additional_bq_parameters={
-                  'schemaUpdateOptions': ['ALLOW_FIELD_ADDITION']}))
+                  'schemaUpdateOptions': ['ALLOW_FIELD_ADDITION']},
+              temp_file_format=file_format))
+
+  @pytest.mark.it_postcommit
+  @parameterized.expand([
+      param(file_format=FileFormat.AVRO),
+      param(file_format=FileFormat.JSON),
+      param(file_format=None),
+  ])
+  @mock.patch(
+      "apache_beam.io.gcp.bigquery_file_loads._DEFAULT_MAX_FILE_SIZE", new=1)
+  @mock.patch(
+      "apache_beam.io.gcp.bigquery_file_loads._MAXIMUM_SOURCE_URIS", new=1)
+  def test_append_schema_change_with_temporary_tables(self, file_format):

Review comment:
       What does this test cover that `test_big_query_write_temp_table_append_schema_update` does not? Is it possible to do everything in one test?
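
One way the two tests might be folded together is sketched below. It is self-contained and uses `unittest.subTest` in place of `parameterized.expand` so that it runs without extra packages. The function `write_rows` and its `force_temp_tables` flag are hypothetical stand-ins for the real pipeline, not Beam APIs, and the format strings are plain placeholders for the `FileFormat` values.

```python
import unittest


def write_rows(rows, temp_file_format=None, force_temp_tables=False):
    """Hypothetical stand-in for running the WriteToBigQuery pipeline.

    It only echoes its inputs so that the test structure can be shown.
    """
    return {
        "format": temp_file_format or "DEFAULT",
        "temp_tables": force_temp_tables,
        "row_count": len(rows),
    }


class CombinedAppendSchemaUpdateTest(unittest.TestCase):
    def test_append_schema_update(self):
        rows = [{"int64": num, "bool": True} for num in range(3)]
        # Cover every file format with and without temporary tables in a
        # single test method; each combination is reported as its own subtest.
        for file_format in ("AVRO", "JSON", None):
            for force_temp_tables in (False, True):
                with self.subTest(
                    file_format=file_format,
                    force_temp_tables=force_temp_tables):
                    result = write_rows(rows, file_format, force_temp_tables)
                    self.assertEqual(result["row_count"], len(rows))
                    self.assertEqual(result["temp_tables"], force_temp_tables)
```

Whether merging is worthwhile depends on how much setup the two existing tests actually share, which this hunk alone does not show.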




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
