Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/05 01:22:11 UTC

[GitHub] [beam] damccorm opened a new issue, #21711: Python Streaming job failing to drain with BigQueryIO write errors

damccorm opened a new issue, #21711:
URL: https://github.com/apache/beam/issues/21711

   We have a Python streaming Dataflow job that writes to BigQuery using the `FILE_LOADS` method with `with_auto_sharding` enabled. When we try to drain the job, it fails with the following error:
   ```
   "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 1000, in perform_load_job
   ValueError: Either a non-empty list of fully-qualified source URIs must be provided via the source_uris parameter or an open file object must be provided via the source_stream parameter.
   ```
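
Editorial sketch: the error text above states a precondition, namely that a load needs either a non-empty list of source URIs or an open file object. The snippet below is a minimal illustration of that rule only. It is not Beam's implementation; `check_load_sources`, `describe`, and the `gs://example-bucket/...` path are hypothetical names introduced here, and the sketch makes no claim about why the list ends up empty during a drain.

```python
from typing import BinaryIO, Optional, Sequence


def check_load_sources(
    source_uris: Optional[Sequence[str]] = None,
    source_stream: Optional[BinaryIO] = None,
) -> None:
    """Hypothetical stand-in for the precondition described by the error text.

    Mirrors only the rule stated in the message: a load needs a non-empty
    URI list or an open file object. Not taken from Beam's source.
    """
    if not source_uris and source_stream is None:
        raise ValueError(
            "Either a non-empty list of fully-qualified source URIs must be "
            "provided via the source_uris parameter or an open file object "
            "must be provided via the source_stream parameter.")


def describe(source_uris: Optional[Sequence[str]]) -> str:
    """Return 'ok' if the check passes, else the exception class name."""
    try:
        check_load_sources(source_uris=source_uris)
        return "ok"
    except ValueError as exc:
        return type(exc).__name__


# An empty list (or None) trips the check; a single URI passes it.
print(describe([]))
print(describe(["gs://example-bucket/part-0.json"]))  # hypothetical path
```

Under this reading, the reported failure would correspond to a load being attempted with no source files, which is consistent with the message but is only an assumption here.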
   
   Our `WriteToBigQuery` configuration:
   ```
   beam.io.WriteToBigQuery(
     table=options.output_table,
     schema=bq_schema,
     create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
     write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
     insert_retry_strategy=RetryStrategy.RETRY_ON_TRANSIENT_ERROR,
     method=beam.io.WriteToBigQuery.Method.FILE_LOADS,
     additional_bq_parameters={
       "timePartitioning": {
         "type": "HOUR",
         "field": "bq_insert_timestamp",
       },
       "schemaUpdateOptions": ["ALLOW_FIELD_ADDITION", "ALLOW_FIELD_RELAXATION"],
     },
     triggering_frequency=120,
     with_auto_sharding=True,
   )
   ```
   
   
   We have also noticed that the job fails to drain only when there are actual schema updates. If there are no schema updates, the job drains without the above error.
   
   Imported from Jira [BEAM-14146](https://issues.apache.org/jira/browse/BEAM-14146). Original Jira may contain additional context.
   Reported by: rahuli.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] ahmedabu98 commented on issue #21711: Python Streaming job failing to drain with BigQueryIO write errors

Posted by GitBox <gi...@apache.org>.
ahmedabu98 commented on issue #21711:
URL: https://github.com/apache/beam/issues/21711#issuecomment-1310969808

   Fixed by #23710


[GitHub] [beam] kennknowles commented on issue #21711: Python Streaming job failing to drain with BigQueryIO write errors

Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #21711:
URL: https://github.com/apache/beam/issues/21711#issuecomment-1155253253

   From the Jira, it looks like the job failure was solved but the drain still has trouble completing. I don't think we have to block the release on it, though it does seem important to keep investigating.


[GitHub] [beam] kennknowles commented on issue #21711: Python Streaming job failing to drain with BigQueryIO write errors

Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #21711:
URL: https://github.com/apache/beam/issues/21711#issuecomment-1155256527

   CC @chamikaramj from the Jira CCs


[GitHub] [beam] github-actions[bot] closed issue #21711: Python Streaming job failing to drain with BigQueryIO write errors

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #21711: Python Streaming job failing to drain with BigQueryIO write errors
URL: https://github.com/apache/beam/issues/21711


[GitHub] [beam] ahmedabu98 commented on issue #21711: Python Streaming job failing to drain with BigQueryIO write errors

Posted by GitBox <gi...@apache.org>.
ahmedabu98 commented on issue #21711:
URL: https://github.com/apache/beam/issues/21711#issuecomment-1310969930

   .close-issue

