Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 01:53:24 UTC

[GitHub] [beam] kennknowles opened a new issue, #19494: python sdk WriteToBigQuery excessive usage of metered API

kennknowles opened a new issue, #19494:
URL: https://github.com/apache/beam/issues/19494

   Right now, there is a potential issue with the Python SDK where `beam.io.gcp.bigquery.WriteToBigQuery` calls the following API more often than needed:
   
   [https://www.googleapis.com/bigquery/v2/projects/<project-name>/datasets/<dataset-name>/tables/<table-name>?alt=json](https://www.googleapis.com/bigquery/v2/projects/%3Cproject-name%3E/datasets/%3Cdataset-name%3E/tables/%3Ctable-name%3E?alt=json)
   
   The above request counts against BigQuery API quotas that are separate from the quota for BigQuery streaming inserts. When `WriteToBigQuery` is used in a streaming pipeline, we hit this quota quickly and cannot write any further data to BigQuery.
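To illustrate why repeated lookups of the same table are avoidable, here is a minimal sketch of memoizing a table-metadata lookup. It is not Beam's implementation: `CachedTableLookup`, `fake_fetch`, and the project, dataset, and table names are hypothetical stand-ins, and the real request is replaced by a local function so the example runs on its own.

```python
from typing import Callable, Dict, Tuple

TableKey = Tuple[str, str, str]  # (project, dataset, table)


class CachedTableLookup:
    """Memoizes a metered table-metadata lookup so each table is fetched once."""

    def __init__(self, fetch: Callable[[str, str, str], dict]) -> None:
        self._fetch = fetch  # hypothetical stand-in for the metered request
        self._cache: Dict[TableKey, dict] = {}
        self.metered_calls = 0  # how many times the underlying fetch ran

    def get(self, project: str, dataset: str, table: str) -> dict:
        key = (project, dataset, table)
        if key not in self._cache:
            self.metered_calls += 1
            self._cache[key] = self._fetch(project, dataset, table)
        return self._cache[key]


def fake_fetch(project: str, dataset: str, table: str) -> dict:
    # Placeholder for the real HTTP request; returns a minimal table reference.
    return {"projectId": project, "datasetId": dataset, "tableId": table}


lookup = CachedTableLookup(fake_fetch)
for _ in range(1000):  # e.g. one lookup per batch of streamed rows
    lookup.get("my-project", "my_dataset", "my_table")
print(lookup.metered_calls)
```

With `CREATE_NEVER` the table is assumed to exist already, so in this sketch a single lookup per table per worker would be enough; whether that matches what the connector needs is an assumption.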
   
   Dispositions being used are:
    * create_disposition: `beam.io.BigQueryDisposition.CREATE_NEVER`
    * write_disposition: `beam.io.BigQueryDisposition.WRITE_APPEND`
   
   This currently blocks us from using BigQueryIO in a streaming pipeline to write to BigQuery, and we had to formally request an API quota increase from Google as a temporary workaround.
   
   Our pipeline uses DataflowRunner. The error is shown below and in the attached screenshot of the Stackdriver trace.
   ```
     "errors": [
       {
         "message": "Exceeded rate limits: too many api requests per user per method for this user_method. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors",
         "domain": "usageLimits",
         "reason": "rateLimitExceeded"
       }
     ],
   ```
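A caller that sees this error could detect it and back off before retrying. The sketch below assumes the fragment above is part of a JSON object that parses to a dict with an `"errors"` list; `is_rate_limited` and `backoff_delays` are hypothetical helper names, not part of Beam or the BigQuery client. Retrying only spaces out requests; it does not reduce the number of metered calls the connector makes.

```python
import random
from typing import Iterator


def is_rate_limited(error_body: dict) -> bool:
    """Return True if any entry in "errors" has reason "rateLimitExceeded"."""
    return any(
        err.get("reason") == "rateLimitExceeded"
        for err in error_body.get("errors", [])
    )


def backoff_delays(
    base: float = 1.0, cap: float = 60.0, attempts: int = 5
) -> Iterator[float]:
    """Yield capped exponential delays (in seconds) with full jitter."""
    for attempt in range(attempts):
        yield random.uniform(0, min(cap, base * 2 ** attempt))


# Error body mirroring the fields shown in the report (message shortened).
sample = {
    "errors": [
        {
            "message": "Exceeded rate limits: too many api requests ...",
            "domain": "usageLimits",
            "reason": "rateLimitExceeded",
        }
    ]
}

if is_rate_limited(sample):
    for delay in backoff_delays():
        # A real caller would sleep for `delay` seconds, then retry the request.
        print(f"would wait {delay:.2f}s before retrying")
```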
   
   
   Imported from Jira [BEAM-6831](https://issues.apache.org/jira/browse/BEAM-6831). Original Jira may contain additional context.
   Reported by: pesach.

