Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 19:34:15 UTC

[GitHub] [beam] kennknowles opened a new issue, #18660: Request payload size exceeds the limit: 10485760 bytes

kennknowles opened a new issue, #18660:
URL: https://github.com/apache/beam/issues/18660

   I wrote a Python Dataflow job to read data from BigQuery, apply some transforms, and save the result as a BigQuery table.
   
   I tested with 8 days of data and it works fine. When I scaled to 180 days, I got the error below:
   
   ```"message": "Request payload size exceeds the limit: 10485760 bytes.",```
   
   
   ```apitools.base.py.exceptions.HttpError: HttpError accessing <https://dataflow.googleapis.com/v1b3/projects/careem-mktg-dwh/locations/us-central1/jobs?alt=json\>: response: <{'status': '400', 'content-length': '145', 'x-xss-protection': '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF', '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Wed, 10 Jan 2018 22:49:32 GMT', 'x-frame-options': 'SAMEORIGIN', 'alt-svc': 'hq=":443"; ma=2592000; quic=51303431; quic=51303339; quic=51303338; quic=51303337; quic=51303335,quic=":443"; ma=2592000; v="41,39,38,37,35"', 'content-type': 'application/json; charset=UTF-8'}\>, content <{
   "error": {
   "code": 400,
   "message": "Request payload size exceeds the limit: 10485760 bytes.",
   "status": "INVALID_ARGUMENT"
   }
   
   ```
   
   
   In short, this is what I'm doing:
   1 - Reading data from a BigQuery table using ```beam.io.BigQuerySource```
   2 - Partitioning each day using ```beam.Partition```
   3 - Applying transforms to each partition and combining some output PCollections.
   4 - After the transforms, the results are saved to a BigQuery date-partitioned table.
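   
   For reference, step 2 requires a partition function with the `(element, num_partitions)` signature that `beam.Partition` expects. A minimal sketch of such a per-day function, assuming each row is a dict carrying an ISO-formatted `date` field (the field name, window size, and start date are hypothetical, not from the original report):
   
   ```python
   from datetime import date
   
   NUM_DAYS = 180
   START = date(2017, 7, 1)  # hypothetical start of the 180-day window
   
   def by_day(row, num_partitions):
       """Map a row to a day index so beam.Partition can split the
       range into per-day partitions. Signature matches what
       beam.Partition passes: (element, num_partitions)."""
       day = date.fromisoformat(row['date'])
       index = (day - START).days
       # Clamp out-of-range dates into the first/last partition.
       return min(max(index, 0), num_partitions - 1)
   
   # Usage inside a pipeline (sketch):
   #   partitions = rows | beam.Partition(by_day, NUM_DAYS)
   ```
   
   With 180 partitions, each partition's transforms are added to the job graph separately, which inflates the serialized job description sent in the `jobs` request shown in the traceback.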
   
   Imported from Jira [BEAM-3455](https://issues.apache.org/jira/browse/BEAM-3455). Original Jira may contain additional context.
   Reported by: unais.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] baeminbo commented on issue #18660: Request payload size exceeds the limit: 10485760 bytes

Posted by GitBox <gi...@apache.org>.
baeminbo commented on issue #18660:
URL: https://github.com/apache/beam/issues/18660#issuecomment-1307331745

   I read [the discussion](https://lists.apache.org/thread/ty2pgo5xmpf3l9z3sdk2jv8fnmw45nmw) in the mailing list. I believe Dataflow does not have a 10 MB request size limit. IIUC, there is only a size limit for a single element value in Streaming Engine (80 MB) and a `CommitRequest` limit (2 GB?). See https://cloud.google.com/dataflow/quotas.
   
   In the meantime, Pub/Sub still has a 10 MB limit for "Publish request" and "Message size": https://cloud.google.com/pubsub/quotas#resource_limits.
   
   Can we ask the reporter to confirm if this issue still happens with the latest Beam version?

