Posted to issues@beam.apache.org by "Egbert (JIRA)" <ji...@apache.org> on 2019/04/18 07:59:00 UTC

[jira] [Created] (BEAM-7107) PubsubIO may exceed maximum payload size

Egbert created BEAM-7107:
----------------------------

             Summary: PubsubIO may exceed maximum payload size
                 Key: BEAM-7107
                 URL: https://issues.apache.org/jira/browse/BEAM-7107
             Project: Beam
          Issue Type: Bug
          Components: io-java-gcp
    Affects Versions: 2.11.0
            Reporter: Egbert


In a batch job on Dataflow that reads payloads and metadata from a BigQuery table and publishes them to Pub/Sub via PubsubIO, I sometimes hit errors like this:
{noformat}
com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
"message" : "Request payload size exceeds the limit: 10485760 bytes.",
{noformat}

The PubsubIO Javadoc says it uses the global limit of 10 MiB by default, but apparently that doesn't hold in all circumstances. I'm handling relatively large records here, up to 600 KiB per message.
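
For reference, here is a minimal sketch of the kind of pipeline that triggers this; the project, dataset, table, topic, and field names below are placeholders, not the actual job:
{code:java}
import com.google.api.services.bigquery.model.TableRow;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptor;

public class BigQueryToPubsub {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    p.apply("ReadFromBigQuery",
            // Placeholder table; each row carries a payload column plus metadata.
            BigQueryIO.readTableRows().from("my-project:my_dataset.my_table"))
     .apply("ToPubsubMessage",
            MapElements.into(TypeDescriptor.of(PubsubMessage.class))
                .via((TableRow row) -> new PubsubMessage(
                    // Individual records can be up to ~600 KiB.
                    ((String) row.get("payload")).getBytes(StandardCharsets.UTF_8),
                    Collections.singletonMap("source", "bigquery"))))
     .apply("WriteToPubsub",
            // Per the Javadoc this should batch within the 10 MiB request
            // limit by default, yet some requests still exceed it.
            PubsubIO.writeMessages().to("projects/my-project/topics/my-topic"));

    p.run().waitUntilFinish();
  }
}
{code}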
 
Adding
{code:java}
.withMaxBatchBytesSize(5242880){code}
after
{code:java}
PubsubIO.writeMessages().to(topic){code}
fixes this issue.
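
In context, the write transform with the workaround applied looks like this (topic name is again a placeholder); 5242880 bytes is 5 MiB, half the documented limit, which presumably leaves headroom for message attributes and request overhead:
{code:java}
PubsubIO.writeMessages()
    .to("projects/my-project/topics/my-topic")
    // Cap each publish batch at 5 MiB, well under the 10 MiB request limit.
    .withMaxBatchBytesSize(5242880)
{code}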

 


