Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/02/26 12:16:22 UTC

[GitHub] [beam] Fokko commented on pull request #14081: [BEAM-11865] Add readMessagesWithAttributesWithCoderAndParseFn to PubSubIO

Fokko commented on pull request #14081:
URL: https://github.com/apache/beam/pull/14081#issuecomment-786612636


   Thanks for looking at this @dpcollins-google @pabloem 
   
   > I don't have decision power on PubsubIO, but I'd really rather have fewer configuration options here to maintain, not more.
   
   I agree here, but the builder is private, so this is the only way to make this possible.
   
   > Is there any reason you can't use `readMessagesWithAttributesAndMessageId()` followed by a MapElements?
   
   I currently use `readMessagesWithAttributes()` followed by a `ParDo` to do this. However, I'm running into a Dataflow limit: "The job graph is too large. Please try again with a smaller job graph, or split your job into two or more smaller jobs."
   
   A `ParDo` is overkill here, since parsing the messages is rather simple: I pick the parser based on the header and apply it to the message. That would fit perfectly in a `SimpleFunction` and reduce the size of the job graph.
   
   I need the headers since they contain the message type and the version, so I know how to parse the message.
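
   As a rough sketch of the kind of header-based parsing I mean (plain Java with no Beam dependency; the attribute names `type` and `version`, the `order` message type, and the parser bodies are made up for illustration and are not taken from the actual job):

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: dispatch to a parser based on message attributes (headers).
class HeaderDispatch {
    // Hypothetical attribute names; the real headers may be named differently.
    static final String TYPE_ATTR = "type";
    static final String VERSION_ATTR = "version";

    // Hypothetical parsers keyed by "<type>:<version>".
    static final Map<String, Function<byte[], String>> PARSERS = Map.of(
        "order:1", payload -> "OrderV1(" + utf8(payload) + ")",
        "order:2", payload -> "OrderV2(" + utf8(payload).trim() + ")");

    static String utf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Looks up the parser for the message's type and version and applies it to the payload.
    static String parse(Map<String, String> attributes, byte[] payload) {
        String key = attributes.get(TYPE_ATTR) + ":" + attributes.get(VERSION_ATTR);
        Function<byte[], String> parser = PARSERS.get(key);
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered for " + key);
        }
        return parser.apply(payload);
    }
}
```

   In a pipeline, logic like `parse` would presumably live inside a `SimpleFunction` over `PubsubMessage`, reading the headers with `getAttribute` and the body with `getPayload`, instead of a separate `ParDo` step.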
   
   Let me know if this is enough context and whether we can get this in. I'm happy to answer any further questions or concerns.
   

