Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 16:30:08 UTC

[GitHub] [beam] damccorm opened a new issue, #20309: Removing Invalid JSON messages from PCollection before starting BigQueryIO Operations

damccorm opened a new issue, #20309:
URL: https://github.com/apache/beam/issues/20309

   In a typical setup of Pub/Sub and Cloud Dataflow, a Pub/Sub subscriber might receive messages that are not valid JSON. The BigQuery insert operation fails to process these messages, and the worker may be terminated if the exception is not handled correctly.
   
   The likelihood of receiving invalid JSON messages is very low, and the upstream component publishing messages to the topic should validate them on its end. However, this is not always the case, and the application should be robust enough to survive even when upstream components publish malformed messages.
   
   I have created an Enum that acts as a predicate in the Filter transform. This is standard JSON validation logic, and I would like to add it to the Java SDK (and Python) in the Filter transform.
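
The Enum mentioned above is not included in this message, so the following is only a minimal Python sketch of the kind of predicate being proposed, not the author's implementation. The name `is_valid_json_object` is hypothetical, and the requirement that a message decode to a JSON object (rather than an array or a scalar) is an assumption about what a BigQuery row needs. The sketch uses only the standard `json` module; in a Beam Python pipeline, a predicate like this could presumably be passed to `beam.Filter` ahead of the BigQuery write.

```python
import json


def is_valid_json_object(message):
    """Return True if message (str or bytes) parses as a JSON object.

    Hypothetical helper for illustration; not part of the Beam SDK.
    """
    try:
        parsed = json.loads(message)
    except (ValueError, TypeError):
        # ValueError covers json.JSONDecodeError and undecodable bytes;
        # TypeError covers non-string input such as None.
        return False
    # Assumption: a message destined for a BigQuery row must be a JSON object.
    return isinstance(parsed, dict)


# Plain-Python usage: keep only the messages that pass the predicate.
messages = [b'{"id": 1}', b'{"id": ', b'[1, 2]', '{"name": "ok"}']
valid = [m for m in messages if is_valid_json_object(m)]
```

Filtering this way silently drops malformed messages. A pipeline that needs to inspect them later might route them to a separate output instead of discarding them.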
   
   Imported from Jira [BEAM-9873](https://issues.apache.org/jira/browse/BEAM-9873). Original Jira may contain additional context.
   Reported by: varun.sharma.

