Posted to issues@beam.apache.org by "Sergei Lilichenko (Jira)" <ji...@apache.org> on 2021/11/01 16:07:00 UTC

[jira] [Created] (BEAM-13158) Improve BigQueryIO Storage Write API data validation error handling

Sergei Lilichenko created BEAM-13158:
----------------------------------------

             Summary: Improve BigQueryIO Storage Write API data validation error handling
                 Key: BEAM-13158
                 URL: https://issues.apache.org/jira/browse/BEAM-13158
             Project: Beam
          Issue Type: New Feature
          Components: io-java-gcp
    Affects Versions: 2.33.0
            Reporter: Sergei Lilichenko


A single invalid row causes the BigQueryIO transform, and with it the whole pipeline, to fail. The desired behavior is to make error handling configurable: either fail on any validation failure (the current behavior) or return the list of failed records through the WriteResult.

There are two places where the exception occurs: the JSON-to-protobuf conversion and the BigQuery backend.

Example of the exception caused by the conversion:

io.grpc.StatusRuntimeException: INVALID_ARGUMENT: The proto field mismatched with BigQuery field at D586b3f9a_1543_4dbe_87ff_ef786d6803c2.bytes_sent, the proto field type string, BigQuery field type INTEGER Entity: projects/event-processing-demo/datasets/bigquery_io/tables/events/streams/Cic2MzUyMTYxYy0wMDAwLTI2MjktOGVjYy1mNDAzMDQ1ZWY5Y2U6czI

Example of the exception caused by the BigQuery backend: 

io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Field dst_ip: STRING(15) has maximum length 15 but got a value with length 54 Entity: projects/event-processing-demo/datasets/bigquery_io/tables/events/streams/CiQ2MzRkOGM5Mi0wMDAwLTI2MjktOGVjYy1mNDAzMDQ1ZWY5Y2U
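
To make the requested behavior concrete, here is a minimal, self-contained sketch of the two error-handling modes. It does not use the Beam or BigQuery APIs. The names FailureMode, FailedRow, Result, validate and write are hypothetical, and the local checks on bytes_sent and dst_ip only stand in for the two failures shown above (in the real pipeline the length check is enforced by the BigQuery backend, not by the client).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RowValidationSketch {

    /** Hypothetical policy names; not part of the Beam API. */
    enum FailureMode { FAIL_ON_ERROR, COLLECT_FAILED_ROWS }

    /** A rejected row together with the reason it was rejected. */
    static final class FailedRow {
        final Map<String, Object> row;
        final String error;

        FailedRow(Map<String, Object> row, String error) {
            this.row = row;
            this.error = error;
        }
    }

    /** Hypothetical stand-in for the failed-record list a WriteResult could expose. */
    static final class Result {
        final List<Map<String, Object>> accepted = new ArrayList<>();
        final List<FailedRow> failed = new ArrayList<>();
    }

    /** Returns null for a valid row, otherwise a message describing the first problem. */
    static String validate(Map<String, Object> row) {
        // Mirrors the conversion failure: a string supplied for an INTEGER field.
        Object bytesSent = row.get("bytes_sent");
        if (bytesSent != null && !(bytesSent instanceof Integer || bytesSent instanceof Long)) {
            return "bytes_sent: expected INTEGER, got " + bytesSent.getClass().getSimpleName();
        }
        // Mirrors the backend failure: a value longer than the STRING(15) limit.
        Object dstIp = row.get("dst_ip");
        if (dstIp instanceof String && ((String) dstIp).length() > 15) {
            return "dst_ip: maximum length 15, got " + ((String) dstIp).length();
        }
        return null;
    }

    /** Either throws on the first invalid row or collects invalid rows, depending on mode. */
    static Result write(List<Map<String, Object>> rows, FailureMode mode) {
        Result result = new Result();
        for (Map<String, Object> row : rows) {
            String error = validate(row);
            if (error == null) {
                result.accepted.add(row);
            } else if (mode == FailureMode.FAIL_ON_ERROR) {
                throw new IllegalArgumentException(error);
            } else {
                result.failed.add(new FailedRow(row, error));
            }
        }
        return result;
    }

    /** Small helper for building example rows. */
    static Map<String, Object> row(Object bytesSent, String dstIp) {
        Map<String, Object> r = new HashMap<>();
        r.put("bytes_sent", bytesSent);
        r.put("dst_ip", dstIp);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row(1024L, "10.0.0.1"));
        rows.add(row("1024", "10.0.0.2")); // wrong type for bytes_sent
        rows.add(row(2048L, "2001:0db8:85a3:0000:0000:8a2e:0370:7334")); // too long for STRING(15)

        Result result = write(rows, FailureMode.COLLECT_FAILED_ROWS);
        System.out.println(result.accepted.size() + " accepted, " + result.failed.size() + " failed");
        for (FailedRow f : result.failed) {
            System.out.println(f.error);
        }
    }
}
```

With COLLECT_FAILED_ROWS the valid rows are still written and the rejected ones can be routed elsewhere, for example to a dead-letter table. FAIL_ON_ERROR keeps the current all-or-nothing behavior.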



--
This message was sent by Atlassian Jira
(v8.3.4#803005)