Posted to issues@beam.apache.org by "Sergei Lilichenko (Jira)" <ji...@apache.org> on 2021/11/01 16:07:00 UTC
[jira] [Created] (BEAM-13158) Improve BigQueryIO Storage Write API data validation error handling
Sergei Lilichenko created BEAM-13158:
----------------------------------------
Summary: Improve BigQueryIO Storage Write API data validation error handling
Key: BEAM-13158
URL: https://issues.apache.org/jira/browse/BEAM-13158
Project: Beam
Issue Type: New Feature
Components: io-java-gcp
Affects Versions: 2.33.0
Reporter: Sergei Lilichenko
A single invalid row currently causes the BigQueryIO transform, and with it the whole pipeline, to fail. The desired behavior is configurable error handling: either fail on any validation failure (the current behavior) or return the failed records through the WriteResult.
The exception can originate in two places: the JSON-to-protobuf conversion and the BigQuery backend.
Example of the exception caused by the conversion:
io.grpc.StatusRuntimeException: INVALID_ARGUMENT: The proto field mismatched with BigQuery field at D586b3f9a_1543_4dbe_87ff_ef786d6803c2.bytes_sent, the proto field type string, BigQuery field type INTEGER Entity: projects/event-processing-demo/datasets/bigquery_io/tables/events/streams/Cic2MzUyMTYxYy0wMDAwLTI2MjktOGVjYy1mNDAzMDQ1ZWY5Y2U6czI
Example of the exception caused by the BigQuery backend:
io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Field dst_ip: STRING(15) has maximum length 15 but got a value with length 54 Entity: projects/event-processing-demo/datasets/bigquery_io/tables/events/streams/CiQ2MzRkOGM5Mi0wMDAwLTI2MjktOGVjYy1mNDAzMDQ1ZWY5Y2U
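To make the request concrete, here is a minimal, self-contained sketch of the two error-handling modes described above. It does not use Beam. Every name in it (ValidationPolicySketch, OnInvalidRow, FailedRow, Result, validate, write) is hypothetical and only illustrates the idea, and the two checks simply mimic the bytes_sent type mismatch and the dst_ip length limit from the exceptions above. How BigQueryIO would actually expose this is left open.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these names are Beam APIs.
public class ValidationPolicySketch {

  /** Hypothetical switch: FAIL is the current behavior, COLLECT the requested one. */
  enum OnInvalidRow { FAIL, COLLECT }

  /** A rejected row together with the reason it was rejected. */
  static final class FailedRow {
    final Map<String, Object> row;
    final String reason;

    FailedRow(Map<String, Object> row, String reason) {
      this.row = row;
      this.reason = reason;
    }
  }

  /** Stand-in for a write result that also exposes the rejected rows. */
  static final class Result {
    final List<Map<String, Object>> written = new ArrayList<>();
    final List<FailedRow> failed = new ArrayList<>();
  }

  /** Returns null for a valid row, otherwise a description of the problem. */
  static String validate(Map<String, Object> row) {
    // Mimics the type mismatch: bytes_sent must be an integer (a missing value is also rejected).
    Object bytesSent = row.get("bytes_sent");
    if (!(bytesSent instanceof Long || bytesSent instanceof Integer)) {
      return "bytes_sent: expected INTEGER";
    }
    // Mimics the backend check: dst_ip is STRING(15).
    Object dstIp = row.get("dst_ip");
    if (dstIp instanceof String && ((String) dstIp).length() > 15) {
      return "dst_ip: maximum length 15 but got length " + ((String) dstIp).length();
    }
    return null;
  }

  /** Applies the chosen policy to each row: throw on the first bad row, or collect it. */
  static Result write(List<Map<String, Object>> rows, OnInvalidRow policy) {
    Result result = new Result();
    for (Map<String, Object> row : rows) {
      String problem = validate(row);
      if (problem == null) {
        result.written.add(row);
      } else if (policy == OnInvalidRow.FAIL) {
        throw new IllegalArgumentException(problem);
      } else {
        result.failed.add(new FailedRow(row, problem));
      }
    }
    return result;
  }

  /** Small helper to build a test row. */
  static Map<String, Object> row(Object bytesSent, String dstIp) {
    Map<String, Object> r = new LinkedHashMap<>();
    r.put("bytes_sent", bytesSent);
    r.put("dst_ip", dstIp);
    return r;
  }

  public static void main(String[] args) {
    List<Map<String, Object>> rows = new ArrayList<>();
    rows.add(row(512L, "10.0.0.1"));                               // valid
    rows.add(row("512", "10.0.0.2"));                              // wrong type for bytes_sent
    rows.add(row(64L, "2001:0db8:85a3:0000:0000:8a2e:0370:7334")); // dst_ip too long

    Result r = write(rows, OnInvalidRow.COLLECT);
    System.out.println("written=" + r.written.size() + " failed=" + r.failed.size());
    for (FailedRow f : r.failed) {
      System.out.println("rejected: " + f.reason);
    }
  }
}
```

With COLLECT, valid rows are still written and the rejected ones stay available to the caller, for example to route them to a dead-letter table. With FAIL, the first invalid row aborts the write, which corresponds to the behavior reported here.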
--
This message was sent by Atlassian Jira
(v8.3.4#803005)