Posted to user@beam.apache.org by OrielResearch Eila Arich-Landkof <ei...@orielresearch.org> on 2020/09/12 21:06:59 UTC

BQ pipeline fires an error

Hi all.

I am initiating the following pipeline to read from a table and write to a new table, and I receive the error below:

Table is ~60,000 rows

input_query = "select * from table"
p = beam.Pipeline(options=options)
# the first list is the root idx_and_sample[0:1]
(p | 'Step 5.1.1: read each row in ort database table ' >> beam.io.Read(beam.io.BigQuerySource(query=input_query))
   | 'Step 5.1.2: check for gene based expression value ' >> beam.ParDo(align_to_dict(VALUES_TO_MAP))  # will run one row after another
   | 'Step 5.1.3: write to new big query ' >> beam.io.WriteToBigQuery(DICTIONARY_ENFORCED_TABLE,
                                                                      create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                                                                      write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
                                                                      schema=bq_schema))
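For context, the per-row mapping step conceptually looks like the sketch below. Note this is a simplified stand-in for illustration only: `align_row` and the toy `VALUES_TO_MAP` here are hypothetical, not my actual `align_to_dict` DoFn or the real mapping table.

```python
# Toy mapping table; the real VALUES_TO_MAP is larger.
VALUES_TO_MAP = {"hi": 2, "mid": 1, "lo": 0}

def align_row(row, values_to_map):
    """Return a copy of the BigQuery row dict with any value found in
    values_to_map replaced by its dictionary-enforced value; other
    values pass through unchanged."""
    return {key: values_to_map.get(value, value) for key, value in row.items()}

row = {"gene": "BRCA1", "expression": "hi"}
print(align_row(row, VALUES_TO_MAP))
```

The real DoFn applies this kind of value substitution to every row read from the source table before the rows are streamed to the new table.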

After running the lines above (not yet running the pipeline itself), I am getting the following error:
{'FailedRows': <PCollection[[120]: step 5.1.3: write to new big query /_StreamToBigQuery/StreamInsertRows/ParDo(BigQueryWriteFn).FailedRows] at 0x7f23f0c4d4e0>}
Any idea what might be the issue?

Thanks,
——————
Eila

www.orielresearch.com <http://www.orielresearch.com/>
Meetup <https://www.meetup.com/Deep-Learning-In-Production/>