Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/10/08 10:08:00 UTC

[jira] [Work logged] (BEAM-4824) Get BigQueryIO batch loads to return something actionable

     [ https://issues.apache.org/jira/browse/BEAM-4824?focusedWorklogId=152182&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-152182 ]

ASF GitHub Bot logged work on BEAM-4824:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Oct/18 10:07
            Start Date: 08/Oct/18 10:07
    Worklog Time Spent: 10m 
      Work Description: reuvenlax commented on issue #6055: [BEAM-4824] Batch BigQueryIO returns job results
URL: https://github.com/apache/beam/pull/6055#issuecomment-427781427
 
 
   @calonso sorry I've been traveling internationally a lot recently (on vacation now) and haven't stayed on top of this.
   
   I have some concerns about adding this capability, given that there will be a new BigQuery connector that doesn't use these inserts at all, and code that relies on this functionality simply will not function with the new connector. However, as long as the capability is guarded behind Experimental (meaning that the functionality will change, and no compatibility guarantees are made), I think it's OK to add this.
   
   My biggest concern with the current PR is changing the types of PCollections (e.g. from String to BigQueryWriteResult). Several runners (Dataflow, Flink) support updating streaming pipelines. However, if one of these PCollections has changed types, the update must fail. Is there any way to structure this without changing intermediate types?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 152182)
    Time Spent: 1h 40m  (was: 1.5h)

> Get BigQueryIO batch loads to return something actionable
> ---------------------------------------------------------
>
>                 Key: BEAM-4824
>                 URL: https://issues.apache.org/jira/browse/BEAM-4824
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>            Reporter: Carlos Alonso
>            Assignee: Carlos Alonso
>            Priority: Minor
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> At the moment, BigQueryIO batch loads return an empty collection that has no information about how the load job finished. It is even returned before the job finishes.
>  
> Change it so that:
>  # The returning PCollection only appears when the job has actually finished
>  # The returning PCollection contains information about the job result



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)