Posted to commits@beam.apache.org by "Chamikara Jayalath (JIRA)" <ji...@apache.org> on 2017/11/02 22:24:00 UTC

[jira] [Resolved] (BEAM-3088) BigQuery source should consider streaming buffer when determining estimated sizes of tables

     [ https://issues.apache.org/jira/browse/BEAM-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chamikara Jayalath resolved BEAM-3088.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0

> BigQuery source should consider streaming buffer when determining estimated sizes of tables
> -------------------------------------------------------------------------------------------
>
>                 Key: BEAM-3088
>                 URL: https://issues.apache.org/jira/browse/BEAM-3088
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-gcp
>            Reporter: Chamikara Jayalath
>            Assignee: Chamikara Jayalath
>            Priority: Major
>             Fix For: 2.3.0
>
>
> Currently, the BigQuery table source determines the estimated table size using the table.numBytes property:
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryTableSource.java#L100
> If a BigQuery table has data in the streaming buffer, the size of that data is not reflected in table.numBytes. To estimate the table size more accurately, data in the streaming buffer has to be considered as well. The size of data in the streaming buffer can be determined from the streamingBuffer.estimatedBytes property, as documented here:
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables
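The estimate described in the issue can be sketched roughly as below. This is an illustration only, not the change that was committed for BEAM-3088. The class TableSizeEstimate and the method estimatedSizeBytes are hypothetical names, and the parameter types (a nullable Long for table.numBytes, a nullable BigInteger for streamingBuffer.estimatedBytes) are assumptions chosen so the example runs without the BigQuery client library.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Beam or the BigQuery client library.
public class TableSizeEstimate {

    // Returns numBytes plus the streaming buffer estimate.
    // Assumption: either value may be absent (null), e.g. when the table
    // has no streaming buffer; an absent value is treated as zero.
    static long estimatedSizeBytes(Long numBytes, BigInteger streamingBufferEstimatedBytes) {
        BigInteger total = BigInteger.valueOf(numBytes == null ? 0L : numBytes);
        if (streamingBufferEstimatedBytes != null) {
            total = total.add(streamingBufferEstimatedBytes);
        }
        // Clamp instead of overflowing if the sum does not fit in a long.
        BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
        return total.compareTo(max) > 0 ? Long.MAX_VALUE : total.longValue();
    }

    public static void main(String[] args) {
        // Table with 1000 stored bytes and 250 bytes still in the streaming buffer.
        System.out.println(estimatedSizeBytes(1000L, BigInteger.valueOf(250)));
        // Table with no streaming buffer: the estimate falls back to numBytes alone.
        System.out.println(estimatedSizeBytes(1000L, null));
    }
}
```

In a real source, the two inputs would come from the table metadata returned by the BigQuery API; how those fields are read is left out here on purpose.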



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)