Posted to commits@airflow.apache.org by "Tim Swast (JIRA)" <ji...@apache.org> on 2018/08/17 19:42:00 UTC

[jira] [Commented] (AIRFLOW-2772) BigQuery hook does not allow specifying both the partition field name and table name at the same time

    [ https://issues.apache.org/jira/browse/AIRFLOW-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584321#comment-16584321 ] 

Tim Swast commented on AIRFLOW-2772:
------------------------------------

In my opinion, libraries such as Airflow should not be doing this kind of client-side validation at all.

> BigQuery hook does not allow specifying both the partition field name and table name at the same time
> -----------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-2772
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2772
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hooks
>            Reporter: Berislav Lopac
>            Priority: Minor
>
> When creating a load job for a single partition in a BigQuery partitioned table, it is possible to specify either the table name with the partition (e.g. {{dataset_name.table_name$partition_id}}), or the field used for the partition (e.g. {{time_partitioning=\{"field": "field_name"\}}}) -- but not both.
> This is the code that raises the exception, at the very end of {{contrib/hooks/bigquery_hook.py}}:
> {code}
>         assert not time_partitioning_in.get('field'), (
>             "Cannot specify field partition and partition name "
>             "(dataset.table$partition) at the same time"
>         )
> {code}
> My first problem is the use of {{assert}} for flow control, but more importantly, the rationale for this check, and for raising an error when both are defined, is not clear. The code works well if we provide just the partition field specification, but passing only the table name with the partition results in the following BQ error:
> {code}Incompatible table partitioning specification. Expects partitioning specification interval(type:day,field:local_event_start_date), but input partitioning specification is interval(type:day){code}
> which implies that sending both should be perfectly fine.
> Can anyone provide any insight?
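
For illustration only, the {{assert}} concern raised in the description could be addressed along the following lines if the check were kept at all. This is a minimal sketch, not the hook's actual code: the function name validate_partition_args and its two parameters are hypothetical, and only the error message is taken from the snippet quoted above. An explicit exception is used because Python strips assert statements when run with the -O flag, so an assert cannot be relied on for validation.

```python
def validate_partition_args(destination_table, time_partitioning):
    """Hypothetical helper (not part of Airflow): reject a '$partition'
    decorator on the table name combined with a partition field."""
    has_decorator = "$" in destination_table
    has_field = bool(time_partitioning.get("field"))
    if has_decorator and has_field:
        # Explicit exception instead of assert, so the check still runs
        # when Python is started with -O.
        raise ValueError(
            "Cannot specify field partition and partition name "
            "(dataset.table$partition) at the same time"
        )


# Either form on its own passes the check.
validate_partition_args("dataset_name.table_name$20180817", {})
validate_partition_args("dataset_name.table_name", {"field": "field_name"})
```

Whether the check should exist in the first place is a separate question; the BigQuery error quoted above suggests the service may accept, or even expect, both values together.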



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)