Posted to commits@airflow.apache.org by "shaneikennedy (via GitHub)" <gi...@apache.org> on 2023/02/15 12:33:36 UTC

[GitHub] [airflow] shaneikennedy opened a new issue, #11911: BigQuery support for create or replace table

shaneikennedy opened a new issue, #11911:
URL: https://github.com/apache/airflow/issues/11911

   **Description**
   
   BigQuery supports multiple [create table statements](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_table_statement), one of which is `CREATE OR REPLACE TABLE`.
   
   **Use case / motivation**
   
   This would be really nice for batch processing because I can write a DAG that is: `create-table >> insert-data` and the operation is idempotent. Right now, the BigQueryCreateEmptyTable operator fails if the table already exists, which means my DAG needs some logic to decide whether this operator should actually run.
   
   
   **Related Issues**
   
   Not that I could find
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] commented on issue #11911: BigQuery support for create or replace table

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-1599761211

   This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.


[GitHub] [airflow] vchiapaikeo commented on issue #11911: BigQuery support for create or replace table

Posted by "vchiapaikeo (via GitHub)" <gi...@apache.org>.
vchiapaikeo commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-1435343231

   It doesn't look like the [API supports a replace_if_exists / delete_if_exists parameter](https://github.com/googleapis/python-bigquery/blob/v2.34.4/google/cloud/bigquery/client.py#L702-L761). It will simply raise a `google.cloud.exceptions.Conflict` if the table exists, unless `exists_ok` is set to True. 
   
   Is the desired behavior for a `delete_if_exists` or `replace_if_exists` flag to delete the table and recreate it if the table already exists? Also, is it okay if this type of operation is not atomic? We will need to delete the table first and then recreate it. I'm not sure how CREATE OR REPLACE TABLE is implemented under the hood.
   
   It's also unclear how such a `replace_if_exists` parameter would work with the `exists_ok` parameter. Should the `replace_if_exists` parameter take precedence? Or should we only do the delete/recreate if `exists_ok` is set to False and `replace_if_exists` is set to True? I kinda prefer that approach so it's clear that the user is not okay with the table existing. (`exists_ok` currently defaults to False and a `replace_if_exists` param would obviously default to False as well).
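
   The flag interplay discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeClient`, `Conflict`, and `create_empty_table` are hypothetical in-memory stand-ins written for this example, not the real BigQuery client or the Airflow operator, and `replace_if_exists` is the proposed parameter, which does not exist today. The sketch follows the preferred rule that a replace only happens when `exists_ok` is False and `replace_if_exists` is True.

```python
class Conflict(Exception):
    """Stand-in for google.cloud.exceptions.Conflict (raised when the table exists)."""


class FakeClient:
    """Hypothetical in-memory stand-in for a BigQuery client; not the real API."""

    def __init__(self):
        self.tables = {}  # table_id -> list of rows

    def create_table(self, table_id, exists_ok=False):
        if table_id in self.tables:
            if exists_ok:
                return  # keep the existing table and its rows untouched
            raise Conflict(table_id)
        self.tables[table_id] = []

    def delete_table(self, table_id):
        # Deleting a missing table is a no-op in this sketch.
        self.tables.pop(table_id, None)


def create_empty_table(client, table_id, exists_ok=False, replace_if_exists=False):
    """Proposed semantics: replace only if exists_ok is False and replace_if_exists is True."""
    if replace_if_exists and not exists_ok:
        # Two separate calls, so this is NOT atomic: a failure between
        # them leaves the caller with no table at all.
        client.delete_table(table_id)
    client.create_table(table_id, exists_ok=exists_ok)
```

   With both flags left at False an existing table still raises the conflict error, and with `exists_ok=True` the existing rows survive regardless of `replace_if_exists`, so a destructive replace always has to be requested explicitly.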


[GitHub] [airflow] github-actions[bot] commented on issue #11911: BigQuery support for create or replace table

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-1620849845

   This issue has been closed because it has not received a response from the issue author.


[GitHub] [airflow] eladkal commented on issue #11911: BigQuery support for create or replace table

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-1181859027

   I believe this is already implemented.
   See `exists_ok` parameter
   https://github.com/apache/airflow/blob/acaa0635c8477c98ab78da9f6d86e6f1bad2737d/airflow/providers/google/cloud/operators/bigquery.py#L784


[GitHub] [airflow] github-actions[bot] closed issue #11911: BigQuery support for create or replace table

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #11911: BigQuery support for create or replace table
URL: https://github.com/apache/airflow/issues/11911


[GitHub] [airflow] eladkal closed issue #11911: BigQuery support for create or replace table

Posted by GitBox <gi...@apache.org>.
eladkal closed issue #11911: BigQuery support for create or replace table
URL: https://github.com/apache/airflow/issues/11911


[GitHub] [airflow] jkukul commented on issue #11911: BigQuery support for create or replace table

Posted by "jkukul (via GitHub)" <gi...@apache.org>.
jkukul commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-1431290823

   @eladkal The `exists_ok` parameter doesn't allow replacing a table if it exists. It simply makes the operator skip creation if the table already exists. It corresponds to the `CREATE TABLE IF NOT EXISTS` DDL statement.
   
   This feature request is about adding a parameter to mimic the `CREATE OR REPLACE TABLE` DDL. It remains unresolved and I believe it should be re-opened. 
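
   To make the distinction concrete, here is a minimal sketch that maps the two behaviours onto the DDL statements named above. The helper `create_table_ddl` and its `replace` argument are hypothetical names made up for this illustration; nothing like it exists in the provider. It only builds the SQL text, which would still have to be submitted to BigQuery as a query by some other means.

```python
def create_table_ddl(table, columns, exists_ok=False, replace=False):
    """Build a BigQuery CREATE TABLE statement for one of three modes.

    columns is a list of (name, type) pairs, e.g. [("id", "INT64")].
    """
    if exists_ok and replace:
        # BigQuery DDL does not allow OR REPLACE together with IF NOT EXISTS.
        raise ValueError("IF NOT EXISTS and OR REPLACE cannot be combined")
    cols = ", ".join(f"{name} {type_}" for name, type_ in columns)
    if replace:
        head = f"CREATE OR REPLACE TABLE `{table}`"    # the requested behaviour
    elif exists_ok:
        head = f"CREATE TABLE IF NOT EXISTS `{table}`"  # what exists_ok mirrors
    else:
        head = f"CREATE TABLE `{table}`"               # fails if the table exists
    return f"{head} ({cols})"


print(create_table_ddl("my_dataset.events", [("id", "INT64")], replace=True))
```

   Unlike a delete followed by a create, the `CREATE OR REPLACE TABLE` statement is a single request, which is why it suits the idempotent `create-table >> insert-data` pattern from the issue description.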


[GitHub] [airflow] eladkal commented on issue #11911: BigQuery support for create or replace table

Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-1431300066

   thanks for the clarification.
   @jkukul would you like to raise a PR?


[GitHub] [airflow] shahar1 commented on issue #11911: BigQuery support for create or replace table

Posted by "shahar1 (via GitHub)" <gi...@apache.org>.
shahar1 commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-1517652005

   > It doesn't look like the [API supports a replace_if_exists / delete_if_exists parameter](https://github.com/googleapis/python-bigquery/blob/v2.34.4/google/cloud/bigquery/client.py#L702-L761). It will simply raise a `google.cloud.exceptions.Conflict` if the table exists, unless `exists_ok` is set to True.
   > 
   > Is the desired behavior for a `delete_if_exists` or `replace_if_exists` flag to delete the table and recreate it if the table already exists? Also, is it okay if this type of operation is not atomic? We will need to delete the table first and then recreate it. I'm not sure how CREATE OR REPLACE TABLE in standard BQ SQL is implemented under the hood.
   > 
   > It's also unclear how such a `replace_if_exists` parameter would work with the `exists_ok` parameter. Should the `replace_if_exists` parameter take precedence? Or should we only do the delete/recreate if `exists_ok` is set to False and `replace_if_exists` is set to True? I kinda prefer that approach so it's clear that the user is not okay with the table existing. (exists_ok currently defaults to False and a replace_if_exists param would obviously default to False as well).
   > 
   > An aside - for tables that get appended to, this operation could be quite dangerous - since a user will lose all their historical data as a result. Just worth mentioning that this should only be done for tables that are truncated / recreated and are okay with non-atomicity.
   
   I agree with your statement. As long as the BQ API doesn't natively support this as an atomic operation, I don't see a good reason to maintain a specific operator for it.


[GitHub] [airflow] eladkal commented on issue #11911: BigQuery support for create or replace table

Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-1517668862

   So the preferred action is to do nothing until the BigQuery API supports this action natively?
   If so, I will close this ticket.

