Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/25 12:28:29 UTC
[GitHub] [airflow] joppevos opened a new pull request #8556: Missing example dags/system tests for google services
joppevos opened a new pull request #8556:
URL: https://github.com/apache/airflow/pull/8556
Partly fixes the following [issue](https://github.com/apache/airflow/issues/8280#issue-599125821)
Renamed the sample file to match the operator file.
Wrote a system test for example_gcs_to_bigquery.py
Small syntax correction in the example file to make it work properly.
---
Make sure to mark the boxes below before creating PR:
- [x] Description above provides context of the change
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Target Github ISSUE in description if exists
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
---
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415115715
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
delete_test_dataset = BashOperator(
task_id='delete_airflow_test_dataset',
- bash_command='bq rm -rf airflow_test',
+ bash_command='bq rm -r -f airflow_test',
Review comment:
Interesting. So if I create the environment variables in the file like this.
`DATASET = os.environ.get("GCP_DATASET_NAME", 'airflow_test')`
They will be picked up and adjusted to a unique name by the script?
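The pattern being discussed can be sketched minimally as follows. This is a hedged illustration, not the PR's final code: `GCP_DATASET_NAME` is the variable name from the comment above, and the harness that exports a unique per-run value is assumed.

```python
import os

# The DAG file reads the dataset name from the environment at parse time;
# a test harness can export a unique GCP_DATASET_NAME per run so that
# concurrent test runs do not collide on the same BigQuery dataset.
DATASET = os.environ.get("GCP_DATASET_NAME", "airflow_test")

# The cleanup command then targets the per-run dataset instead of a
# hard-coded name.
delete_cmd = f"bq rm -r -f {DATASET}"
```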
[GitHub] [airflow] joppevos commented on pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619926960
@potiuk Thanks. I had never heard of or used pre-commit, but I will definitely get started with it. I'm new to the whole CI workflow but always happy to learn. I already felt that the way I did it was probably not the right way to go :sweat_smile:
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415116924
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
delete_test_dataset = BashOperator(
task_id='delete_airflow_test_dataset',
- bash_command='bq rm -rf airflow_test',
+ bash_command='bq rm -r -f airflow_test',
Review comment:
Yes. Exactly. Each test run has its own runtime environment defined by ... environment variables. This is not yet fully in place in the community infrastructure, but that is how it will work. At [Polidea](polidea.com), we launch system tests on CI in this way.
[GitHub] [airflow] potiuk commented on pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619873329
Once you delete the "wrong" licence and re-run pre-commit, it will add the licences correctly where they are missing. In this case, `pre-commit run insert-license --all-files` should do the job for you. Then you can add, `commit --amend`, and re-push (ideally rebasing first).
Then you will be able
[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415691161
##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -8,7 +8,7 @@
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
-#
+
Review comment:
@potiuk My thanks. I should have checked the file differences before and after.
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619372129
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
Here are some useful points:
- Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
- Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally; it's a heavy Docker setup, but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
- Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://apache-airflow-slack.herokuapp.com/
[GitHub] [airflow] joppevos edited a comment on pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
joppevos edited a comment on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-622317738
@mik-laj gentle poke, ready to be re-reviewed :) Jarek assured me that the failing quarantine tests are nothing to worry about.
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416608495
##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -29,6 +32,15 @@
TEST_SOURCE_OBJECTS = ['test/objects/*']
+@pytest.mark.backend("mysql", "postgres")
Review comment:
In the future, we will want to move them to a separate directory, because they do not test the `airflow.providers.google.cloud.operators.gcs_to_bigquery` module, but the example DAG.
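The marker usage under discussion can be sketched roughly as below. This is a hedged illustration: the class name is hypothetical, and it assumes the suite registers a custom `backend` pytest marker (as the diff snippet above suggests) to restrict system tests to the listed metadata-database backends.

```python
import pytest

# Hypothetical class name for illustration; the custom "backend" marker
# restricts this system test to runs against the listed backends.
@pytest.mark.backend("mysql", "postgres")
class TestGcsToBigQueryExampleDag:
    def test_run_example_dag_gcs_to_bigquery_operator(self):
        # The real test would execute the example DAG end to end rather
        # than unit-test the gcs_to_bigquery operator module itself.
        pass
```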
[GitHub] [airflow] mik-laj commented on pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-620586208
I am running the system tests; when everything works, I will accept the change.
[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415092152
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -29,7 +29,7 @@
}
dag = models.DAG(
- dag_id='example_gcs_to_bq_operator', default_args=args,
+ dag_id='example_gcs_to_bigquery_operator', default_args=args,
schedule_interval=None, tags=['example'])
create_test_dataset = BashOperator(
Review comment:
Does Breeze log you in to the Google Cloud CLI? All I did was provide the credentials. Using the operators seems like a better and cleaner solution indeed. I will rework it.
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415077858
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
delete_test_dataset = BashOperator(
task_id='delete_airflow_test_dataset',
- bash_command='bq rm -rf airflow_test',
+ bash_command='bq rm -r -f airflow_test',
Review comment:
There is a test-isolation problem here: when two tests run simultaneously, they will conflict. Can you move the dataset name to an environment variable? We want to generate environment variables to ensure isolation between tests.
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415077499
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -29,7 +29,7 @@
}
dag = models.DAG(
- dag_id='example_gcs_to_bq_operator', default_args=args,
+ dag_id='example_gcs_to_bigquery_operator', default_args=args,
schedule_interval=None, tags=['example'])
create_test_dataset = BashOperator(
Review comment:
Unfortunately, this command will fail if you are not logged in to gcloud. I'm working to fix it.
https://github.com/apache/airflow/pull/8432
Can you run this command in setUp/tearDown or use dedicated operators?
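The setUp/tearDown alternative could look roughly like this. A hedged sketch only: `run_command` is a hypothetical stand-in for whatever command executor the test suite provides (not Airflow's actual API), and the dataset name is the hard-coded one from the example.

```python
import subprocess
import unittest

# Hypothetical helper standing in for the suite's command executor.
def run_command(cmd):
    subprocess.run(cmd, shell=True, check=True)

class GcsToBigQueryExampleDagSystemTest(unittest.TestCase):
    DATASET = "airflow_test"

    def setUp(self):
        # Create the dataset outside the DAG, so the example DAG itself
        # no longer needs a logged-in gcloud session.
        run_command(f"bq mk {self.DATASET}")

    def tearDown(self):
        # Always clean up, even if the DAG run failed.
        run_command(f"bq rm -r -f {self.DATASET}")
```

The other option mentioned, dedicated operators, would replace the BashOperator tasks in the DAG with the BigQuery dataset create/delete operators instead.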
[GitHub] [airflow] mik-laj commented on pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619474944
Can you also update the reference in the /opt/airflow/docs/howto/operator/gcp/gcs.rst file?
```
File path: /opt/airflow/docs/howto/operator/gcp/gcs.rst (41)
37 | Use the
38 | :class:`~airflow.providers.google.cloud.operators.gcs_to_bigquery.GCSToBigQueryOperator`
39 | to execute a BigQuery load job.
40 |
41 | .. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_gcs_to_bq.py
42 | :language: python
43 | :start-after: [START howto_operator_gcs_to_bq]
44 | :end-before: [END howto_operator_gcs_to_bq]
45 |
46 | .. _howto/operator:GCSBucketCreateAclEntryOperator:
==================================================
```
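Following the file rename described in the PR (the example file now matches the operator file name), the `exampleinclude` path would need to change accordingly. A hedged sketch of the updated directive, assuming the START/END marker names stay as they are:

```rst
.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
    :language: python
    :start-after: [START howto_operator_gcs_to_bq]
    :end-before: [END howto_operator_gcs_to_bq]
```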
[GitHub] [airflow] mik-laj commented on pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-620609408
Example DAG works
<details>
```
root@d8cf57dc3068:/opt/airflow# pytest tests/providers/google/cloud/operators/test_gcs_to_bigquery.py --system google -s
=========================================================================================================================================================================== test session starts ============================================================================================================================================================================
platform linux -- Python 3.6.10, pytest-5.4.1, py-1.8.1, pluggy-0.13.1 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /opt/airflow, inifile: pytest.ini
plugins: flaky-3.6.1, rerunfailures-9.0, forked-1.1.3, instafail-0.4.1.post0, requests-mock-1.7.0, xdist-1.31.0, timeout-1.3.4, celery-4.4.2, cov-2.8.1
collected 3 items
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator ========================= AIRFLOW ==========================
Home of the user: /root
Airflow home /root/airflow
Skipping initializing of the DB as it was initialized already.
You can re-initialize the database by adding --with-db-init flag when running tests.
Removing all log files except previous_runs
[2020-04-28 13:29:00,246] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_bigquery.json'
[2020-04-28 13:29:01,254] {logging_command_executor.py:40} INFO - Stdout:
[2020-04-28 13:29:01,256] {logging_command_executor.py:41} INFO - Stderr: Activated service account credentials for: [gcp-bigquery-account@polidea-airflow.iam.gserviceaccount.com]
[2020-04-28 13:29:01,257] {system_tests_class.py:137} INFO - Looking for DAG: example_gcs_to_bigquery_operator in /opt/airflow/airflow/providers/google/cloud/example_dags
[2020-04-28 13:29:01,257] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags
[2020-04-28 13:29:03,882] {system_tests_class.py:151} INFO - Attempting to run DAG: example_gcs_to_bigquery_operator
[2020-04-28 13:29:04,565] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.create_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>
[2020-04-28 13:29:04,582] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpye7zceo1']
[2020-04-28 13:29:04,602] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:04,628] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:05,035] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpye7zceo1']
[2020-04-28 13:29:05,056] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:05,088] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:05,118] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:06,042] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:06,060] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:06,093] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:07,047] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:07,068] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:07,100] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:07,717] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
[2020-04-28 13:29:08,071] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:08,142] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:08,208] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
Running <TaskInstance: example_gcs_to_bigquery_operator.create_airflow_test_dataset 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
[2020-04-28 13:29:09,063] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:09,085] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:09,112] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:10,069] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:10,094] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:10,121] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:11,078] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 1 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:11,103] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>
[2020-04-28 13:29:11,114] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpuzkczu3l']
[2020-04-28 13:29:11,137] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:12,035] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpuzkczu3l']
[2020-04-28 13:29:12,037] {backfill_job.py:262} WARNING - ('example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', datetime.datetime(2020, 4, 26, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 3) state success not in running=dict_values([<TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [queued]>])
[2020-04-28 13:29:12,053] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:12,074] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:13,047] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:14,751] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
Running <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
[2020-04-28 13:29:23,107] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 2 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:23,133] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>
[2020-04-28 13:29:23,140] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'delete_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpknrawrfh']
[2020-04-28 13:29:24,101] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'delete_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpknrawrfh']
[2020-04-28 13:29:24,117] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:25,103] {backfill_job.py:262} WARNING - ('example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', datetime.datetime(2020, 4, 26, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 2) state success not in running=dict_values([<TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [queued]>])
[2020-04-28 13:29:25,134] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:26,116] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:26,484] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
[2020-04-28 13:29:27,117] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
Running <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
[2020-04-28 13:29:28,124] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:29,132] {dagrun.py:336} INFO - Marking run <DagRun example_gcs_to_bigquery_operator @ 2020-04-26 00:00:00+00:00: backfill__2020-04-26T00:00:00+00:00, externally triggered: False> successful
[2020-04-28 13:29:29,139] {backfill_job.py:379} INFO - [backfill progress] | finished run 1 of 1 | tasks waiting: 0 | succeeded: 3 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:29,348] {backfill_job.py:830} INFO - Backfill done. Exiting.
Saving all log files to /root/airflow/logs/previous_runs/2020-04-28_13_29_29
PASSED
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryOperator::test_execute_explicit_project SKIPPED
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryOperator::test_execute_explicit_project_legacy SKIPPED
============================================================================================================================================================================= warnings summary =============================================================================================================================================================================
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
/opt/airflow/airflow/providers/google/cloud/example_dags/example_mlengine.py:82: DeprecationWarning: This operator is deprecated. Consider using operators for specific operations: MLEngineCreateModelOperator, MLEngineGetModelOperator.
"name": MODEL_NAME,
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
/opt/airflow/airflow/providers/google/cloud/example_dags/example_mlengine.py:91: DeprecationWarning: This operator is deprecated. Consider using operators for specific operations: MLEngineCreateModelOperator, MLEngineGetModelOperator.
"name": MODEL_NAME,
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
/usr/local/lib/python3.6/site-packages/future/standard_library/__init__.py:65: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
/opt/airflow/airflow/providers/google/cloud/example_dags/example_datacatalog.py:26: DeprecationWarning: This module is deprecated. Please use `airflow.operators.bash`.
from airflow.operators.bash_operator import BashOperator
-- Docs: https://docs.pytest.org/en/latest/warnings.html
========================================================================================================================================================================= short test summary info ==========================================================================================================================================================================
SKIPPED [1] /opt/airflow/tests/conftest.py:238: The test is skipped because it does not have the right system marker. Only tests marked with pytest.mark.system(SYSTEM) are run with SYSTEM being one of ['google']. <TestCaseFunction test_execute_explicit_project>
SKIPPED [1] /opt/airflow/tests/conftest.py:238: The test is skipped because it does not have the right system marker. Only tests marked with pytest.mark.system(SYSTEM) are run with SYSTEM being one of ['google']. <TestCaseFunction test_execute_explicit_project_legacy>
================================================================================================================================================================ 1 passed, 2 skipped, 4 warnings in 30.77s =================================================================================================================================================================
```
</details>
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415677785
##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -8,7 +8,7 @@
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
-#
+
Review comment:
@joppevos -> this is the problem you have with the licence
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416606511
##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -29,6 +32,15 @@
TEST_SOURCE_OBJECTS = ['test/objects/*']
+@pytest.mark.backend("mysql", "postgres")
Review comment:
System tests should be in the test_gcs_to_bigquery_system.py file.
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415203015
##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -17,10 +17,12 @@
# under the License.
import unittest
Review comment:
These imports should be in the following order, because otherwise isort is sad:
```
import unittest
import mock
import pytest
```
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415202828
##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -17,10 +17,12 @@
# under the License.
import unittest
-
+import pytest
import mock
from airflow.providers.google.cloud.operators.gcs_to_bigquery import GCSToBigQueryOperator
+from tests.test_utils.gcp_system_helpers import CLOUD_DAG_FOLDER, GoogleSystemTest, provide_gcp_context
+from tests.providers.google.cloud.utils.gcp_authenticator import GCP_GCS_KEY
Review comment:
```suggestion
from airflow.providers.google.cloud.operators.gcs_to_bigquery import GCSToBigQueryOperator
from tests.providers.google.cloud.utils.gcp_authenticator import GCP_GCS_KEY
from tests.test_utils.gcp_system_helpers import CLOUD_DAG_FOLDER, GoogleSystemTest, provide_gcp_context
```
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416580045
##########
File path: docs/howto/operator/gcp/gcs.rst
##########
@@ -38,10 +38,10 @@ Use the
:class:`~airflow.providers.google.cloud.operators.gcs_to_bigquery.GCSToBigQueryOperator`
Review comment:
This guide should be in a separate file, but that is another problem. Each module (*.py) with operators should have a separate guide, a separate unit test file, a separate system test, and at least one example DAG.
[GitHub] [airflow] joppevos commented on pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619866125
@mik-laj Made the requested adjustments. Not sure why CI fails and says that the license has been adjusted.
[GitHub] [airflow] potiuk commented on pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619924806
Hey @joppevos -> I really recommend installing the pre-commit framework. You would have seen all those errors automatically at commit time (it will not let you commit anything that fails the checks), long before you pushed. I heartily recommend it :)
[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415092152
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -29,7 +29,7 @@
}
dag = models.DAG(
- dag_id='example_gcs_to_bq_operator', default_args=args,
+ dag_id='example_gcs_to_bigquery_operator', default_args=args,
schedule_interval=None, tags=['example'])
create_test_dataset = BashOperator(
Review comment:
Using the operators seems like a better and cleaner solution indeed. Will rework it
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415114170
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
delete_test_dataset = BashOperator(
task_id='delete_airflow_test_dataset',
- bash_command='bq rm -rf airflow_test',
+ bash_command='bq rm -r -f airflow_test',
Review comment:
We generate environment variables for each test run.
To set them, we have a script similar to the following:
```bash
if [[ ! -f "${RANDOM_FILE}" ]]; then
    echo "${RANDOM}" > "${RANDOM_FILE}"
fi
RANDOM_POSTFIX=$(cat "${RANDOM_FILE}")
AIRFLOW_BREEZE_SHORT_SHA="${AIRFLOW_BREEZE_SHORT_SHA:="build"}"
AIRFLOW_BREEZE_TEST_SUITE="${AIRFLOW_BREEZE_TEST_SUITE:="test"}"
AIRFLOW_BREEZE_UNIQUE_SUFFIX=${AIRFLOW_BREEZE_TEST_SUITE}-${AIRFLOW_BREEZE_SHORT_SHA}-${RANDOM_POSTFIX}
GCP_FIRESTORE_DATASET_NAME=test_firestore_to_bigquery_${RANDOM_POSTFIX}
```
This script generates unique resource names for each CI launch.
During development, to make sure everything works, I often run tests using the following command.
```
GCP_GCS_BUCKET=airflow-life-science-$RANDOM pytest tests/providers/google/cloud/operators/test_life_sciences_system.py --system google -s
```
That way I can be sure that everything works and I don't have side effects from another run.
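To make the pattern concrete, here is a self-contained sketch of how a generated name flows through an example DAG. The variable name `GCP_DATASET_NAME` follows joppevos's snippet later in the thread, and the suffixed value is hypothetical, mirroring the CI script above:

```python
import os

# Hypothetical value: in CI, scripts like the one above export a
# uniquely suffixed name before pytest starts.
os.environ["GCP_DATASET_NAME"] = "airflow_test-build-31415"

# The example DAG reads the name with a local fallback, so the same
# file works both on a developer machine and in CI.
DATASET = os.environ.get("GCP_DATASET_NAME", "airflow_test")

# The name is then interpolated into the cleanup command instead of
# being hardcoded as 'bq rm -r -f airflow_test'.
delete_cmd = f"bq rm -r -f {DATASET}"
print(delete_cmd)
```

Because only the environment variable changes between runs, two CI launches never touch the same dataset.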
[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416611120
##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -29,6 +32,15 @@
TEST_SOURCE_OBJECTS = ['test/objects/*']
+@pytest.mark.backend("mysql", "postgres")
+@pytest.mark.credential_file(GCP_GCS_KEY)
+class TestGoogleCloudStorageToBigQueryExample(GoogleSystemTest):
+
+ @provide_gcp_context(GCP_GCS_KEY)
Review comment:
```suggestion
@provide_gcp_context(GCP_BIGQUERY_KEY)
```
An exact description of the keys is not publicly available, but GCP_GCS_KEY only allows access to GCS. GCP_BIGQUERY_KEY allows access to BigQuery and GCS.
[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415112739
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
delete_test_dataset = BashOperator(
task_id='delete_airflow_test_dataset',
- bash_command='bq rm -rf airflow_test',
+ bash_command='bq rm -r -f airflow_test',
Review comment:
Probably a newbie question, but how do environment variables isolate the test, compared to the hardcoded command that is in there right now? Thanks in advance @mik-laj, will adjust it tomorrow.
[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415115715
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
delete_test_dataset = BashOperator(
task_id='delete_airflow_test_dataset',
- bash_command='bq rm -rf airflow_test',
+ bash_command='bq rm -r -f airflow_test',
Review comment:
Interesting. So if I create the environment variables in the file like this.
`DATASET = os.environ.get("GCP_DATASET_NAME", 'airflow_test')`
They will be picked up and adjusted to a unique name by the script when running CI?
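That lookup can be checked without any GCP access: the DAG only ever reads the variable, so whichever value the CI scripts export wins, and the literal default applies locally. A minimal sketch (the suffixed value is hypothetical):

```python
import os

def dataset_name():
    # Same lookup as in the DAG snippet above: an exported value wins,
    # the hardcoded default applies otherwise.
    return os.environ.get("GCP_DATASET_NAME", "airflow_test")

# Local developer run: variable unset, the default is used.
os.environ.pop("GCP_DATASET_NAME", None)
assert dataset_name() == "airflow_test"

# CI run: a uniquely suffixed name is exported before pytest starts,
# and the DAG picks it up unchanged.
os.environ["GCP_DATASET_NAME"] = "airflow_test-test-build-12345"
assert dataset_name() == "airflow_test-test-build-12345"
print(dataset_name())
```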
[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415117829
##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
delete_test_dataset = BashOperator(
task_id='delete_airflow_test_dataset',
- bash_command='bq rm -rf airflow_test',
+ bash_command='bq rm -r -f airflow_test',
Review comment:
thanks. appreciate the clear explanation
[GitHub] [airflow] joppevos commented on pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
joppevos commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-622317738
@mik-laj gentle poke, ready to be re-reviewed :) Jarek assured me that the failing quarantine tests are nothing to worry about.
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-623212977
Awesome work, congrats on your first merged pull request!
[GitHub] [airflow] joppevos commented on a change in pull request #8556: Add system test for gcs_to_bigquery
Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416665190
##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -29,6 +32,15 @@
TEST_SOURCE_OBJECTS = ['test/objects/*']
+@pytest.mark.backend("mysql", "postgres")
+@pytest.mark.credential_file(GCP_GCS_KEY)
+class TestGoogleCloudStorageToBigQueryExample(GoogleSystemTest):
+
+ @provide_gcp_context(GCP_GCS_KEY)
Review comment:
I was not aware of that. I was developing with a "project admin" service account. Thanks!
[GitHub] [airflow] potiuk commented on pull request #8556: Missing example dags/system tests for google services
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619871980
@joppevos
```
# TODO: This license is not consistent with license used in the project.
# Delete the inconsistent license and above line and rerun pre-commit to insert a good license.
```
You had a problem when copy&pasting the licence. All our licence headers have to be exactly the same. Delete the licence from this file (tests/providers/google/cloud/operators/test_gcs_to_bigquery.py) and run pre-commit as described in https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#pre-commit-hooks