Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/25 12:28:29 UTC

[GitHub] [airflow] joppevos opened a new pull request #8556: Missing example dags/system tests for google services

joppevos opened a new pull request #8556:
URL: https://github.com/apache/airflow/pull/8556


   Partly fixes the following [issue](https://github.com/apache/airflow/issues/8280#issue-599125821):
   
   - Renamed the example file to match the operator file.
   - Wrote a system test for `example_gcs_to_bigquery.py`.
   - Made a small syntax correction in the example file so it works properly.
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services

joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415115715



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
 
 delete_test_dataset = BashOperator(
     task_id='delete_airflow_test_dataset',
-    bash_command='bq rm -rf airflow_test',
+    bash_command='bq rm -r -f airflow_test',

Review comment:
       Interesting. So if I create the environment variables in the file like this:
   `DATASET = os.environ.get("GCP_DATASET_NAME", 'airflow_test')`
   they will be picked up and adjusted to a unique name by the script?
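
The idea above can be sketched as follows; reading the name from the environment keeps `airflow_test` as a local fallback while letting a harness inject a unique value per run. (`unique_dataset_name` is a hypothetical helper for illustration, not part of the Airflow test harness.)

```python
import os
import uuid

# The dataset name is read from the environment so a system-test harness
# can inject a unique value per run; "airflow_test" is only a local fallback.
DATASET = os.environ.get("GCP_DATASET_NAME", "airflow_test")


def unique_dataset_name(base: str = "airflow_test") -> str:
    """Hypothetical helper: derive a per-run dataset name from a base name."""
    return f"{base}_{uuid.uuid4().hex[:8]}"
```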







[GitHub] [airflow] joppevos commented on pull request #8556: Missing example dags/system tests for google services

joppevos commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619926960


   @potiuk Thanks. I had never heard of or used pre-commit, but I will definitely get started with it. I am new to the whole CI workflow but always happy to learn. I already felt that the way I did it was probably not the right approach :sweat_smile:





[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services

mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415116924



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
 
 delete_test_dataset = BashOperator(
     task_id='delete_airflow_test_dataset',
-    bash_command='bq rm -rf airflow_test',
+    bash_command='bq rm -r -f airflow_test',

Review comment:
       Yes. Exactly. Each test run has its own runtime environment defined by... environment variables. This is not yet complete in the community, but it will work like that. At [Polidea](polidea.com), we launch system tests on CI in this way.







[GitHub] [airflow] potiuk commented on pull request #8556: Missing example dags/system tests for google services

potiuk commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619873329


   Once you delete the "wrong" licence and re-run pre-commit, it will add the licences in the right way where they are missing. In this case, `pre-commit run insert-license --all-files` should do the job for you. Then you can `git add`/`git commit --amend` and re-push it (ideally rebase first).
   
   Then you will be able 
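
The full sequence described above might look roughly like this (a sketch only; the remote and branch names are assumptions, and `--force-with-lease` is one safe way to re-push an amended commit):

```shell
# Remove the incorrectly formatted licence headers, then let pre-commit
# re-insert them in the expected form.
pre-commit run insert-license --all-files

# Fold the fix into the existing commit.
git add -A
git commit --amend --no-edit

# Rebase first (ideally), then update the PR branch.
git fetch upstream && git rebase upstream/master
git push --force-with-lease origin my-feature-branch
```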





[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services

joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415691161



##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -8,7 +8,7 @@
 # with the License.  You may obtain a copy of the License at
 #
 #   http://www.apache.org/licenses/LICENSE-2.0
-#
+

Review comment:
       @potiuk My thanks. I should have checked the file differences before and after.







[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8556: Missing example dags/system tests for google services

boring-cyborg[bot] commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619372129


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
   - Consider using the [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally; it's a heavy Docker image, but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or the final approval from Committers.
   - Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts, contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://apache-airflow-slack.herokuapp.com/
   





[GitHub] [airflow] joppevos edited a comment on pull request #8556: Add system test for gcs_to_bigquery

joppevos edited a comment on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-622317738


   @mik-laj gentle poke, ready to be re-reviewed :) Jarek assured me that the failing quarantined tests are nothing to worry about.





[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Add system test for gcs_to_bigquery

mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416608495



##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -29,6 +32,15 @@
 TEST_SOURCE_OBJECTS = ['test/objects/*']
 
 
+@pytest.mark.backend("mysql", "postgres")

Review comment:
       In the future, we will want to move these to a separate directory, because they do not test the `airflow.providers.google.cloud.operators.gcs_to_bigquery` module, but the example DAG.







[GitHub] [airflow] mik-laj commented on pull request #8556: Add system test for gcs_to_bigquery

mik-laj commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-620586208


   I am running the system tests, and when everything works I will accept the change.





[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services

joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415092152



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -29,7 +29,7 @@
 }
 
 dag = models.DAG(
-    dag_id='example_gcs_to_bq_operator', default_args=args,
+    dag_id='example_gcs_to_bigquery_operator', default_args=args,
     schedule_interval=None, tags=['example'])
 
 create_test_dataset = BashOperator(

Review comment:
       Does Breeze log you in to the Google Cloud CLI? All I did was provide the credentials. Using the operators seems like a better and cleaner solution indeed. I will rework it.







[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services

mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415077858



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
 
 delete_test_dataset = BashOperator(
     task_id='delete_airflow_test_dataset',
-    bash_command='bq rm -rf airflow_test',
+    bash_command='bq rm -r -f airflow_test',

Review comment:
       There is a test-isolation problem here: when two tests run simultaneously, they will conflict. Can you move the dataset name to environment variables? We want to generate environment variables to ensure isolation between tests.







[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services

mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415077499



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -29,7 +29,7 @@
 }
 
 dag = models.DAG(
-    dag_id='example_gcs_to_bq_operator', default_args=args,
+    dag_id='example_gcs_to_bigquery_operator', default_args=args,
     schedule_interval=None, tags=['example'])
 
 create_test_dataset = BashOperator(

Review comment:
       Unfortunately, this command will fail if you are not logged in to gcloud. I'm working to fix it.
   https://github.com/apache/airflow/pull/8432
   Can you run this command in setUp/tearDown or use dedicated operators? 
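
As a sketch of the "dedicated operators" suggestion: the `bq mk`/`bq rm` BashOperator tasks could be replaced with the BigQuery dataset operators from the Google provider. This is illustrative only, not the PR's actual code; it assumes the operator and parameter names from the `airflow.providers.google.cloud.operators.bigquery` module.

```python
import os

from airflow import models
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryCreateEmptyDatasetOperator,
    BigQueryDeleteDatasetOperator,
)
from airflow.utils.dates import days_ago

# Dataset name comes from the environment to keep test runs isolated.
DATASET = os.environ.get("GCP_DATASET_NAME", "airflow_test")

with models.DAG(
    dag_id="example_gcs_to_bigquery_operator",
    start_date=days_ago(1),
    schedule_interval=None,
    tags=["example"],
) as dag:
    create_test_dataset = BigQueryCreateEmptyDatasetOperator(
        task_id="create_airflow_test_dataset",
        dataset_id=DATASET,
    )
    delete_test_dataset = BigQueryDeleteDatasetOperator(
        task_id="delete_airflow_test_dataset",
        dataset_id=DATASET,
        delete_contents=True,  # mirrors `bq rm -r -f`
    )

    # The GCSToBigQueryOperator load task would sit between these two.
    create_test_dataset >> delete_test_dataset
```

Unlike the `bq` CLI calls, these operators authenticate through the Airflow GCP connection, so they do not depend on `gcloud` being logged in inside the container.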







[GitHub] [airflow] mik-laj commented on pull request #8556: Missing example dags/system tests for google services

mik-laj commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619474944


   Can you also update the reference in the /opt/airflow/docs/howto/operator/gcp/gcs.rst file?
   ```
   File path: /opt/airflow/docs/howto/operator/gcp/gcs.rst (41)
   
     37 | Use the
     38 | :class:`~airflow.providers.google.cloud.operators.gcs_to_bigquery.GCSToBigQueryOperator`
     39 | to execute a BigQuery load job.
     40 | 
     41 | .. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_gcs_to_bq.py
     42 |     :language: python
     43 |     :start-after: [START howto_operator_gcs_to_bq]
     44 |     :end-before: [END howto_operator_gcs_to_bq]
     45 | 
     46 | .. _howto/operator:GCSBucketCreateAclEntryOperator:
   ==================================================
   ```
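
   Since the example file was renamed to `example_gcs_to_bigquery.py`, the updated directive would presumably just point at the new path (assuming the `[START]`/`[END]` tags keep their existing `gcs_to_bq` names):
   ```
   .. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
       :language: python
       :start-after: [START howto_operator_gcs_to_bq]
       :end-before: [END howto_operator_gcs_to_bq]
   ```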





[GitHub] [airflow] mik-laj commented on pull request #8556: Add system test for gcs_to_bigquery

mik-laj commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-620609408


   Example DAG works
   <details>
   
   ```
   root@d8cf57dc3068:/opt/airflow# pytest tests/providers/google/cloud/operators/test_gcs_to_bigquery.py  --system google -s
   =========================================================================================================================================================================== test session starts ============================================================================================================================================================================
   platform linux -- Python 3.6.10, pytest-5.4.1, py-1.8.1, pluggy-0.13.1 -- /usr/local/bin/python
   cachedir: .pytest_cache
   rootdir: /opt/airflow, inifile: pytest.ini
   plugins: flaky-3.6.1, rerunfailures-9.0, forked-1.1.3, instafail-0.4.1.post0, requests-mock-1.7.0, xdist-1.31.0, timeout-1.3.4, celery-4.4.2, cov-2.8.1
   collected 3 items
   
   tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator ========================= AIRFLOW ==========================
   Home of the user: /root
   Airflow home /root/airflow
   Skipping initializing of the DB as it was initialized already.
   You can re-initialize the database by adding --with-db-init flag when running tests.
   
   Removing all log files except previous_runs
   
   [2020-04-28 13:29:00,246] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_bigquery.json'
   [2020-04-28 13:29:01,254] {logging_command_executor.py:40} INFO - Stdout:
   [2020-04-28 13:29:01,256] {logging_command_executor.py:41} INFO - Stderr: Activated service account credentials for: [gcp-bigquery-account@polidea-airflow.iam.gserviceaccount.com]
   
   [2020-04-28 13:29:01,257] {system_tests_class.py:137} INFO - Looking for DAG: example_gcs_to_bigquery_operator in /opt/airflow/airflow/providers/google/cloud/example_dags
   [2020-04-28 13:29:01,257] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags
   [2020-04-28 13:29:03,882] {system_tests_class.py:151} INFO - Attempting to run DAG: example_gcs_to_bigquery_operator
   [2020-04-28 13:29:04,565] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.create_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>
   [2020-04-28 13:29:04,582] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpye7zceo1']
   [2020-04-28 13:29:04,602] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
   [2020-04-28 13:29:04,628] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:05,035] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpye7zceo1']
   [2020-04-28 13:29:05,056] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
   [2020-04-28 13:29:05,088] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
   [2020-04-28 13:29:05,118] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:06,042] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
   [2020-04-28 13:29:06,060] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
   [2020-04-28 13:29:06,093] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:07,047] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
   [2020-04-28 13:29:07,068] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
   [2020-04-28 13:29:07,100] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:07,717] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
   [2020-04-28 13:29:08,071] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
   [2020-04-28 13:29:08,142] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
   [2020-04-28 13:29:08,208] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   Running <TaskInstance: example_gcs_to_bigquery_operator.create_airflow_test_dataset 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
   [2020-04-28 13:29:09,063] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
   [2020-04-28 13:29:09,085] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
   [2020-04-28 13:29:09,112] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:10,069] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
   [2020-04-28 13:29:10,094] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
   [2020-04-28 13:29:10,121] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:11,078] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 1 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
   [2020-04-28 13:29:11,103] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>
   [2020-04-28 13:29:11,114] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpuzkczu3l']
   [2020-04-28 13:29:11,137] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:12,035] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpuzkczu3l']
   [2020-04-28 13:29:12,037] {backfill_job.py:262} WARNING - ('example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', datetime.datetime(2020, 4, 26, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 3) state success not in running=dict_values([<TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [queued]>])
   [2020-04-28 13:29:12,053] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:12,074] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:13,047] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:13,070] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:14,064] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:14,114] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:14,751] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
   [2020-04-28 13:29:15,064] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:15,108] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   Running <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
   [2020-04-28 13:29:16,062] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:16,077] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:17,064] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:17,081] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:18,068] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:18,083] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:19,075] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:19,093] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:20,077] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:20,092] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:21,090] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:21,107] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:22,096] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:22,112] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
   [2020-04-28 13:29:23,107] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 2 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
   [2020-04-28 13:29:23,133] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>
   [2020-04-28 13:29:23,140] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'delete_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpknrawrfh']
   [2020-04-28 13:29:24,101] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'delete_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpknrawrfh']
   [2020-04-28 13:29:24,117] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
   [2020-04-28 13:29:25,103] {backfill_job.py:262} WARNING - ('example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', datetime.datetime(2020, 4, 26, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 2) state success not in running=dict_values([<TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [queued]>])
   [2020-04-28 13:29:25,134] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
   [2020-04-28 13:29:26,116] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
   [2020-04-28 13:29:26,484] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
   [2020-04-28 13:29:27,117] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
   Running <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
   [2020-04-28 13:29:28,124] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
   [2020-04-28 13:29:29,132] {dagrun.py:336} INFO - Marking run <DagRun example_gcs_to_bigquery_operator @ 2020-04-26 00:00:00+00:00: backfill__2020-04-26T00:00:00+00:00, externally triggered: False> successful
   [2020-04-28 13:29:29,139] {backfill_job.py:379} INFO - [backfill progress] | finished run 1 of 1 | tasks waiting: 0 | succeeded: 3 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
   [2020-04-28 13:29:29,348] {backfill_job.py:830} INFO - Backfill done. Exiting.
   
   Saving all log files to /root/airflow/logs/previous_runs/2020-04-28_13_29_29
   
   PASSED
   tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryOperator::test_execute_explicit_project SKIPPED
   tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryOperator::test_execute_explicit_project_legacy SKIPPED
   
   ============================================================================================================================================================================= warnings summary =============================================================================================================================================================================
   tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
     /opt/airflow/airflow/providers/google/cloud/example_dags/example_mlengine.py:82: DeprecationWarning: This operator is deprecated. Consider using operators for specific operations: MLEngineCreateModelOperator, MLEngineGetModelOperator.
       "name": MODEL_NAME,
   
   tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
     /opt/airflow/airflow/providers/google/cloud/example_dags/example_mlengine.py:91: DeprecationWarning: This operator is deprecated. Consider using operators for specific operations: MLEngineCreateModelOperator, MLEngineGetModelOperator.
       "name": MODEL_NAME,
   
   tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
     /usr/local/lib/python3.6/site-packages/future/standard_library/__init__.py:65: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
       import imp
   
   tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
     /opt/airflow/airflow/providers/google/cloud/example_dags/example_datacatalog.py:26: DeprecationWarning: This module is deprecated. Please use `airflow.operators.bash`.
       from airflow.operators.bash_operator import BashOperator
   
   -- Docs: https://docs.pytest.org/en/latest/warnings.html
   ========================================================================================================================================================================= short test summary info ==========================================================================================================================================================================
   SKIPPED [1] /opt/airflow/tests/conftest.py:238: The test is skipped because it does not have the right system marker. Only tests marked with pytest.mark.system(SYSTEM) are run with SYSTEM being one of ['google']. <TestCaseFunction test_execute_explicit_project>
   SKIPPED [1] /opt/airflow/tests/conftest.py:238: The test is skipped because it does not have the right system marker. Only tests marked with pytest.mark.system(SYSTEM) are run with SYSTEM being one of ['google']. <TestCaseFunction test_execute_explicit_project_legacy>
   ================================================================================================================================================================ 1 passed, 2 skipped, 4 warnings in 30.77s =================================================================================================================================================================
   ```
   
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415677785



##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -8,7 +8,7 @@
 # with the License.  You may obtain a copy of the License at
 #
 #   http://www.apache.org/licenses/LICENSE-2.0
-#
+

Review comment:
       @joppevos -> this is the problem you have with the licence




----------------------------------------------------------------



[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Add system test for gcs_to_bigquery

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416606511



##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -29,6 +32,15 @@
 TEST_SOURCE_OBJECTS = ['test/objects/*']
 
 
+@pytest.mark.backend("mysql", "postgres")

Review comment:
       System tests should be in the test_gcs_to_bigquery_system.py file.




----------------------------------------------------------------



[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415203015



##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -17,10 +17,12 @@
 # under the License.
 
 import unittest

Review comment:
       These imports should be in the following order, because otherwise isort is sad:
   ```
   import unittest
   
   import mock
   import pytest
   ```
   




----------------------------------------------------------------



[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415202828



##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -17,10 +17,12 @@
 # under the License.
 
 import unittest
-
+import pytest
 import mock
 
 from airflow.providers.google.cloud.operators.gcs_to_bigquery import GCSToBigQueryOperator
+from tests.test_utils.gcp_system_helpers import CLOUD_DAG_FOLDER, GoogleSystemTest, provide_gcp_context
+from tests.providers.google.cloud.utils.gcp_authenticator import GCP_GCS_KEY

Review comment:
       ```suggestion
   from airflow.providers.google.cloud.operators.gcs_to_bigquery import GCSToBigQueryOperator
   from tests.providers.google.cloud.utils.gcp_authenticator import GCP_GCS_KEY
   from tests.test_utils.gcp_system_helpers import CLOUD_DAG_FOLDER, GoogleSystemTest, provide_gcp_context
   ```




----------------------------------------------------------------



[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Add system test for gcs_to_bigquery

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416580045



##########
File path: docs/howto/operator/gcp/gcs.rst
##########
@@ -38,10 +38,10 @@ Use the
 :class:`~airflow.providers.google.cloud.operators.gcs_to_bigquery.GCSToBigQueryOperator`

Review comment:
       This guide should be in a separate file, but that is another problem. Each module (*.py) with operators should have a separate guide, a separate unit test file, a separate system test, and at least one example DAG.




----------------------------------------------------------------



[GitHub] [airflow] joppevos commented on pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
joppevos commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619866125


   @mik-laj Made the requested adjustments. Not sure why CI fails and says that the license has been adjusted.


----------------------------------------------------------------



[GitHub] [airflow] potiuk commented on pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619924806


   Hey @joppevos -> I really recommend installing the pre-commit framework. You could have seen all those errors automatically during the commit (it will not let you commit anything that fails the checks) long before you push. I heartily recommend it :)


----------------------------------------------------------------



[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415092152



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -29,7 +29,7 @@
 }
 
 dag = models.DAG(
-    dag_id='example_gcs_to_bq_operator', default_args=args,
+    dag_id='example_gcs_to_bigquery_operator', default_args=args,
     schedule_interval=None, tags=['example'])
 
 create_test_dataset = BashOperator(

Review comment:
       Using the operators seems like a better and cleaner solution indeed. Will rework it 




----------------------------------------------------------------



[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415114170



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
 
 delete_test_dataset = BashOperator(
     task_id='delete_airflow_test_dataset',
-    bash_command='bq rm -rf airflow_test',
+    bash_command='bq rm -r -f airflow_test',

Review comment:
       We generate environment variables for each test run.
   
   To set the environment variables, we have a script similar to the following.
   ```bash
   if [[ ! -f "${RANDOM_FILE}" ]]; then
       echo "${RANDOM}" > "${RANDOM_FILE}"
   fi
   
   RANDOM_POSTFIX=$(cat "${RANDOM_FILE}")
   AIRFLOW_BREEZE_SHORT_SHA="${AIRFLOW_BREEZE_SHORT_SHA:="build"}"
   AIRFLOW_BREEZE_TEST_SUITE="${AIRFLOW_BREEZE_TEST_SUITE:="test"}"
   AIRFLOW_BREEZE_UNIQUE_SUFFIX=${AIRFLOW_BREEZE_TEST_SUITE}-${AIRFLOW_BREEZE_SHORT_SHA}-${RANDOM_POSTFIX}
   GCP_FIRESTORE_DATASET_NAME=test_firestore_to_bigquery_${RANDOM_POSTFIX}
   ```
   This script generates unique resource names for each CI launch.
   
   During development, to make sure everything works, I often run tests using the following command.
   ```
   GCP_GCS_BUCKET=airflow-life-science-$RANDOM pytest tests/providers/google/cloud/operators/test_life_sciences_system.py --system google -s
   ```
   That way I can be sure that everything works and I don't have side effects from another run.
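
   The same unique-suffix idea can be sketched in Python inside an example DAG. This is illustrative only: `RANDOM_POSTFIX` mirrors the bash script above, but the resource-name templates here are assumptions, not the actual CI variables.
   ```python
   import os
   import random

   # Reuse a CI-provided suffix when present, otherwise generate one locally,
   # so every run operates on uniquely named resources.
   suffix = os.environ.get("RANDOM_POSTFIX", str(random.randint(0, 32767)))
   DATASET_NAME = f"test_gcs_to_bigquery_{suffix}"
   BUCKET_NAME = f"airflow-system-test-{suffix}"
   ```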
   




----------------------------------------------------------------



[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Add system test for gcs_to_bigquery

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416611120



##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -29,6 +32,15 @@
 TEST_SOURCE_OBJECTS = ['test/objects/*']
 
 
+@pytest.mark.backend("mysql", "postgres")
+@pytest.mark.credential_file(GCP_GCS_KEY)
+class TestGoogleCloudStorageToBigQueryExample(GoogleSystemTest):
+
+    @provide_gcp_context(GCP_GCS_KEY)

Review comment:
       ```suggestion
       @provide_gcp_context(GCP_BIGQUERY_KEY)
   ```
    An exact description of the keys is not publicly available, but GCP_GCS_KEY only allows access to GCS. GCP_BIGQUERY_KEY allows access to BigQuery and GCS.
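   
   For illustration, the marker layout under discussion looks roughly like this. The class name and key value are placeholders, and the Airflow test helpers (`GoogleSystemTest`, `provide_gcp_context`) are omitted so the sketch stays self-contained:
   ```python
   import pytest

   # Placeholder standing in for the real constant from gcp_authenticator.py.
   GCP_BIGQUERY_KEY = "gcp_bigquery.json"


   @pytest.mark.backend("mysql", "postgres")
   @pytest.mark.credential_file(GCP_BIGQUERY_KEY)
   class TestGcsToBigQuerySketch:
       def test_runs(self):
           # A real system test would run the example DAG here via the
           # GoogleSystemTest helpers; kept trivial to stay standalone.
           assert True
   ```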
   




----------------------------------------------------------------



[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415112739



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
 
 delete_test_dataset = BashOperator(
     task_id='delete_airflow_test_dataset',
-    bash_command='bq rm -rf airflow_test',
+    bash_command='bq rm -r -f airflow_test',

Review comment:
       Probably a newbie question, but how do environment variables isolate the test, compared to the hardcoded command that's in there right now? Thanks in advance @mik-laj, will adjust it tomorrow.




----------------------------------------------------------------



[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415115715



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
 
 delete_test_dataset = BashOperator(
     task_id='delete_airflow_test_dataset',
-    bash_command='bq rm -rf airflow_test',
+    bash_command='bq rm -r -f airflow_test',

Review comment:
       Interesting. So if I create the environment variables in the file like this.
   `DATASET = os.environ.get("GCP_DATASET_NAME", 'airflow_test')` 
   They will be picked up and adjusted to a unique name by the script when running CI? 
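
   The pattern in question, as a self-contained sketch (the variable name and default follow the snippet above; the derived command is illustrative):
   ```python
   import os

   # Fall back to a fixed local default, but pick up whatever unique value
   # the CI scripts export for this run.
   DATASET = os.environ.get("GCP_DATASET_NAME", "airflow_test")
   delete_cmd = f"bq rm -r -f {DATASET}"
   ```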




----------------------------------------------------------------



[GitHub] [airflow] mik-laj commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416580045



##########
File path: docs/howto/operator/gcp/gcs.rst
##########
@@ -38,10 +38,10 @@ Use the
 :class:`~airflow.providers.google.cloud.operators.gcs_to_bigquery.GCSToBigQueryOperator`

Review comment:
       This guide should be in a separate file, but that is another problem. Each module (*.py) with operators should have a separate guide, a separate unit test file, and a separate system test file.




----------------------------------------------------------------



[GitHub] [airflow] joppevos commented on a change in pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r415117829



##########
File path: airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
##########
@@ -53,7 +53,7 @@
 
 delete_test_dataset = BashOperator(
     task_id='delete_airflow_test_dataset',
-    bash_command='bq rm -rf airflow_test',
+    bash_command='bq rm -r -f airflow_test',

Review comment:
       thanks. appreciate the clear explanation 




----------------------------------------------------------------



[GitHub] [airflow] joppevos commented on pull request #8556: Add system test for gcs_to_bigquery

Posted by GitBox <gi...@apache.org>.
joppevos commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-622317738


   @mik-laj gentle poke, ready to be re-reviewed :) Jarek assured me that the failing quarantined tests are nothing to worry about.


----------------------------------------------------------------



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8556: Add system test for gcs_to_bigquery

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-623212977


   Awesome work, congrats on your first merged pull request!
   


----------------------------------------------------------------



[GitHub] [airflow] joppevos commented on a change in pull request #8556: Add system test for gcs_to_bigquery

Posted by GitBox <gi...@apache.org>.
joppevos commented on a change in pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#discussion_r416665190



##########
File path: tests/providers/google/cloud/operators/test_gcs_to_bigquery.py
##########
@@ -29,6 +32,15 @@
 TEST_SOURCE_OBJECTS = ['test/objects/*']
 
 
+@pytest.mark.backend("mysql", "postgres")
+@pytest.mark.credential_file(GCP_GCS_KEY)
+class TestGoogleCloudStorageToBigQueryExample(GoogleSystemTest):
+
+    @provide_gcp_context(GCP_GCS_KEY)

Review comment:
       I was not aware of that. I was developing with a "project admin" service account. Thanks!




----------------------------------------------------------------



[GitHub] [airflow] potiuk commented on pull request #8556: Missing example dags/system tests for google services

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8556:
URL: https://github.com/apache/airflow/pull/8556#issuecomment-619871980


   @joppevos 
   ```
   # TODO: This license is not consistent with license used in the project.
   #       Delete the inconsistent license and above line and rerun pre-commit to insert a good license.
   ```
   You had a problem when copy&pasting the licence.  All our licence headers have to be exactly the same. Delete the licence from this file (tests/providers/google/cloud/operators/test_gcs_to_bigquery.py) and run pre-commit as described in https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#pre-commit-hooks
   


----------------------------------------------------------------