Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/20 00:20:23 UTC
[GitHub] [airflow] mik-laj commented on issue #13343: Remote logging broken (Airflow 2.0.0 on GCS)
mik-laj commented on issue #13343:
URL: https://github.com/apache/airflow/issues/13343#issuecomment-763236585
Today I discussed this bug with the Polidea team. @turbaszek said that it is already fixed on the main branch. I can confirm that the problem no longer occurs and that the system tests pass. There are still some problems with the StackdriverTaskHandler, but I already have a patch ready and am testing it.
```
root@f375df8a2fd5:/opt/airflow# pytest tests/providers/google/cloud/log/test_gcs_task_handler_system.py --system google -s
============================================================================================================================= test session starts ==============================================================================================================================
platform linux -- Python 3.6.12, pytest-6.2.1, py-1.10.0, pluggy-0.13.1 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /opt/airflow, configfile: pytest.ini
plugins: forked-1.3.0, timeouts-1.2.1, xdist-2.2.0, cov-2.10.1, instafail-0.4.2, flaky-3.7.0, requests-mock-1.8.0, rerunfailures-9.1.1, celery-4.4.7
setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
collected 1 item
tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs ========================= AIRFLOW ==========================
Home of the user: /root
Airflow home /root/airflow
Skipping initializing of the DB as it was initialized already.
You can re-initialize the database by adding --with-db-init flag when running tests.
[2021-01-20 00:17:42,536] {logging_command_executor.py:33} INFO - Executing: 'gcloud config set core/project polidea-airflow'
[2021-01-20 00:17:44,288] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:17:44,288] {logging_command_executor.py:45} INFO - Stderr: Updated property [core/project].
[2021-01-20 00:17:44,288] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
[2021-01-20 00:17:46,231] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:17:46,231] {logging_command_executor.py:45} INFO - Stderr: Activated service account credentials for: [gcp-storage-account@polidea-airflow.iam.gserviceaccount.com]
[2021-01-20 00:17:46,232] {logging_command_executor.py:33} INFO - Executing: 'gsutil mb gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv'
[2021-01-20 00:17:51,253] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:17:51,254] {logging_command_executor.py:45} INFO - Stderr: Creating gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/...
[2021-01-20 00:17:55,495] {__init__.py:38} INFO - Loaded API auth backend: <module 'airflow.api.auth.backend.default' from '/opt/airflow/airflow/api/auth/backend/default.py'>
[2021-01-20 00:17:58,331] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
Created <DagRun example_complex @ 2021-01-20 00:18:00+00:00: manual__2021-01-20T00:18:00+00:00, externally triggered: True>
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2021-01-20 00:18:04,044] {scheduler_job.py:1239} INFO - Starting the scheduler
[2021-01-20 00:18:04,044] {scheduler_job.py:1244} INFO - Processing each file at most -1 times
[2021-01-20 00:18:04,232] {dag_processing.py:250} INFO - Launched DagFileProcessorManager with pid: 31561
[2021-01-20 00:18:04,235] {scheduler_job.py:1748} INFO - Resetting orphaned tasks for active dag runs
[2021-01-20 00:18:04,290] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
[2021-01-20 00:18:07,141] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
[2021-01-20 00:18:08,796] {scheduler_job.py:936} INFO - 1 tasks up for execution:
<TaskInstance: example_complex.create_entry_group 2021-01-20 00:18:00+00:00 [scheduled]>
[2021-01-20 00:18:08,800] {scheduler_job.py:970} INFO - Figuring out tasks to run in Pool(name=default_pool) with 128 open slots and 1 task instances ready to be queued
[2021-01-20 00:18:08,800] {scheduler_job.py:997} INFO - DAG example_complex has 0/16 running and queued tasks
[2021-01-20 00:18:08,801] {scheduler_job.py:1058} INFO - Setting the following tasks to queued state:
<TaskInstance: example_complex.create_entry_group 2021-01-20 00:18:00+00:00 [scheduled]>
[2021-01-20 00:18:08,809] {scheduler_job.py:1100} INFO - Sending TaskInstanceKey(dag_id='example_complex', task_id='create_entry_group', execution_date=datetime.datetime(2021, 1, 20, 0, 18, tzinfo=Timezone('UTC')), try_number=1) to executor with priority 37 and queue default
[2021-01-20 00:18:08,809] {base_executor.py:79} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_complex', 'create_entry_group', '2021-01-20T00:18:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/example_dags/example_complex.py']
[2021-01-20 00:18:08,816] {local_executor.py:81} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_complex', 'create_entry_group', '2021-01-20T00:18:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/example_dags/example_complex.py']
[2021-01-20 00:18:08,831] {scheduler_job.py:1401} INFO - Exiting scheduler loop as requested number of runs (1 - got to 1) has been reached
[2021-01-20 00:18:08,832] {dag_processing.py:439} INFO - Sending termination message to manager.
[2021-01-20 00:18:08,834] {scheduler_job.py:1282} INFO - Deactivating DAGs that haven't been touched since 2021-01-20T00:18:04.234269+00:00
[2021-01-20 00:18:08,992] {dagbag.py:440} INFO - Filling up the DagBag from /opt/airflow/airflow/example_dags/example_complex.py
[2021-01-20 00:18:11,716] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
Running <TaskInstance: example_complex.create_entry_group 2021-01-20T00:18:00+00:00 [queued]> on host f375df8a2fd5
[2021-01-20 00:18:14,325] {gcs_task_handler.py:182} INFO - Previous log discarded: 404 GET https://storage.googleapis.com/download/storage/v1/b/airflow-gcs-task-handler-tests-hiucqpbantyszrxv/o/path%2Fto%2Flogs%2Fexample_complex%2Fcreate_entry_group%2F2021-01-20T00%3A18%3A00%2B00%3A00%2F1.log?alt=media: No such object: airflow-gcs-task-handler-tests-hiucqpbantyszrxv/path/to/logs/example_complex/create_entry_group/2021-01-20T00:18:00+00:00/1.log: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
[2021-01-20 00:18:14,735] {process_utils.py:95} INFO - Sending Signals.SIGTERM to GPID 31561
[2021-01-20 00:18:14,736] {scheduler_job.py:1293} INFO - Exited execute loop
[2021-01-20 00:18:15,245] {dagbag.py:440} INFO - Filling up the DagBag from /opt/airflow/airflow/example_dags/example_complex.py
[2021-01-20 00:18:15,627] {logging_command_executor.py:33} INFO - Executing: 'gcloud config set core/project polidea-airflow'
[2021-01-20 00:18:17,463] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:17,464] {logging_command_executor.py:45} INFO - Stderr: Updated property [core/project].
[2021-01-20 00:18:17,464] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
[2021-01-20 00:18:19,103] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:19,104] {logging_command_executor.py:45} INFO - Stderr: Activated service account credentials for: [gcp-storage-account@polidea-airflow.iam.gserviceaccount.com]
[2021-01-20 00:18:21,811] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
PASSED[2021-01-20 00:18:24,225] {logging_command_executor.py:33} INFO - Executing: 'gcloud config set core/project polidea-airflow'
[2021-01-20 00:18:25,949] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:25,951] {logging_command_executor.py:45} INFO - Stderr: Updated property [core/project].
[2021-01-20 00:18:25,952] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
[2021-01-20 00:18:27,947] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:27,948] {logging_command_executor.py:45} INFO - Stderr: Activated service account credentials for: [gcp-storage-account@polidea-airflow.iam.gserviceaccount.com]
[2021-01-20 00:18:27,948] {logging_command_executor.py:33} INFO - Executing: 'gsutil -m rm -r gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv'
[2021-01-20 00:18:33,055] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:33,056] {logging_command_executor.py:45} INFO - Stderr: Removing gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/path/to/logs/example_complex/create_entry_group/2021-01-20T00:18:00+00:00/1.log#1611101894698364...
/ [1/1 objects] 100% Done
Operation completed over 1 objects.
Removing gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/...
=============================================================================================================================== warnings summary ===============================================================================================================================
tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs
/usr/local/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)
tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs
/usr/local/lib/python3.6/site-packages/boto/plugin.py:40: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
-- Docs: https://docs.pytest.org/en/stable/warnings.html
======================================================================================================================== 1 passed, 2 warnings in 52.09s ========================================================================================================================
root@f375df8a2fd5:/opt/airflow#
```
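For context, the `Previous log discarded: 404` INFO line above is expected behavior, not an error: on each attempt the GCS task handler tries to download the prior attempt's log before appending, and a `NotFound` (404) simply means there is no previous log yet. A minimal sketch of that read-or-empty pattern (the names here are illustrative, not the actual `GCSTaskHandler` code; in the real `google-cloud-storage` library the exception is `google.api_core.exceptions.NotFound`):

```python
# Illustrative sketch of the "previous log discarded" pattern seen in the
# log above. This is NOT the actual GCSTaskHandler implementation; a local
# NotFound stand-in is used so the snippet is self-contained.


class NotFound(Exception):
    """Stand-in for google.api_core.exceptions.NotFound (HTTP 404)."""


def read_previous_log(blob) -> str:
    """Return the previous attempt's log, or '' if the object is missing."""
    try:
        return blob.download_as_bytes().decode("utf-8")
    except NotFound as exc:
        # Mirrors the INFO line in the log: "Previous log discarded: 404 ..."
        print(f"Previous log discarded: {exc}")
        return ""


class FakeBlob:
    """Test double simulating a missing GCS object (first try_number)."""

    def download_as_bytes(self):
        raise NotFound("404 No such object: .../1.log")


print(repr(read_previous_log(FakeBlob())))  # -> ''
```

With the real client, `blob` would come from `storage.Client().bucket(...).blob(...)`; the handler would then append the current attempt's output and re-upload, which is why the 404 on first write is logged at INFO level and the test still passes.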
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org