Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/20 00:20:23 UTC

[GitHub] [airflow] mik-laj commented on issue #13343: Remote logging broken (Airflow 2.0.0 on GCS)

mik-laj commented on issue #13343:
URL: https://github.com/apache/airflow/issues/13343#issuecomment-763236585


   Today I discussed this bug with the Polidea team. @turbaszek said that this problem has already been fixed on the main branch. I can also confirm that the problem no longer occurs and that the system tests pass. There are still some problems with the StackdriverTaskHandler, but I already have a patch ready and I'm testing it.
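   For context, the remote logging configuration that this system test exercises boils down to a few settings. A minimal sketch, assuming Airflow 2.0's `[logging]` section set via environment variables, with a placeholder bucket name and connection id (the test below creates its own bucket and uses a `path/to/logs` prefix):
   
   ```
   # Hedged sketch: enable GCS remote task logging in Airflow 2.0.
   # The bucket name and connection id are placeholders, not the ones
   # used by the system test run below.
   export AIRFLOW__LOGGING__REMOTE_LOGGING=True
   export AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER=gs://my-log-bucket/path/to/logs
   export AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID=google_cloud_default
   ```
   
   The full system test run: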
   
   ```
   root@f375df8a2fd5:/opt/airflow# pytest tests/providers/google/cloud/log/test_gcs_task_handler_system.py --system google -s
   ============================================================================================================================= test session starts ==============================================================================================================================
   platform linux -- Python 3.6.12, pytest-6.2.1, py-1.10.0, pluggy-0.13.1 -- /usr/local/bin/python
   cachedir: .pytest_cache
   rootdir: /opt/airflow, configfile: pytest.ini
   plugins: forked-1.3.0, timeouts-1.2.1, xdist-2.2.0, cov-2.10.1, instafail-0.4.2, flaky-3.7.0, requests-mock-1.8.0, rerunfailures-9.1.1, celery-4.4.7
   setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
   collected 1 item
   
   tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs ========================= AIRFLOW ==========================
   Home of the user: /root
   Airflow home /root/airflow
   Skipping initializing of the DB as it was initialized already.
   You can re-initialize the database by adding --with-db-init flag when running tests.
   [2021-01-20 00:17:42,536] {logging_command_executor.py:33} INFO - Executing: 'gcloud config set core/project polidea-airflow'
   [2021-01-20 00:17:44,288] {logging_command_executor.py:44} INFO - Stdout:
   [2021-01-20 00:17:44,288] {logging_command_executor.py:45} INFO - Stderr: Updated property [core/project].
   
   [2021-01-20 00:17:44,288] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
   [2021-01-20 00:17:46,231] {logging_command_executor.py:44} INFO - Stdout:
   [2021-01-20 00:17:46,231] {logging_command_executor.py:45} INFO - Stderr: Activated service account credentials for: [gcp-storage-account@polidea-airflow.iam.gserviceaccount.com]
   
   [2021-01-20 00:17:46,232] {logging_command_executor.py:33} INFO - Executing: 'gsutil mb gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv'
   [2021-01-20 00:17:51,253] {logging_command_executor.py:44} INFO - Stdout:
   [2021-01-20 00:17:51,254] {logging_command_executor.py:45} INFO - Stderr: Creating gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/...
   
   [2021-01-20 00:17:55,495] {__init__.py:38} INFO - Loaded API auth backend: <module 'airflow.api.auth.backend.default' from '/opt/airflow/airflow/api/auth/backend/default.py'>
   [2021-01-20 00:17:58,331] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
   Created <DagRun example_complex @ 2021-01-20 00:18:00+00:00: manual__2021-01-20T00:18:00+00:00, externally triggered: True>
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2021-01-20 00:18:04,044] {scheduler_job.py:1239} INFO - Starting the scheduler
   [2021-01-20 00:18:04,044] {scheduler_job.py:1244} INFO - Processing each file at most -1 times
   [2021-01-20 00:18:04,232] {dag_processing.py:250} INFO - Launched DagFileProcessorManager with pid: 31561
   [2021-01-20 00:18:04,235] {scheduler_job.py:1748} INFO - Resetting orphaned tasks for active dag runs
   [2021-01-20 00:18:04,290] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
   [2021-01-20 00:18:07,141] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
   [2021-01-20 00:18:08,796] {scheduler_job.py:936} INFO - 1 tasks up for execution:
   	<TaskInstance: example_complex.create_entry_group 2021-01-20 00:18:00+00:00 [scheduled]>
   [2021-01-20 00:18:08,800] {scheduler_job.py:970} INFO - Figuring out tasks to run in Pool(name=default_pool) with 128 open slots and 1 task instances ready to be queued
   [2021-01-20 00:18:08,800] {scheduler_job.py:997} INFO - DAG example_complex has 0/16 running and queued tasks
   [2021-01-20 00:18:08,801] {scheduler_job.py:1058} INFO - Setting the following tasks to queued state:
   	<TaskInstance: example_complex.create_entry_group 2021-01-20 00:18:00+00:00 [scheduled]>
   [2021-01-20 00:18:08,809] {scheduler_job.py:1100} INFO - Sending TaskInstanceKey(dag_id='example_complex', task_id='create_entry_group', execution_date=datetime.datetime(2021, 1, 20, 0, 18, tzinfo=Timezone('UTC')), try_number=1) to executor with priority 37 and queue default
   [2021-01-20 00:18:08,809] {base_executor.py:79} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_complex', 'create_entry_group', '2021-01-20T00:18:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/example_dags/example_complex.py']
   [2021-01-20 00:18:08,816] {local_executor.py:81} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_complex', 'create_entry_group', '2021-01-20T00:18:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/example_dags/example_complex.py']
   [2021-01-20 00:18:08,831] {scheduler_job.py:1401} INFO - Exiting scheduler loop as requested number of runs (1 - got to 1) has been reached
   [2021-01-20 00:18:08,832] {dag_processing.py:439} INFO - Sending termination message to manager.
   [2021-01-20 00:18:08,834] {scheduler_job.py:1282} INFO - Deactivating DAGs that haven't been touched since 2021-01-20T00:18:04.234269+00:00
   [2021-01-20 00:18:08,992] {dagbag.py:440} INFO - Filling up the DagBag from /opt/airflow/airflow/example_dags/example_complex.py
   [2021-01-20 00:18:11,716] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
   Running <TaskInstance: example_complex.create_entry_group 2021-01-20T00:18:00+00:00 [queued]> on host f375df8a2fd5
   [2021-01-20 00:18:14,325] {gcs_task_handler.py:182} INFO - Previous log discarded: 404 GET https://storage.googleapis.com/download/storage/v1/b/airflow-gcs-task-handler-tests-hiucqpbantyszrxv/o/path%2Fto%2Flogs%2Fexample_complex%2Fcreate_entry_group%2F2021-01-20T00%3A18%3A00%2B00%3A00%2F1.log?alt=media: No such object: airflow-gcs-task-handler-tests-hiucqpbantyszrxv/path/to/logs/example_complex/create_entry_group/2021-01-20T00:18:00+00:00/1.log: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
   [2021-01-20 00:18:14,735] {process_utils.py:95} INFO - Sending Signals.SIGTERM to GPID 31561
   [2021-01-20 00:18:14,736] {scheduler_job.py:1293} INFO - Exited execute loop
   [2021-01-20 00:18:15,245] {dagbag.py:440} INFO - Filling up the DagBag from /opt/airflow/airflow/example_dags/example_complex.py
   [2021-01-20 00:18:15,627] {logging_command_executor.py:33} INFO - Executing: 'gcloud config set core/project polidea-airflow'
   [2021-01-20 00:18:17,463] {logging_command_executor.py:44} INFO - Stdout:
   [2021-01-20 00:18:17,464] {logging_command_executor.py:45} INFO - Stderr: Updated property [core/project].
   
   [2021-01-20 00:18:17,464] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
   [2021-01-20 00:18:19,103] {logging_command_executor.py:44} INFO - Stdout:
   [2021-01-20 00:18:19,104] {logging_command_executor.py:45} INFO - Stderr: Activated service account credentials for: [gcp-storage-account@polidea-airflow.iam.gserviceaccount.com]
   
   [2021-01-20 00:18:21,811] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
   PASSED[2021-01-20 00:18:24,225] {logging_command_executor.py:33} INFO - Executing: 'gcloud config set core/project polidea-airflow'
   [2021-01-20 00:18:25,949] {logging_command_executor.py:44} INFO - Stdout:
   [2021-01-20 00:18:25,951] {logging_command_executor.py:45} INFO - Stderr: Updated property [core/project].
   
   [2021-01-20 00:18:25,952] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
   [2021-01-20 00:18:27,947] {logging_command_executor.py:44} INFO - Stdout:
   [2021-01-20 00:18:27,948] {logging_command_executor.py:45} INFO - Stderr: Activated service account credentials for: [gcp-storage-account@polidea-airflow.iam.gserviceaccount.com]
   
   [2021-01-20 00:18:27,948] {logging_command_executor.py:33} INFO - Executing: 'gsutil -m rm -r gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv'
   [2021-01-20 00:18:33,055] {logging_command_executor.py:44} INFO - Stdout:
   [2021-01-20 00:18:33,056] {logging_command_executor.py:45} INFO - Stderr: Removing gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/path/to/logs/example_complex/create_entry_group/2021-01-20T00:18:00+00:00/1.log#1611101894698364...
   / [1/1 objects] 100% Done
   Operation completed over 1 objects.
   Removing gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/...
   
   
   
   =============================================================================================================================== warnings summary ===============================================================================================================================
   tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs
     /usr/local/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
       return f(*args, **kwds)
   
   tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs
     /usr/local/lib/python3.6/site-packages/boto/plugin.py:40: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
       import imp
   
   -- Docs: https://docs.pytest.org/en/stable/warnings.html
   ======================================================================================================================== 1 passed, 2 warnings in 52.09s ========================================================================================================================
   root@f375df8a2fd5:/opt/airflow#
   ```

