Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/02 18:46:55 UTC

[GitHub] [airflow] MrManicotti opened a new issue #12761: Cloudwatch Logs Integration Does Not Write Logs With CeleryExecutor

MrManicotti opened a new issue #12761:
URL: https://github.com/apache/airflow/issues/12761


   **Apache Airflow version**:
   2.0.0b3
   
   **What happened**:
   When Airflow is run in CeleryExecutor mode with the new Cloudwatch integration (from #7437), log groups and log streams are created, but no log events are pushed to AWS. The Webserver is able to read the log stream as expected. There are no errors, just empty logs in Cloudwatch:
   `*** Reading remote log from Cloudwatch log_group: airflow-task log_stream: tutorial_taskflow_api_etl_dag/extract/2020-12-02T16_37_07.011589+00_00/1.log.`
     
   Only after switching to SequentialExecutor (the default) are the logs written as expected:
   ```
   *** Reading remote log from Cloudwatch log_group: airflow-task log_stream: tutorial_taskflow_api_etl_dag/extract/2020-12-02T18_04_50.259906+00_00/1.log.
   Task exited with return code 0
   test
   Exporting the following env vars:
   AIRFLOW_CTX_DAG_OWNER=airflow
   AIRFLOW_CTX_DAG_ID=tutorial_taskflow_api_etl_dag
   AIRFLOW_CTX_TASK_ID=extract
   AIRFLOW_CTX_EXECUTION_DATE=2020-12-02T18:04:50.259906+00:00
   AIRFLOW_CTX_DAG_RUN_ID=manual__2020-12-02T18:04:50.259906+00:00
   Running <TaskInstance: tutorial_taskflow_api_etl_dag.extract 2020-12-02T18:04:50.259906+00:00 [running]> on host 845928f08686
   Started process 152 to run task
   Executing <Task(_PythonDecoratedOperator): extract> on 2020-12-02T18:04:50.259906+00:00
   
   --------------------------------------------------------------------------------
   Starting attempt 1 of 1
   
   --------------------------------------------------------------------------------
   Dependencies all met for <TaskInstance: tutorial_taskflow_api_etl_dag.extract 2020-12-02T18:04:50.259906+00:00 [queued]>
   Dependencies all met for <TaskInstance: tutorial_taskflow_api_etl_dag.extract 2020-12-02T18:04:50.259906+00:00 [queued]>
   
   ```  
     
   The logging configuration is turned on for the Scheduler, Webserver, and Workers. **Log groups and log streams are being created as expected, but log events are not.** There are no errors in the workers, and in AWS the IAM policy permissions are all-inclusive (they also demonstrably work with SequentialExecutor).
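   
   One way to confirm that the streams exist but hold no events, independent of the Airflow UI, is to query CloudWatch Logs directly. A minimal boto3 sketch (the region is a placeholder; the group and stream names are the ones from the excerpt above):
   ```
   import boto3

   # CloudWatch Logs client; the region must match the one Airflow writes to
   client = boto3.client("logs", region_name="us-east-1")

   group = "airflow-task"
   stream = "tutorial_taskflow_api_etl_dag/extract/2020-12-02T16_37_07.011589+00_00/1.log"

   # The handler creates the stream, so it shows up here...
   streams = client.describe_log_streams(
       logGroupName=group, logStreamNamePrefix=stream
   )["logStreams"]
   print("streams found:", [s["logStreamName"] for s in streams])

   # ...but under CeleryExecutor no events are ever pushed into it
   events = client.get_log_events(logGroupName=group, logStreamName=stream)["events"]
   print("event count:", len(events))  # expected > 0, observed 0
   ```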
   
   **What you expected to happen**:
   I expected Celery Worker logs to be displayed in the Webserver when run with Cloudwatch remote logging.
   
   **How to reproduce it**:
   Run Airflow 2.0.0b3 with CeleryExecutor, one Webserver, two workers, and one scheduler, with Cloudwatch remote logging enabled (see the sketch below).
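   
   For reference, this kind of remote logging is enabled through the `[logging]` section of airflow.cfg (or the equivalent `AIRFLOW__LOGGING__*` environment variables). A minimal sketch; the connection id and the log-group ARN are placeholders:
   ```
   [logging]
   remote_logging = True
   remote_log_conn_id = aws_default
   # cloudwatch:// followed by the ARN of the target log group
   remote_base_log_folder = cloudwatch://arn:aws:logs:us-east-1:123456789012:log-group:airflow-task
   ```
   The same settings need to be present on the scheduler, the webserver, and every Celery worker, since each reads its own copy of the configuration.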
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb closed issue #12761: Cloudwatch Logs Integration Does Not Write Logs With CeleryExecutor (2.0.0)

Posted by GitBox <gi...@apache.org>.
ashb closed issue #12761:
URL: https://github.com/apache/airflow/issues/12761


   





[GitHub] [airflow] ee07dazn commented on issue #12761: Cloudwatch Logs Integration Does Not Write Logs With CeleryExecutor (2.0.0)

Posted by GitBox <gi...@apache.org>.
ee07dazn commented on issue #12761:
URL: https://github.com/apache/airflow/issues/12761#issuecomment-744428254


   @MrManicotti : Thanks for raising this ticket. I have a similar problem: even with the Sequential/Local executor, only INFO-level messages get written to Cloudwatch. "ERROR" messages are not written when remote logging is enabled, as shown below:
   ```
    Dependencies all met for <TaskInstance: tutorial.print_date 2020-12-13T07:00:00+00:00 [queued]>
    Dependencies all met for <TaskInstance: tutorial.print_date 2020-12-13T07:00:00+00:00 [queued]>
    --------------------------------------------------------------------------------
    Starting attempt 1 of 1
    --------------------------------------------------------------------------------
    Executing <Task(BashOperator): print_date> on 2020-12-13T07:00:00+00:00
    Started process 746 to run task
   Task exited with return code 1
   ```
   
   Without remote logging, I get the following output:
   ```
   *** Reading local file: /usr/local/airflow/logs/tutorial/print_date/2020-12-13T07:00:00+00:00/1.log
   [2020-12-14 13:07:33,907] {{taskinstance.py:670}} INFO - Dependencies all met for <TaskInstance: tutorial.print_date 2020-12-13T07:00:00+00:00 [queued]>
   [2020-12-14 13:07:33,937] {{taskinstance.py:670}} INFO - Dependencies all met for <TaskInstance: tutorial.print_date 2020-12-13T07:00:00+00:00 [queued]>
   [2020-12-14 13:07:33,937] {{taskinstance.py:880}} INFO - 
   --------------------------------------------------------------------------------
   [2020-12-14 13:07:33,937] {{taskinstance.py:881}} INFO - Starting attempt 1 of 1
   [2020-12-14 13:07:33,937] {{taskinstance.py:882}} INFO - 
   --------------------------------------------------------------------------------
   [2020-12-14 13:07:33,958] {{taskinstance.py:901}} INFO - Executing <Task(BashOperator): print_date> on 2020-12-13T07:00:00+00:00
   [2020-12-14 13:07:33,963] {{standard_task_runner.py:54}} INFO - Started process 373 to run task
   [2020-12-14 13:07:34,100] {{standard_task_runner.py:77}} INFO - Running: ['airflow', 'run', 'tutorial', 'print_date', '2020-12-13T07:00:00+00:00', '--job_id', '2', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/tuto.py', '--cfg_path', '/tmp/tmpvcretdrv']
   [2020-12-14 13:07:34,101] {{standard_task_runner.py:78}} INFO - Job 2: Subtask print_date
   [2020-12-14 13:07:34,220] {{logging_mixin.py:112}} INFO - Running %s on host %s <TaskInstance: tutorial.print_date 2020-12-13T07:00:00+00:00 [running]> 4624a16308a0
   [2020-12-14 13:07:34,267] {{bash_operator.py:113}} INFO - Tmp dir root location: 
    /tmp
   [2020-12-14 13:07:34,268] {{bash_operator.py:136}} INFO - Temporary script location: /tmp/airflowtmph5p8i1j5/print_datejbg3bgyq
   [2020-12-14 13:07:34,268] {{bash_operator.py:146}} INFO - Running command: date2
   [2020-12-14 13:07:34,282] {{bash_operator.py:153}} INFO - Output:
   [2020-12-14 13:07:34,286] {{bash_operator.py:157}} INFO - /tmp/airflowtmph5p8i1j5/print_datejbg3bgyq: line 1: date2: command not found
   [2020-12-14 13:07:34,287] {{bash_operator.py:161}} INFO - Command exited with return code 127
   **[2020-12-14 13:07:34,303] {{taskinstance.py:1150}} ERROR - Bash command failed
   Traceback (most recent call last):
     File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 984, in _run_raw_task
       result = task_copy.execute(context=context)
     File "/usr/local/lib/python3.7/site-packages/airflow/operators/bash_operator.py", line 165, in execute
       raise AirflowException("Bash command failed")
   airflow.exceptions.AirflowException: Bash command failed**
   [2020-12-14 13:07:34,306] {{taskinstance.py:1194}} INFO - Marking task as FAILED. dag_id=tutorial, task_id=print_date, execution_date=20201213T070000, start_date=20201214T130733, end_date=20201214T130734
   [2020-12-14 13:07:38,807] {{local_task_job.py:102}} INFO - Task exited with return code 1
   ```
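   To check whether any ERROR-level events reached CloudWatch at all, as opposed to only the INFO lines shown above, one can run a filtered query across the task log group. A minimal boto3 sketch; the group name and region are placeholders:
   ```
   import boto3

   client = boto3.client("logs", region_name="us-east-1")

   # Search every stream in the task log group for ERROR entries
   resp = client.filter_log_events(
       logGroupName="airflow-task",
       filterPattern="ERROR",
   )
   for event in resp["events"]:
       print(event["logStreamName"], event["message"][:120])
   ```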








[GitHub] [airflow] ashb commented on issue #12761: Cloudwatch Logs Integration Does Not Write Logs With CeleryExecutor (2.0.0)

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #12761:
URL: https://github.com/apache/airflow/issues/12761#issuecomment-744429186


   Duplicate of/same issue as #12969, which will be fixed in RC3 by #13057, I think.





[GitHub] [airflow] ashb commented on issue #12761: Cloudwatch Logs Integration Does Not Write Logs With CeleryExecutor (2.0.0)

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #12761:
URL: https://github.com/apache/airflow/issues/12761#issuecomment-745195488


   Please test with 2.0.0rc3, which has now been released. @MrManicotti @ee07dazn
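   (An explicitly pinned pre-release installs straight from PyPI, e.g. `pip install apache-airflow==2.0.0rc3`; pip only needs `--pre` when no exact pre-release version is given.)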





[GitHub] [airflow] ee07dazn edited a comment on issue #12761: Cloudwatch Logs Integration Does Not Write Logs With CeleryExecutor (2.0.0)

Posted by GitBox <gi...@apache.org>.
ee07dazn edited a comment on issue #12761:
URL: https://github.com/apache/airflow/issues/12761#issuecomment-744428254


   @MrManicotti : Were you actually able to see 'ERROR' logs, or just 'INFO'?



[GitHub] [airflow] boring-cyborg[bot] commented on issue #12761: Cloudwatch Logs Integration Does Not Write Logs With CeleryExecutor

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #12761:
URL: https://github.com/apache/airflow/issues/12761#issuecomment-737423678


   Thanks for opening your first issue here! Be sure to follow the issue template!
   

