Posted to commits@airflow.apache.org by "wolfier (via GitHub)" <gi...@apache.org> on 2023/02/10 08:23:46 UTC

[GitHub] [airflow] wolfier opened a new issue, #29458: Identify logging statements from Airflow versus top level code

wolfier opened a new issue, #29458:
URL: https://github.com/apache/airflow/issues/29458

   ### Description
   
   I want to filter out logs generated by Airflow workers during the file parsing process before task execution. These logs are printed when the file contains top-level logging statements, or imports modules that contain logging statements.
   
   ### Use case/motivation
   
   In an effort to reduce cost from log ingestion, I want to be able to identify these top level logging statements to filter them. 
   
   
   I am able to identify the logs emitted during a typical successful Celery task execution.
   
   ```
   [2023-02-07 18:48:39,101: INFO/MainProcess] Task airflow.executors.celery_executor.execute_command[5589d08f-4b4c-4a1b-9227-7c124abe0d24] received
   [2023-02-07 18:48:39,141: INFO/ForkPoolWorker-9] [5589d08f-4b4c-4a1b-9227-7c124abe0d24] Executing command in Celery: ['airflow', 'tasks', 'run', 'ae_cancellation_flows', 'dbt_run.f_loan_cancellation_requests_co_logs.model.lake_modeling.f_loan_cancellation_requests_co_logs', 'scheduled__2023-02-07T16:45:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/lake_modeling/templated_dags.py']
   [2023-02-07 18:48:39,252: INFO/ForkPoolWorker-9] Filling up the DagBag from /usr/local/airflow/dags/lake_modeling/templated_dags.py
   [2023-02-07 18:51:16,557: INFO/ForkPoolWorker-9] Running <TaskInstance: ae_cancellation_flows.dbt_run.f_loan_cancellation_requests_co_logs.model.lake_modeling.f_loan_cancellation_requests_co_logs scheduled__2023-02-07T16:45:00+00:00 [queued]> on host 172.20.19.199
   [2023-02-07 18:51:32,965: INFO/ForkPoolWorker-9] Using connection ID 'astro_s3_logging' for task execution.
   [2023-02-07 18:51:33,155: INFO/ForkPoolWorker-9] Using connection ID 'astro_s3_logging' for task execution.
   [2023-02-07 18:51:33,287: INFO/ForkPoolWorker-9] Task airflow.executors.celery_executor.execute_command[5589d08f-4b4c-4a1b-9227-7c124abe0d24] succeeded in 174.15662495000288s: None
   ```
   
   Users may include arbitrary logging statements that are hard to identify because they have no distinguishing markers.
   
   ```
   [2023-02-07 18:48:56,754: INFO/ForkPoolWorker-9] Model: superduperjellymonster
   [2023-02-07 18:48:56,756: INFO/ForkPoolWorker-9] Model: slipperybones
   ```
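
To make the distinction concrete, here is a minimal sketch of classifying worker log lines by pattern. The patterns are derived only from the sample lines above and are not an exhaustive list of what Airflow or Celery can emit; the names `AIRFLOW_MESSAGES` and `is_known_airflow_line` are hypothetical helpers for illustration.

```python
import re

# Celery worker prefix as seen in the samples: "[<timestamp>: <LEVEL>/<process>] ".
PREFIX = r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}: \w+/[\w-]+\] "

# Message shapes taken from the sample task execution above.
# Assumption: this list is illustrative only, not complete.
AIRFLOW_MESSAGES = [
    r"Task airflow\.executors\.celery_executor\.execute_command\[[0-9a-f-]+\] (?:received|succeeded)",
    r"\[[0-9a-f-]+\] Executing command in Celery: ",
    r"Filling up the DagBag from ",
    r"Running <TaskInstance: ",
    r"Using connection ID ",
]

AIRFLOW_LINE = re.compile(PREFIX + "(?:" + "|".join(AIRFLOW_MESSAGES) + ")")


def is_known_airflow_line(line: str) -> bool:
    """Return True if the line matches one of the known Airflow/Celery shapes."""
    return AIRFLOW_LINE.match(line) is not None
```

Anything that does not match has to be treated as "possibly user code", which is exactly the weakness described here: the user lines carry no marker of their own, so the allow-list has to be kept in sync with Airflow's messages.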
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #29458: Identify logging statements from Airflow versus top level code

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #29458:
URL: https://github.com/apache/airflow/issues/29458#issuecomment-1436167051

   I think it should be done via convention. In your DAGs you can use a dedicated logger, "filtered.out" for example, and configure it the way you want in LOGGING_CONFIG: https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/logging-tasks.html#advanced-configuration - you can disable propagation on it so that its records are not sent to the root logger.
   
   Try it this way. Let us know if it solves your issue. I am converting this one to a discussion; please follow up there if you test it, @wolfier.
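
The convention described above can be sketched with the standard library alone. This is a minimal illustration, not Airflow's shipped configuration: the handler names `console` and `discard` are assumptions, and the dictionary here is a standalone stand-in for a real Airflow `LOGGING_CONFIG`.

```python
import logging
import logging.config

# Standalone stand-in for an Airflow LOGGING_CONFIG (assumption: simplified).
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
        # NullHandler drops every record it receives.
        "discard": {"class": "logging.NullHandler"},
    },
    "loggers": {
        # Dedicated logger for top-level DAG code; propagate=False keeps
        # its records away from the root logger's handlers.
        "filtered.out": {
            "handlers": ["discard"],
            "level": "INFO",
            "propagate": False,
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}

logging.config.dictConfig(LOGGING_CONFIG)

# In a DAG file, top-level code would log through the dedicated logger.
parse_log = logging.getLogger("filtered.out")
parse_log.info("Model: superduperjellymonster")  # discarded, never reaches root

# Any other logger still propagates to the root logger as usual.
logging.getLogger("some.other.module").info("this line is still emitted")
```

In an actual deployment, one would presumably start from a deep copy of `DEFAULT_LOGGING_CONFIG` in `airflow.config_templates.airflow_local_settings`, add the `filtered.out` entry under `loggers`, and point the `logging_config_class` option at the result, as the linked documentation describes. Instead of discarding the records, the dedicated logger could also be given a handler with a distinctive format so the lines can be filtered downstream.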




[GitHub] [airflow] potiuk closed issue #29458: Identify logging statements from Airflow versus top level code

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #29458: Identify logging statements from Airflow versus top level code
URL: https://github.com/apache/airflow/issues/29458

