Posted to commits@airflow.apache.org by ka...@apache.org on 2021/01/14 13:08:24 UTC
[airflow] branch master updated: Increase the default ``min_file_process_interval`` to decrease CPU Usage (#13664)
This is an automated email from the ASF dual-hosted git repository.
kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/master by this push:
new e4b8ee6 Increase the default ``min_file_process_interval`` to decrease CPU Usage (#13664)
e4b8ee6 is described below
commit e4b8ee63b04a25feb21a5766b1cc997aca9951a9
Author: Kaxil Naik <ka...@gmail.com>
AuthorDate: Thu Jan 14 13:08:12 2021 +0000
Increase the default ``min_file_process_interval`` to decrease CPU Usage (#13664)
With the previous default of `0`, the CPU Usage mostly stays around 100%.
Since Airflow 2.0.0 moved the scheduling decisions out of
DagFileProcessor and into the Scheduler, we can keep this number high.
closes https://github.com/apache/airflow/issues/13637
---
UPDATING.md | 9 +++++++++
airflow/config_templates/config.yml | 6 ++++--
airflow/config_templates/default_airflow.cfg | 6 ++++--
3 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/UPDATING.md b/UPDATING.md
index 1cf2a6c..374a521 100644
--- a/UPDATING.md
+++ b/UPDATING.md
@@ -59,6 +59,15 @@ However, it was unintentionally changed to `8` in 2.0.0.
From Airflow 2.0.1, we revert to the old default of `16`.
+### Default `[scheduler] min_file_process_interval` is changed to `30`
+
+The default value for `[scheduler] min_file_process_interval` was `0`,
+due to which the CPU Usage mostly stayed around 100% as the DAG files are parsed
+constantly.
+
+From Airflow 2.0.0, the scheduling decisions have been moved from
+DagFileProcessor to Scheduler, so we can keep the default a bit higher: `30`.
+
## Airflow 2.0.0
### The experimental REST API is disabled by default
diff --git a/airflow/config_templates/config.yml b/airflow/config_templates/config.yml
index d475ce7..1fc16e1 100644
--- a/airflow/config_templates/config.yml
+++ b/airflow/config_templates/config.yml
@@ -1648,11 +1648,13 @@
default: "1"
- name: min_file_process_interval
description: |
- after how much time (seconds) a new DAGs should be picked up from the filesystem
+ Number of seconds after which a DAG file is parsed. The DAG file is parsed every
+ ``min_file_process_interval`` number of seconds. Updates to DAGs are reflected after
+ this interval. Keeping this number low will increase CPU usage.
version_added: ~
type: string
example: ~
- default: "0"
+ default: "30"
- name: dag_dir_list_interval
description: |
How often (in seconds) to scan the DAGs directory for new files. Default to 5 minutes.
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index 458b606..f03dbca 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -814,8 +814,10 @@ num_runs = -1
# The number of seconds to wait between consecutive DAG file processing
processor_poll_interval = 1
-# after how much time (seconds) a new DAGs should be picked up from the filesystem
-min_file_process_interval = 0
+# Number of seconds after which a DAG file is parsed. The DAG file is parsed every
+# ``min_file_process_interval`` number of seconds. Updates to DAGs are reflected after
+# this interval. Keeping this number low will increase CPU usage.
+min_file_process_interval = 30
# How often (in seconds) to scan the DAGs directory for new files. Default to 5 minutes.
dag_dir_list_interval = 300
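
To make the effect of the new default concrete, here is a minimal Python sketch. It is not Airflow's implementation. The config fragment mirrors the values in the diff above, and the override uses Airflow's `AIRFLOW__{SECTION}__{KEY}` environment-variable convention. The helpers `effective_interval`, `is_due`, and `count_parses` are hypothetical names introduced only for illustration, and the one-second loop tick is an assumption of the simulation.

```python
import configparser

# Fragment mirroring the [scheduler] values shown in the diff above.
CFG_FRAGMENT = """
[scheduler]
processor_poll_interval = 1
min_file_process_interval = 30
dag_dir_list_interval = 300
"""

# Airflow reads overrides from AIRFLOW__{SECTION}__{KEY} environment variables.
ENV_VAR = "AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL"


def effective_interval(cfg_text, environ):
    """Return the interval in seconds; an environment override wins over the file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    file_value = parser.get("scheduler", "min_file_process_interval")
    return int(environ.get(ENV_VAR, file_value))


def is_due(last_parsed_at, now, interval):
    """Hypothetical gate: re-parse only once `interval` seconds have elapsed."""
    return last_parsed_at is None or now - last_parsed_at >= interval


def count_parses(duration, interval, tick=1):
    """Count parses of one file over `duration` simulated seconds.

    The loop wakes every `tick` seconds (an assumption of this sketch).
    """
    parses, last = 0, None
    for now in range(0, duration, tick):
        if is_due(last, now, interval):
            parses += 1
            last = now
    return parses


if __name__ == "__main__":
    interval = effective_interval(CFG_FRAGMENT, {})
    print("interval:", interval)
    print("parses per minute, old default:", count_parses(60, 0))
    print("parses per minute, new default:", count_parses(60, interval))
```

In this simplified model, an interval of `0` means the file is due on every wake-up of the loop, which is consistent with the constant parsing described in the commit message. A deployment that wants DAG changes picked up faster could set the environment variable to a lower value, at the cost of more CPU usage.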