Posted to commits@airflow.apache.org by "t oo (Jira)" <ji...@apache.org> on 2019/11/07 18:23:00 UTC

[jira] [Updated] (AIRFLOW-5866) Task_instance table too large causing issues?

     [ https://issues.apache.org/jira/browse/AIRFLOW-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

t oo updated AIRFLOW-5866:
--------------------------
    Attachment: Mysql queuedepth.png

> Task_instance table too large causing issues?
> ---------------------------------------------
>
>                 Key: AIRFLOW-5866
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5866
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: database, scheduler
>    Affects Versions: 1.10.3
>            Reporter: t oo
>            Priority: Major
>         Attachments: Mysql queuedepth.png
>
>
> mysql rds metastore - db.m5.large instance, 5.7.26 version
>  
> task_instance table has 2,848,160 rows
> dag_run table has 22,768 rows
> dag table has 23 rows
> log table has 17,916,891 rows
>  
> airflow 1.10.3, using LocalExecutor, python 2.7, single ec2 m5.4xlarge, parallelism set to 45 (i.e. max 45 tasks at once). Only externally triggered dags, no SLAs, no subdags/backfills. 4 gunicorn workers. Using dynamic dags.
>  
> Everything was fine until yesterday, with around 300 dag runs every day. Today these issues appeared all of a sudden (no code change, no environment change, etc.). I suspect the task_instance table has gotten too big and is causing scheduler and mysql issues.
>  
> 1.
> 'Recent tasks' shows blank on the web ui home page. admin/airflow/task_stats fails with a 504 error after a few mins, but the dag_stats endpoint shows the dags are in running state.
>  
> 2.
> dag_runs are stuck in running state for > 20 hrs; it seems no new tasks are being run (they are stuck in scheduled/queued state).
>  
> I then tried terminating the EC2 instance and launching a new one. The dagruns and tasks then started finishing, but after a few hours they got into the same situation as points 1/2 above. I believe certain dag ids (with many tasks) are hitting the issue, will know m
>  
>  
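One way to check the "table too big" suspicion is to see how task_instance rows are spread across states and how many are older than some cutoff. The following is only a rough sketch, not something taken from the report: it uses an in-memory SQLite table as a stand-in for the MySQL metastore, creates only the columns it needs, and the helper name count_rows_by_state, the sample rows and the 30-day cutoff are all made up for illustration.

```python
import sqlite3
from datetime import datetime, timedelta


def count_rows_by_state(conn, cutoff):
    """Return {state: (total_rows, rows_older_than_cutoff)} for task_instance."""
    cur = conn.execute(
        "SELECT state, COUNT(*), "
        "SUM(CASE WHEN execution_date < ? THEN 1 ELSE 0 END) "
        "FROM task_instance GROUP BY state",
        (cutoff.isoformat(),),
    )
    return {state: (total, old) for state, total, old in cur}


# Stand-in data (assumption): an in-memory SQLite table holding only the
# columns this query touches, with dates stored as ISO-8601 strings.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance "
    "(task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT)"
)
now = datetime(2019, 11, 7)
rows = [
    ("t1", "d1", (now - timedelta(days=90)).isoformat(), "success"),
    ("t2", "d1", (now - timedelta(days=40)).isoformat(), "success"),
    ("t3", "d1", (now - timedelta(hours=20)).isoformat(), "queued"),
    ("t4", "d2", (now - timedelta(hours=2)).isoformat(), "scheduled"),
]
conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?, ?)", rows)

# Hypothetical cutoff: treat anything older than 30 days as "old".
stats = count_rows_by_state(conn, cutoff=now - timedelta(days=30))
for state, (total, old) in sorted(stats.items()):
    print(state, total, old)
```

If most rows turn out to be old finished tasks while only a handful sit in scheduled/queued, that would be consistent with table growth rather than a backlog of live work, though it would not prove it.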



--
This message was sent by Atlassian Jira
(v8.3.4#803005)