Posted to commits@airflow.apache.org by "jack (JIRA)" <ji...@apache.org> on 2019/06/22 16:40:00 UTC

[jira] [Commented] (AIRFLOW-3080) Mysql OperationalError occurs during heartbeat or any DB operation

    [ https://issues.apache.org/jira/browse/AIRFLOW-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870305#comment-16870305 ] 

jack commented on AIRFLOW-3080:
-------------------------------

[~amitgh] Did you solve this?

> Mysql OperationalError occurs during heartbeat or any DB operation
> ------------------------------------------------------------------
>
>                 Key: AIRFLOW-3080
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3080
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler, worker
>    Affects Versions: 1.10.0
>            Reporter: Amit Ghosh
>            Assignee: Amit Ghosh
>            Priority: Major
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When Airflow uses MySQL, has many worker instances, and no DAG has been executed for a long time, MySQL raises "mysql_exceptions.OperationalError".
> The main issue is stale connections: if no request has reached the DB from a given SQLAlchemy pool for some time, MySQL marks the pool's connections as stale, and the first DB request afterwards fails with this error. I am working on a fix and will commit it; it should also work for other databases.
> 1) Log Text = {"log":"[2018-09-18 05:33:45,296] {jobs.py:748} ERROR - (_mysql_exceptions.OperationalError) (2005, \"Unknown MySQL server host 'mlp.prod.machine-learning-platform-prod.ms-df-cloudrdbms.prod.walmart.com' (2)\") (Background on this error at: [http://sqlalche.me/e/e3q8])
> ","stream":"stdout","time":"2018-09-18T05:33:45.315547946Z"}
>  
> 2) Log Text = {"log":" raise errorvalue
> ","stream":"stderr","time":"2018-09-15T06:04:35.722310847Z"}
> {"log":"sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'UPDATE job SET latest_heartbeat=%s WHERE job.id = %s'] [parameters: (datetime.datetime(2018, 9, 15, 6, 4, 23, 4294), 345143L)] (Background on this error at: [http://sqlalche.me/e/e3q8])
> ","stream":"stderr","time":"2018-09-15T06:04:35.72232954Z"}
> {"log":"[2018-09-15 06:04:35,844: ERROR/ForkPoolWorker-13] Command 'airflow run dag_2063_baf60054-d0c7-41b2-8009-4d88f773dc79 web_crawl_pipeline 2018-09-14T05:48:42 --local -sd DAGS_FOLDER/1833_workflows/dag_2063_baf60054-d0c7-41b2-8009-4d88f773dc79.py ' returned non-zero exit status 1
> ","stream":"stderr","time":"2018-09-15T06:04:35.847747612Z"}
> {"log":"[2018-09-15 06:04:35,851: ERROR/ForkPoolWorker-13] Task airflow.executors.celery_executor.execute_command[30141a5a-71da-4d28-a829-495aeca3cfa9] raised unexpected: AirflowException('Celery command failed',)
> ","stream":"stderr","time":"2018-09-15T06:04:35.855019453Z"}
>  
>  
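
The stale-connection failure described in the issue can be illustrated with a small sketch. It is not the reporter's fix and not Airflow or SQLAlchemy code: `PrePingPool`, `acquire`, `max_idle_seconds` and `reconnects` are hypothetical names, and the standard-library sqlite3 module stands in for MySQL so the example runs anywhere. The idea is to test a pooled connection with a trivial query before handing it out, and to replace it if the test fails or if it has been idle too long. SQLAlchemy offers comparable behaviour through the `pool_pre_ping` and `pool_recycle` options of `create_engine`.

```python
import sqlite3
import time


class PrePingPool:
    """Toy single-connection pool that validates a connection before reuse.

    Hypothetical illustration only; sqlite3 stands in for MySQL.
    """

    def __init__(self, connect, max_idle_seconds=1800.0):
        self._connect = connect            # zero-argument factory returning a connection
        self._max_idle = max_idle_seconds  # replace connections idle longer than this
        self._conn = None
        self._last_used = 0.0
        self.reconnects = 0                # how many times a fresh connection was opened

    def _is_alive(self, conn):
        # "Pre-ping": run a trivial query; a DB-API error means the connection is unusable.
        try:
            conn.execute("SELECT 1").fetchone()
            return True
        except sqlite3.Error:
            return False

    def acquire(self):
        idle = time.monotonic() - self._last_used
        if self._conn is None or idle > self._max_idle or not self._is_alive(self._conn):
            if self._conn is not None:
                try:
                    self._conn.close()     # discard the stale connection
                except sqlite3.Error:
                    pass
            self._conn = self._connect()
            self.reconnects += 1
        self._last_used = time.monotonic()
        return self._conn


pool = PrePingPool(lambda: sqlite3.connect(":memory:"))
conn = pool.acquire()
conn.close()            # simulate the server dropping an idle connection
conn = pool.acquire()   # the pre-ping fails, so a fresh connection is opened
print(pool.reconnects)
```

The pre-ping costs one extra round trip per checkout, which is the usual trade-off for not surfacing the first failed query to the caller.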


