You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@airflow.apache.org by ra...@gmail.com, ra...@gmail.com on 2018/09/26 06:57:32 UTC

SqlAlchemy Pool config parameters to minimize connectivity issue impact

Hi All,

We are observing sometimes Dag tasks get failed because of some connectivity issues with Mysql server.
So Are there any recommended settings for mysql pool's related parameters like
sql_alchemy_pool_size = 5
sql_alchemy_pool_recycle = 3600
to minimise  the connectivity issue impact.

Thanks,
Raman Gupta

Re: SqlAlchemy Pool config parameters to minimize connectivity issue impact

Posted by Pramiti Goel <pr...@gmail.com>.
Hi,
We  also faced this issue one month back for Airflow 1.9 with Celery
Executor. Unfortunately, we could not find the root cause immediately. This
can occur due to too many open connections. We end up closing all open
connections as immediate Solution and restarting Mysql Instance. But I
think what Kevin mentioned can be the cause of our problem too. We happened
to run airflow run command many times in last week before we start to see
such error.
@Kevin,
My doubt is, When does airflow run command connection is actually closed ?
Does it follow SQL_ALCHEMY_POOL_RECYCLE (1 hour) property and kill every
hour ? Because I think we saw some very old connections running too.

On Fri, Sep 28, 2018 at 4:32 PM Kevin Yang <yr...@gmail.com> wrote:

> Hi Raman,
> Would you elaborate a bit more on what exactly is the connectivity issues
> that you were facing and which version of Airflow are you on? We previously
> had some connectivity issue when the # of connection was too large and we
> had it fixed with this PR
> <https://github.com/apache/incubator-airflow/pull/3336>.
>
> Cheers,
> Kevin Y
>
> On Tue, Sep 25, 2018 at 11:57 PM ramandumcs@gmail.com <
> ramandumcs@gmail.com>
> wrote:
>
> > Hi All,
> >
> > We are observing sometimes Dag tasks get failed because of some
> > connectivity issues with Mysql server.
> > So Are there any recommended settings for mysql pool's related parameters
> > like
> > sql_alchemy_pool_size = 5
> > sql_alchemy_pool_recycle = 3600
> > to minimise  the connectivity issue impact.
> >
> > Thanks,
> > Raman Gupta
> >
>

Re: SqlAlchemy Pool config parameters to minimize connectivity issue impact

Posted by raman gupta <ra...@gmail.com>.
Hi Kevin,

We are using Airflow 1.9 with Local Excecutor and we are  facing various
mysql connectivity error/exceptions
like
-> sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError)
(1213, 'Deadlock found when trying to get lock; try restarting
transaction') [SQL: u'UPDATE task_instance, dag_run SET
task_instance.state=%s WHERE task_instance.dag_id IN (%s, %s, %s, %s) AND
task_instance.state IN (%s, %s) AND dag_run.dag_id = task_instance.dag_id
AND dag_run.execution_date = task_instance.execution_date AND dag_run.state
!= %s'] [parameters: (None,
-> OperationalError: (_mysql_exceptions.OperationalError) (2013, 'Lost
connection to MySQL server during query') (Background on this error at:
http://sqlalche.me/e/e3q8)
-> sqlalchemy.exc.InvalidRequestError: This Session's transaction has been
rolled back due to a previous exception during flush. To begin a new
transaction with this Session, first issue Session.rollback(). Original
exception was: (_mysql_exceptions.OperationalError) (2003, "Can't connect
to MySQL server on.."

we have set pool size as 10 with pool recycle is set to 3600 sec.Machine
config is 8 core 16 GB.

Thanks,
Raman Gupta



On Fri, Sep 28, 2018 at 2:02 PM Kevin Yang <yr...@gmail.com> wrote:

> Hi Raman,
> Would you elaborate a bit more on what exactly is the connectivity issues
> that you were facing and which version of Airflow are you on? We previously
> had some connectivity issue when the # of connection was too large and we
> had it fixed with this PR
> <https://github.com/apache/incubator-airflow/pull/3336>.
>
> Cheers,
> Kevin Y
>
> On Tue, Sep 25, 2018 at 11:57 PM ramandumcs@gmail.com <
> ramandumcs@gmail.com>
> wrote:
>
> > Hi All,
> >
> > We are observing sometimes Dag tasks get failed because of some
> > connectivity issues with Mysql server.
> > So Are there any recommended settings for mysql pool's related parameters
> > like
> > sql_alchemy_pool_size = 5
> > sql_alchemy_pool_recycle = 3600
> > to minimise  the connectivity issue impact.
> >
> > Thanks,
> > Raman Gupta
> >
>

Re: SqlAlchemy Pool config parameters to minimize connectivity issue impact

Posted by Kevin Yang <yr...@gmail.com>.
Hi Raman,
Would you elaborate a bit more on what exactly is the connectivity issues
that you were facing and which version of Airflow are you on? We previously
had some connectivity issue when the # of connection was too large and we
had it fixed with this PR
<https://github.com/apache/incubator-airflow/pull/3336>.

Cheers,
Kevin Y

On Tue, Sep 25, 2018 at 11:57 PM ramandumcs@gmail.com <ra...@gmail.com>
wrote:

> Hi All,
>
> We are observing sometimes Dag tasks get failed because of some
> connectivity issues with Mysql server.
> So Are there any recommended settings for mysql pool's related parameters
> like
> sql_alchemy_pool_size = 5
> sql_alchemy_pool_recycle = 3600
> to minimise  the connectivity issue impact.
>
> Thanks,
> Raman Gupta
>