Posted to dev@airflow.apache.org by David Capwell <dc...@gmail.com> on 2017/08/28 22:01:12 UTC

As history grows UI gets slower

We are on 1.8.0 and have a monitor DAG that checks the health of Airflow
and Celery every minute.  It has been running for a while and is now at 26k
DAG runs. The UI for this DAG is several seconds slower (6-7 seconds) than
for any other DAG.

My question is, what do people do about managing history as it grows over
time? Do people delete history after N or so days?

Thanks for your time reading this email

Re: As history grows UI gets slower

Posted by David Capwell <dc...@gmail.com>.
So if I clean up the DB for anything older than 30 days, wouldn't the
scheduler try to backfill?

On Aug 29, 2017 11:02 AM, "David Capwell" <dc...@gmail.com> wrote:

> Thanks, will take a look at this project

Re: As history grows UI gets slower

Posted by David Capwell <dc...@gmail.com>.
Thanks, will take a look at this project

On Aug 29, 2017 10:35 AM, "Chris Riccomini" <cr...@apache.org> wrote:

> Might have a look at this, too:
>
> https://github.com/teamclairvoyant/airflow-maintenance-dags
>
> I haven't used it, but it seems to have a DB cleaning script.

Re: As history grows UI gets slower

Posted by Chris Riccomini <cr...@apache.org>.
Might have a look at this, too:

https://github.com/teamclairvoyant/airflow-maintenance-dags

I haven't used it, but it seems to have a DB cleaning script.

On Mon, Aug 28, 2017 at 3:43 PM, Maxime Beauchemin <
maximebeauchemin@gmail.com> wrote:

> Just make sure to archive based on start_date and not execution_date to
> allow backfills.
>
> Max

Re: As history grows UI gets slower

Posted by Maxime Beauchemin <ma...@gmail.com>.
Just make sure to archive based on start_date and not execution_date to
allow backfills.

Max
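
One reading of the advice above is that a run backfilled recently for an old date has an old execution_date but a recent start_date, so a cleanup keyed on execution_date would sweep it up right away. The sketch below illustrates that difference on plain in-memory records. The record layout, the `archivable` helper, and the 30-day window are assumptions made for illustration, not Airflow's API.

```python
from datetime import datetime, timedelta


def archivable(runs, now, keep_days, date_field):
    """Return the ids of runs whose date_field is older than the retention window."""
    cutoff = now - timedelta(days=keep_days)
    return [r["id"] for r in runs if r[date_field] < cutoff]


now = datetime(2017, 8, 28)
runs = [
    # Ran on schedule about three months ago (hypothetical record).
    {"id": "scheduled",
     "execution_date": datetime(2017, 6, 1),
     "start_date": datetime(2017, 6, 1)},
    # Backfilled yesterday for a date about three months ago (hypothetical record).
    {"id": "backfill",
     "execution_date": datetime(2017, 6, 2),
     "start_date": datetime(2017, 8, 27)},
]

# Keyed on execution_date, the fresh backfill is selected for archiving too.
by_execution = archivable(runs, now, 30, "execution_date")
# Keyed on start_date, only the run that actually ran long ago is selected.
by_start = archivable(runs, now, 30, "start_date")
```

Here `by_execution` contains both runs while `by_start` contains only the scheduled one, which is the behavior the start_date rule is meant to preserve.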

On Mon, Aug 28, 2017 at 3:20 PM, Alex Guziel <alex.guziel@airbnb.com.invalid> wrote:

> Here at Airbnb we delete old "completed" task instances.

Re: As history grows UI gets slower

Posted by Alex Guziel <al...@airbnb.com.INVALID>.
Here at Airbnb we delete old "completed" task instances.
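
A minimal sketch of what such a cleanup could look like, using an in-memory SQLite table as a simplified stand-in for the Airflow metadata DB. The column set, the list of states treated as "completed", the 30-day window, and the helper names are all assumptions for illustration; this is not Airbnb's actual job.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumption: which states count as "completed" is a guess, not from the thread.
FINISHED_STATES = ("success", "failed", "skipped")


def delete_old_finished(conn, now, keep_days=30):
    """Delete finished task instances that started more than keep_days ago."""
    cutoff = (now - timedelta(days=keep_days)).isoformat()
    placeholders = ",".join("?" for _ in FINISHED_STATES)
    cur = conn.execute(
        "DELETE FROM task_instance "
        "WHERE start_date < ? AND state IN (%s)" % placeholders,
        (cutoff,) + FINISHED_STATES,
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny stand-in table, run the cleanup, report (deleted, remaining)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance ("
        "dag_id TEXT, task_id TEXT, execution_date TEXT, "
        "start_date TEXT, state TEXT)"
    )
    now = datetime(2017, 8, 28, 22, 0, 0)
    old = (now - timedelta(days=60)).isoformat()
    recent = (now - timedelta(days=1)).isoformat()
    rows = [
        ("monitor", "check", old, old, "success"),        # old and finished: deleted
        ("monitor", "check", old, old, "running"),        # old but unfinished: kept
        ("monitor", "check", recent, recent, "success"),  # inside the window: kept
    ]
    conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?, ?, ?)", rows)
    deleted = delete_old_finished(conn, now)
    remaining = conn.execute("SELECT COUNT(*) FROM task_instance").fetchone()[0]
    conn.close()
    return deleted, remaining
```

Filtering on state as well as age keeps rows that are still running or queued, so the cleanup should not remove anything the scheduler is still tracking. Against a real metadata DB it would be prudent to run the equivalent SELECT first and to back up the table.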

On Mon, Aug 28, 2017 at 3:01 PM, David Capwell <dc...@gmail.com> wrote:

> We are on 1.8.0 and have a monitor DAG that monitors the health of Airflow
> and Celery every minute.  This has been running for awhile now and at 26k
> dag runs. We see that the UI for this DAG is multiple seconds slower (6-7
> second) than any other DAG.
>
> My question is, what do people do about managing history as it grows over
> time? Do people delete history after N or so days?
>
> Thanks for your time reading this email
>