Posted to dev@airflow.apache.org by Sumit Maheshwari <su...@gmail.com> on 2017/08/18 10:40:25 UTC

Info needed regarding upgrading to 1.8.2

Hi All,

We are trying to upgrade Airflow from 1.7.0 to 1.8.2 and have found a couple
of issues. The major ones are:

1. We used to run the webserver in debug mode (-d) so that we didn't need to
restart it whenever we added or modified a dag. With 1.8.2, debug mode doesn't
have any effect, and we need to restart the webserver after any change in dags.

2. The scheduler used to pick up any new dag that landed in the dags folder,
but that is not happening anymore, and we are required to restart the
scheduler as well.

Any help would be highly appreciated.

Thanks,
Sumit

Re: Info needed regarding upgrading to 1.8.2

Posted by Marc Bollinger <ma...@lumoslabs.com>.
So spurred on by this thread, we were just having an internal discussion
about how best to handle the situation when DAG definitions are updated,
not just when new DAGs appear. Currently, we're doing it as part of a
broader deploy that's also bouncing the scheduler... Is that common [or
recommended] practice?

This PR <https://github.com/apache/incubator-airflow/pull/1374> from the
1.7.1 release that I dredged up while I was looking for the 1.8.1 changes
referenced earlier in the thread seems to suggest that for existing DAGs,
the common pattern would be just updating the DAG folder and clicking
Refresh in the UI. Is that more common?

On Fri, Aug 18, 2017 at 11:14 PM, Sumit Maheshwari <su...@gmail.com>
wrote:

> I think I missed the *dag_dir_list_interval* setting and assumed that the
> scheduler picks up new dags instantly. I will retest things with a lower
> value of that setting and see how it behaves.
>
> Thanks for the help Boris, Max and Bolke.
>
>
>
> Thanks,
> Sumit Maheshwari
> cell. 9632202950
>
>
> On Sat, Aug 19, 2017 at 1:00 AM, Bolke de Bruin <bd...@gmail.com> wrote:
>
> > If you have a large number of dags, it can take some time before the
> > scheduler processes them. This is determined by the following settings in
> > the [scheduler] section:
> >
> > # after how much time a new DAGs should be picked up from the filesystem
> > min_file_process_interval = 0
> > dag_dir_list_interval = 300
> >
> > Bolke
> >
> > > On 18 Aug 2017, at 21:26, Maxime Beauchemin <maximebeauchemin@gmail.com> wrote:
> > >
> > > In 1.8.1 I'm pretty sure that in the context of the scheduler, the DAGs are
> > > fully re-evaluated in subprocesses which fixes the re-import
> > > sys.module-related caching issue of earlier versions that created
> > > staleness. 1.8.1 shouldn't have scheduler staleness issues.
> > >
> > > Max
> > >
> > > On Fri, Aug 18, 2017 at 6:09 AM, Boris Tyukin <bo...@boristyukin.com> wrote:
> > >
> > >> see how much time it takes to parse your dag, if you do any heavy
> > >> operations like database calls, file pulls etc. I remember someone had an
> > >> issue with web server because it took a while to parse one dag. By default
> > >> airflow will check dag folder very frequently (unless you changed the
> > >> timing in config). You can start both scheduler and web server manually and
> > >> just watch the logs
> > >>
> > >> On Fri, Aug 18, 2017 at 9:03 AM, Sumit Maheshwari <sumeet.manit@gmail.com>
> > >> wrote:
> > >>
> > >>> Thanks Boris,
> > >>>
> > >>> Seems like there is some issue in my Scheduler, which sometimes picks up
> > >>> dag modification immediately and sometimes doesn't pickup at all. And when
> > >>> Scheduler doesn't pickup new dag, UI also doesn't show it, i.e. no refresh
> > >>> button would be available. I am trying to debug that issue with Scheduler,
> > >>> but not sure if it's a real issue at all or something else.
> > >>>
> > >>> On Fri, Aug 18, 2017 at 6:16 PM, Boris Tyukin <boris@boristyukin.com>
> > >>> wrote:
> > >>>
> > >>>> we run 1.8.1 with no debug mode. New dags are picked up just fine without
> > >>>> restarts. as well as modifications (adding new tasks or changing
> > >>>> definitions of new dags). When I want to see dag updated on UI, I hit
> > >>>> refresh button but this is just for UI to show it properly. No restarts
> > >>>> needed in our experience with 1.8.1
> > >>>>
> > >>>> On Fri, Aug 18, 2017 at 6:40 AM, Sumit Maheshwari <sumeet.manit@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi All,
> > >>>>>
> > >>>>> We are trying to upgrade to Airflow ver 1.8.2 from 1.7.0 and found a couple
> > >>>>> of issues, major ones are:
> > >>>>>
> > >>>>> 1. We used to run the webserver in debug mode (-d) so we don't need to
> > >>>>> restart whenever we add/modify any dag. But with 1.8.2 debug mode doesn't
> > >>>>> have any effect and we need to restart it after any change in dags.
> > >>>>>
> > >>>>> 2. Scheduler used to pick up any new dag landed in dags folder, but that is
> > >>>>> not happening anymore and we required to restart the scheduler as well.
> > >>>>>
> > >>>>> Any help would be highly appreciated.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Sumit

Re: Info needed regarding upgrading to 1.8.2

Posted by Sumit Maheshwari <su...@gmail.com>.
I think I missed the *dag_dir_list_interval* setting and assumed that the
scheduler picks up new dags instantly. I will retest things with a lower
value of that setting and see how it behaves.

Thanks for the help Boris, Max and Bolke.



Thanks,
Sumit Maheshwari
cell. 9632202950



Re: Info needed regarding upgrading to 1.8.2

Posted by Bolke de Bruin <bd...@gmail.com>.
If you have a large number of dags, it can take some time before the scheduler processes them. This is determined by the following settings in the [scheduler] section:

# after how much time a new DAGs should be picked up from the filesystem
min_file_process_interval = 0
dag_dir_list_interval = 300

Bolke
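
A rough sketch of checking these two values from a script, in case it helps
while retesting. It uses only Python's standard configparser on an inline copy
of the fragment above; the function name scheduler_intervals and the inline
sample are assumptions made for illustration, and this is not how Airflow
itself loads its configuration.

```python
import configparser

# Inline copy of the [scheduler] fragment quoted in this thread.
SAMPLE_CFG = """
[scheduler]
# after how much time a new DAGs should be picked up from the filesystem
min_file_process_interval = 0
dag_dir_list_interval = 300
"""


def scheduler_intervals(cfg_text):
    """Return the two scheduler intervals (in seconds) found in cfg_text.

    Hypothetical helper for illustration only; it parses plain INI text
    and does not consult Airflow or any environment overrides.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["scheduler"]
    return {
        "min_file_process_interval": section.getint("min_file_process_interval"),
        "dag_dir_list_interval": section.getint("dag_dir_list_interval"),
    }


if __name__ == "__main__":
    for name, seconds in scheduler_intervals(SAMPLE_CFG).items():
        print(f"{name} = {seconds} s")
```

Both values are in seconds. As far as I understand it, dag_dir_list_interval
controls how often the dags folder is re-listed for new files, so with 300 a
freshly added dag file may wait up to about five minutes before it is seen.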



Re: Info needed regarding upgrading to 1.8.2

Posted by Maxime Beauchemin <ma...@gmail.com>.
In 1.8.1 I'm pretty sure that, in the context of the scheduler, the DAGs are
fully re-evaluated in subprocesses, which fixes the sys.modules-related
re-import caching issue of earlier versions that created staleness. 1.8.1
shouldn't have scheduler staleness issues.

Max
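
To illustrate the kind of staleness described above, here is a small
standalone sketch. It does not use Airflow at all: it writes a toy module to a
temporary folder, imports it, edits the file, and imports it again in the same
process (which returns the cached copy from sys.modules), then imports it once
more in a fresh subprocess (which sees the edit). The module name toy_dag_file
and the function demo_stale_import are made up for this example.

```python
import importlib
import os
import subprocess
import sys
import tempfile


def demo_stale_import():
    """Return (first, second, fresh) values of VERSION as seen by each import."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # keep .pyc files out of the demo
    try:
        with tempfile.TemporaryDirectory() as folder:
            path = os.path.join(folder, "toy_dag_file.py")
            with open(path, "w") as f:
                f.write("VERSION = 1\n")

            sys.path.insert(0, folder)
            try:
                importlib.invalidate_caches()
                first = importlib.import_module("toy_dag_file").VERSION

                # Edit the file on disk, as a deploy would.
                with open(path, "w") as f:
                    f.write("VERSION = 2\n")

                # Same process: the module is served from sys.modules.
                second = importlib.import_module("toy_dag_file").VERSION

                # Fresh interpreter: nothing is cached, so the edit is seen.
                env = dict(os.environ, PYTHONPATH=folder)
                out = subprocess.run(
                    [sys.executable, "-B", "-c",
                     "import toy_dag_file; print(toy_dag_file.VERSION)"],
                    env=env, capture_output=True, text=True, check=True,
                )
                fresh = int(out.stdout.strip())
            finally:
                sys.path.remove(folder)
                sys.modules.pop("toy_dag_file", None)
    finally:
        sys.dont_write_bytecode = old_flag
    return first, second, fresh


if __name__ == "__main__":
    print(demo_stale_import())
```

The in-process re-import keeps returning the old value, while the subprocess
picks up the new one, which is the general idea behind re-evaluating DAG files
in subprocesses.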


Re: Info needed regarding upgrading to 1.8.2

Posted by Boris Tyukin <bo...@boristyukin.com>.
See how much time it takes to parse your dag, in case it does any heavy
operations like database calls, file pulls, etc. I remember someone had an
issue with the web server because it took a while to parse one dag. By default
airflow will check the dag folder very frequently (unless you changed the
timing in config). You can start both the scheduler and the web server manually
and just watch the logs.
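
One way to measure parse time outside Airflow is to execute the dag file once
and time it. The sketch below is only an approximation of what the scheduler
does: it runs a file top to bottom with the standard runpy module and reports
the elapsed time. The function names and the toy file with a 0.2 second sleep
(standing in for a slow database call at module level) are assumptions for
illustration.

```python
import os
import runpy
import tempfile
import time


def time_dag_file_parse(path):
    """Execute the Python file at path once and return elapsed seconds."""
    start = time.perf_counter()
    runpy.run_path(path)
    return time.perf_counter() - start


def demo():
    """Time a toy file whose top-level code is deliberately slow."""
    source = (
        "import time\n"
        "time.sleep(0.2)  # stand-in for a slow database call at import time\n"
    )
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "slow_toy_dag.py")
        with open(path, "w") as f:
            f.write(source)
        return time_dag_file_parse(path)


if __name__ == "__main__":
    print(f"parse took {demo():.3f} s")
```

Pointing time_dag_file_parse at a real dag file requires that file's imports
(including airflow) to be available in the interpreter running the script.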


Re: Info needed regarding upgrading to 1.8.2

Posted by Sumit Maheshwari <su...@gmail.com>.
Thanks Boris,

Seems like there is some issue in my Scheduler, which sometimes picks up a
dag modification immediately and sometimes doesn't pick it up at all. And when
the Scheduler doesn't pick up a new dag, the UI also doesn't show it, i.e. no
refresh button would be available. I am trying to debug that issue with the
Scheduler, but I'm not sure if it's a real issue at all or something else.




Re: Info needed regarding upgrading to 1.8.2

Posted by Boris Tyukin <bo...@boristyukin.com>.
We run 1.8.1 with no debug mode. New dags are picked up just fine without
restarts, as are modifications (adding new tasks or changing
definitions of new dags). When I want to see a dag updated in the UI, I hit
the refresh button, but this is just for the UI to show it properly. No
restarts needed in our experience with 1.8.1.
