Posted to dev@airflow.apache.org by Sai Phanindhra <ph...@gmail.com> on 2018/11/15 10:28:51 UTC

Moving Airflow Config to Database.

Hello Airflow users,
      I recently encountered an issue with Airflow. I maintain an
Airflow cluster, and whenever I change the Airflow configuration on one
of the machines I have to manually copy that change to the other machines
in the cluster. Because this is a manual process, it is easy to forget
to sync the changes. I want to move the Airflow config
to a database. I would be happy to hear your
inputs/thoughts on this.

-- 
Sai Phanindhra,
Ph: +91 9043258999

Re: Moving Airflow Config to Database.

Posted by Adityan MS <ad...@poshmark.com>.
Hi Sai, you may want to try managing your Airflow deployment via Ansible.
You can have a Jinja template for your airflow.cfg, with variables specific
to the boxes that you want. The Ansible script should be able to take care
of keeping all your boxes in the state you want them to be in.
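
As a rough illustration of the templating idea above (not the poster's
actual setup): the sketch below renders one airflow.cfg per host from a
single shared template. It uses Python's standard-library string.Template
as a stand-in for Jinja2 so that it runs without extra packages. The host
names, paths, and values are hypothetical, and the [core] keys follow the
Airflow 1.10-era layout, so check them against your version.

```python
from string import Template

# Stand-in for a Jinja2 template of airflow.cfg. With Ansible the
# placeholders would be written as {{ var }} instead of ${var}.
AIRFLOW_CFG_TEMPLATE = Template(
    "[core]\n"
    "dags_folder = ${airflow_home}/dags\n"
    "base_log_folder = ${log_dir}\n"
    "executor = ${executor}\n"
)

# Settings shared by every machine in the cluster (hypothetical value).
COMMON = {"executor": "CeleryExecutor"}

# Per-host overrides (hypothetical host names and paths).
HOSTS = {
    "airflow-web-1": {"airflow_home": "/opt/airflow", "log_dir": "/var/log/airflow"},
    "airflow-worker-1": {"airflow_home": "/data/airflow", "log_dir": "/data/logs"},
}


def render_config(host: str) -> str:
    """Render airflow.cfg for one host from the shared template."""
    variables = {**COMMON, **HOSTS[host]}
    # substitute() raises KeyError when a placeholder has no value,
    # so a forgotten variable fails loudly instead of silently.
    return AIRFLOW_CFG_TEMPLATE.substitute(variables)


if __name__ == "__main__":
    for host in HOSTS:
        print(f"# --- {host} ---")
        print(render_config(host))
```

Keeping the shared values and the per-host overrides in separate mappings
means a cluster-wide change is made once and every rendered file picks it up.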

Thanks!

Re: Moving Airflow Config to Database.

Posted by Sai Phanindhra <ph...@gmail.com>.
That's how it is supposed to work. But, as I read on Stack Overflow,
supervisord creates a subshell and does not guarantee that all env vars will be
available. We have to explicitly pass the environment we need in the
supervisor program config.
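
A minimal sketch of passing the environment explicitly, as described
above. It builds a [program:x] section whose environment= option lists
each variable as KEY="value", separated by commas, which is the format
supervisor documents. The program name, command, and variable values are
hypothetical, and the helper names are made up for this example.

```python
def supervisor_environment(env: dict) -> str:
    """Format a dict as the value of supervisor's `environment=` option."""
    parts = []
    for key, value in env.items():
        text = str(value)
        if '"' in text:
            # Quote escaping is deliberately not handled in this sketch.
            raise ValueError(f"double quote not supported in {key}")
        # Supervisor expands %(name)s expressions in its config,
        # so a literal percent sign is written as %%.
        parts.append('{}="{}"'.format(key, text.replace("%", "%%")))
    return ",".join(parts)


def program_section(name: str, command: str, env: dict) -> str:
    """Build a [program:x] section with an explicit environment."""
    return (
        f"[program:{name}]\n"
        f"command={command}\n"
        f"environment={supervisor_environment(env)}\n"
    )


if __name__ == "__main__":
    print(
        program_section(
            "airflow-scheduler",   # hypothetical program name
            "airflow scheduler",
            {
                "AIRFLOW_HOME": "/opt/airflow",
                "AIRFLOW__CORE__BASE_LOG_FOLDER": "/var/log/airflow",
            },
        )
    )
```

Listing the variables in the program section avoids depending on whatever
environment supervisord itself happened to be started with.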

Re: Moving Airflow Config to Database.

Posted by James Meickle <jm...@quantopian.com.INVALID>.
Just a guess, but do you need to reload supervisorctl itself before
restarting the service? If you add an env var to the supervisor config, and
then restart the supervisor-managed service, it will actually be running
with the old supervisor config file still. The supervisor daemon itself
must be reloaded.
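
A small sketch of the reload step described above, assuming the standard
supervisorctl subcommands reread and update. The function name and the
dry_run flag are made up for this example; by default it only prints the
commands instead of running them.

```python
import shutil
import subprocess

# Commands that make a running supervisord pick up edited config files.
RELOAD_COMMANDS = [
    # Re-parse the config files without touching running processes.
    ["supervisorctl", "reread"],
    # Apply the changes; programs whose config changed are restarted.
    ["supervisorctl", "update"],
]


def apply_supervisor_config(dry_run: bool = True) -> list:
    """Run (or, in dry-run mode, just print) the reload commands."""
    for cmd in RELOAD_COMMANDS:
        if dry_run or shutil.which(cmd[0]) is None:
            print("would run:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return RELOAD_COMMANDS


if __name__ == "__main__":
    apply_supervisor_config(dry_run=True)
```

Running only "supervisorctl restart <program>" after editing the config
would restart the program with the previously loaded settings, which
matches the symptom described in this thread.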



Re: Moving Airflow Config to Database.

Posted by Sai Phanindhra <ph...@gmail.com>.
Some of the env variables are not reflected in the supervisord
environment (sometimes new variables are not available, sometimes changes to
existing variables are not reflected).

Re: Moving Airflow Config to Database.

Posted by Ash Berlin-Taylor <as...@apache.org>.
> problem with
> this approach is these env variables won't behave correctly when we use
> subshells


Can you explain what you mean by this?

-ash




Re: Moving Airflow Config to Database.

Posted by Sai Phanindhra <ph...@gmail.com>.
Hi Deng,
 I am currently using env variables for a few Airflow config variables that
may differ across machines (Airflow folder, log folder, etc.). The problem with
this approach is that these env variables won't behave correctly when we use
subshells (I faced issues when I added Airflow jobs to supervisord).
  Moving the config to a DB not only addresses this issue, it also makes it
possible to change the config from the UI (most of the time, we can't give box
access to all users). Config in a DB makes it easy to create/update config without
touching code.
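
To make the proposal concrete, here is a minimal sketch of a
database-backed config store. Airflow does not ship anything like this;
the airflow_config table, its columns, and the helper functions are all
hypothetical, and an in-memory SQLite database stands in for a shared
database so the example runs on its own.

```python
import sqlite3


def init_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) config table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS airflow_config ("
        " section TEXT NOT NULL,"
        " option TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (section, option))"
    )


def set_option(conn: sqlite3.Connection, section: str, option: str, value: str) -> None:
    """Insert a setting, or overwrite it if it already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO airflow_config (section, option, value)"
        " VALUES (?, ?, ?)",
        (section, option, value),
    )
    conn.commit()


def get_option(conn: sqlite3.Connection, section: str, option: str, default=None):
    """Read a setting, falling back to `default` when it is not stored."""
    row = conn.execute(
        "SELECT value FROM airflow_config WHERE section = ? AND option = ?",
        (section, option),
    ).fetchone()
    return row[0] if row is not None else default


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a shared database
    init_store(conn)
    set_option(conn, "core", "parallelism", "32")
    print(get_option(conn, "core", "parallelism"))
    print(get_option(conn, "core", "dag_concurrency", default="16"))
```

One limitation of this design is that each machine still needs the
database connection string from a local file or env variable before it
can read anything else, so some per-machine configuration remains.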

Re: Moving Airflow Config to Database.

Posted by Deng Xiaodong <xd...@gmail.com>.
A few solutions that may address your problem:

- Specify your configurations in environment variables, so it becomes much
easier to manage across machines
- Use network-attached storage to save your configuration file and mount it
to all your machines (this can address DAG file sync as well)
- ...

Personally I don’t see the point of moving configuration to a DB.
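
A short sketch of the environment-variable suggestion above. Airflow
reads overrides from variables named AIRFLOW__{SECTION}__{KEY} (note the
double underscores); the helper names and example values below are
hypothetical, and the variables need to be in place before the Airflow
process starts.

```python
import os


def airflow_env_name(section: str, key: str) -> str:
    """Build the env variable name Airflow checks for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def export_overrides(overrides: dict, environ=None):
    """Write {(section, key): value} overrides into an environment mapping."""
    target = os.environ if environ is None else environ
    for (section, key), value in overrides.items():
        target[airflow_env_name(section, key)] = str(value)
    return target


if __name__ == "__main__":
    # Hypothetical per-machine values, written into a plain dict
    # rather than the real process environment.
    env = export_overrides(
        {
            ("core", "dags_folder"): "/opt/airflow/dags",
            ("core", "base_log_folder"): "/var/log/airflow",
        },
        environ={},
    )
    for name, value in env.items():
        print(f"{name}={value}")
```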


XD
