Posted to dev@airflow.apache.org by Kristoffer Sjögren <st...@gmail.com> on 2018/04/07 02:32:20 UTC

About the BaseExecutor interface

Hi

We have been running Airflow for several years and have decided to create an
executor plugin.

After studying the code quite a bit (both code interacting with the
executor and some of the implementations), I still find it hard to distil
concrete requirements for a BaseExecutor implementation.

Specifically, there's not much documentation around high availability and
how task states are concluded in the event of failures and other corner
cases, which I believe is at the core of any scheduling mechanism.

I would be very grateful if anyone could elaborate in more depth on the
principles used and on how the interaction between the executor and tasks
is designed in Airflow.

Cheers,
-Kristoffer

Re: About the BaseExecutor interface

Posted by Kristoffer Sjögren <st...@gmail.com>.
I should probably mention that our implementation will be running multiple
executors/schedulers/workers for the sake of resilience and recovery. We
are more concerned with task state correctness than with performance; for
example, the same task should never run twice by accident.

Re: About the BaseExecutor interface

Posted by Alex Tronchin-James 949-412-7220 <al...@gmail.com>.
Any discoveries from this thread should probably be added to
https://airflow.apache.org/concepts.html#core-ideas

FWIW, I'm not sure if it's still this way on the master branch, but at one
point a couple of useful features (I think accessing task logs from the UI,
and maybe running tasks from the UI) were supported only on the
CeleryExecutor, which made me consider it the de facto BaseExecutor.

Clarifying this implementation will make it much easier to extend Airflow
to other types of executors, which could be preferable to local/celery in
some installations.


