Posted to dev@liminal.apache.org by Zion Rubin <zi...@naturalint.com.INVALID> on 2021/04/18 08:08:20 UTC

[RFC] Executors

https://docs.google.com/document/d/1BuhuM7hFrf9p32BXJDF2mcsiplDEdqP45iKRebMi9uk/edit?usp=sharing

Re: [RFC] Executors

Posted by Zion Rubin <zi...@naturalint.com.INVALID>.
Hi,
Thanks for your comments.

Generally, we would like each executor to be configurable with all the
options that its underlying infrastructure supports.
For example,
K8SExecutor - can have the same properties as KubernetesPodOperator
<https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/operators.html#howto-operator-kubernetespodoperator>.
EMRExecutor - can have the same properties as EmrAddStepsOperator
<https://airflow.apache.org/docs/apache-airflow/1.10.15/_modules/airflow/contrib/operators/emr_add_steps_operator.html>.
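
To make the pass-through idea concrete, here is a minimal sketch in plain
Python. The names ExecutorConfig and operator_kwargs are hypothetical and are
not taken from the RFC or from Liminal's code. The property names (namespace,
image_pull_secrets, tolerations) are KubernetesPodOperator arguments, but the
exact value types they accept differ between provider versions, so treat the
values as placeholders.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ExecutorConfig:
    """Hypothetical executor definition: a name, a type, and free-form properties."""
    name: str
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)


def operator_kwargs(executor: ExecutorConfig, task: Dict[str, Any]) -> Dict[str, Any]:
    """Merge executor-level properties with task-level ones (task values win)."""
    kwargs = dict(executor.properties)  # copy so the executor config is not mutated
    kwargs.update(task)
    return kwargs


# Assumed example values: a private registry secret and a GPU toleration.
k8s = ExecutorConfig(
    name="gpu_k8s",
    type="kubernetes",
    properties={
        "namespace": "ml",
        "image_pull_secrets": "private-registry-secret",
        "tolerations": [
            {"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"}
        ],
    },
)

kwargs = operator_kwargs(
    k8s, {"task_id": "train", "image": "registry.example.com/train:latest"}
)
```

The resulting dictionary is what an executor could hand to the underlying
operator, so anything the operator supports can be set without the executor
having to know about it.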

To create an EMR cluster, you should have an explicit task for that (e.g.
create_cloudformation_stack), but that will be discussed in another thread.
The EMRExecutor assumes the EMR cluster already exists.
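
As a rough illustration of what "the cluster already exists" could mean in
practice, the sketch below builds the arguments for EmrAddStepsOperator from
a known cluster id. The helper names build_spark_step and emr_add_steps_kwargs
and the example values are assumptions made for illustration. The step layout
(Name, ActionOnFailure, HadoopJarStep) follows the EMR step format, and
job_flow_id, aws_conn_id and steps are EmrAddStepsOperator arguments.

```python
from typing import Any, Dict, List


def build_spark_step(name: str, app_path: str, app_args: List[str]) -> Dict[str, Any]:
    """Build one EMR step that runs spark-submit through command-runner.jar."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", app_path, *app_args],
        },
    }


def emr_add_steps_kwargs(
    task_id: str, job_flow_id: str, steps: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Collect operator arguments; the cluster id must come from outside."""
    if not job_flow_id:
        raise ValueError("EMR executor needs the id of an existing cluster")
    return {
        "task_id": task_id,
        "job_flow_id": job_flow_id,
        "aws_conn_id": "aws_default",
        "steps": steps,
    }


# Placeholder cluster id and S3 path, for illustration only.
step = build_spark_step("etl", "s3://example-bucket/jobs/etl.py", ["--date", "2021-04-18"])
emr_kwargs = emr_add_steps_kwargs("run_etl", "j-EXAMPLE", [step])
```

The executor never creates or terminates the cluster here; it only submits
steps to the id it was given.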

I think adding jars/dependencies should be part of the EMR cluster
creation (e.g. create_cloudformation_stack will add a custom bootstrap script).
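
One way this could look, sketched under the assumption that the cluster is
declared as an AWS::EMR::Cluster resource in the CloudFormation template: the
creation task appends a bootstrap action to the resource properties before
the stack is created. The helper with_bootstrap_script and the script path
are hypothetical. BootstrapActions and ScriptBootstrapAction are the
CloudFormation property names.

```python
import copy
from typing import Any, Dict, List, Optional


def with_bootstrap_script(
    cluster_properties: Dict[str, Any],
    name: str,
    script_path: str,
    args: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Return a copy of the cluster properties with one more bootstrap action."""
    props = copy.deepcopy(cluster_properties)  # leave the caller's template untouched
    props.setdefault("BootstrapActions", []).append(
        {
            "Name": name,
            "ScriptBootstrapAction": {"Path": script_path, "Args": list(args or [])},
        }
    )
    return props


# Placeholder properties; a real template needs more (instances, roles, ...).
base = {"Name": "liminal-emr", "ReleaseLabel": "emr-6.2.0"}
updated = with_bootstrap_script(
    base, "install-deps", "s3://example-bucket/bootstrap/install_deps.sh", ["pandas"]
)
```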

I'll add another example in the RFC doc.

Please LMK if you have any further questions.

Thanks,
Zion




On Wed, Apr 21, 2021 at 9:31 AM Assaf Pinhasi <as...@gmail.com>
wrote:

> Generally, I like the design.
>
> What is not clear in my mind, is how "complete" it is in supporting
> real-life executors like Spark and Kubernetes.
>
> For example - for the Kubernetes executor, what exactly are the assumptions
> about kube config? How to set up the docker registry if it's private? What
> if I need to use tolerations since I need a GPU node?
> For Spark - how to create an ephemeral cluster in one task, and then have
> the executor run tasks on that cluster?
> How to add jars and Python dependencies to the cluster?
>
> Each of these executors is a "world" of configurations and options, and
> since the integration with the underlying infra is delicate, it will be good
> to know explicitly who configures what and where (including limitations)
> for it to work in practice.
>
>
> On Sun, Apr 18, 2021 at 11:08 AM Zion Rubin
> <zi...@naturalint.com.invalid> wrote:
>
> >
> > https://docs.google.com/document/d/1BuhuM7hFrf9p32BXJDF2mcsiplDEdqP45iKRebMi9uk/edit?usp=sharing
> >

Re: [RFC] Executors

Posted by Assaf Pinhasi <as...@gmail.com>.
Generally, I like the design.

What is not clear in my mind, is how "complete" it is in supporting
real-life executors like Spark and Kubernetes.

For example - for the Kubernetes executor, what exactly are the assumptions
about kube config? How to set up the docker registry if it's private? What
if I need to use tolerations since I need a GPU node?
For Spark - how to create an ephemeral cluster in one task, and then have
the executor run tasks on that cluster?
How to add jars and Python dependencies to the cluster?

Each of these executors is a "world" of configurations and options, and
since the integration with the underlying infra is delicate, it will be good
to know explicitly who configures what and where (including limitations)
for it to work in practice.


On Sun, Apr 18, 2021 at 11:08 AM Zion Rubin
<zi...@naturalint.com.invalid> wrote:

>
> https://docs.google.com/document/d/1BuhuM7hFrf9p32BXJDF2mcsiplDEdqP45iKRebMi9uk/edit?usp=sharing
>