Posted to dev@airflow.apache.org by Stefano Baghino <st...@teralytics.ch> on 2017/08/03 07:34:13 UTC

Per-task resources with Mesos

Hi everyone,

I'm investigating the possibility for our organization to use Airflow for
workflow management.

Some of our requirements concern resource management, in particular the
ability for the system to run tasks on top of Apache Mesos. Airflow only
partially satisfies this requirement: after looking at the docs and code,
it appears to me (correct me if I'm wrong) that resources are configured
once for the whole system and cannot be specified on a per-task basis. We
need per-task resources because some of our jobs are quite lightweight
while others require a lot of resources, which makes a "one-size-fits-all"
configuration quite wasteful.

I had a look at the AirflowMesosScheduler and MesosExecutor and thought it
would be nice to add this feature, and perhaps I could add it myself. What
I would need is some guidance on how to make it fit the overall system
design: is there an established way to explicitly request resources for a
specific task in the DAG? If not, what would be a good way to introduce
one? And if this turns out to be outside the scope of Airflow, how do you
think I could meet our requirement otherwise?

Thanks in advance.

P.S.: if any of you are on the Mesos mailing list as well, you may know
that I'm having issues making Airflow run on Mesos due to missing Python
packages. I'm not sure whether this mailing list is an appropriate place
for users to get help; if so, I could share that post here as well. Thanks!

-- 
Stefano Baghino | TERALYTICS
*software engineer*

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
phone: +41 43 508 24 57
email: stefano.baghino@teralytics.ch
www.teralytics.net

Company registration number: CH-020.3.037.709-7 | Trade register Canton
Zurich
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
de Vries

This e-mail message contains confidential information which is for the sole
attention and use of the intended recipient. Please notify us at once if
you think that it may not be intended for you and delete it immediately.

Re: Per-task resources with Mesos

Posted by Stefano Baghino <st...@teralytics.ch>.
Hi everybody,

I'd like to come back to this thread with an update on how we made things
work. And of course, thanks for your kind responses.

We found Eremetic, a Mesos framework for running one-off jobs in Docker
containers (in some respects, a one-off counterpart to Marathon).

I've written a small Python wrapper around its HTTP API and an
EremeticOperator that uses this client: it POSTs a job creation request
and, if successful, polls the task status until completion. We then run the
whole thing with the LocalExecutor. It's still a small toy, and I think a
lot can be improved in how it handles partial failures, but so far it
works.
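
In rough terms, the operator does something like the sketch below. This is
a simplification of what we actually run, and the exact Eremetic payload
and status field names are from memory, so double-check them against the
Eremetic docs:

import time

import requests

from airflow.exceptions import AirflowException
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class EremeticOperator(BaseOperator):
    """Runs a one-off Docker job on Mesos through Eremetic's HTTP API."""

    @apply_defaults
    def __init__(self, eremetic_url, image, command,
                 cpu=1.0, mem=512, volumes=None, *args, **kwargs):
        super(EremeticOperator, self).__init__(*args, **kwargs)
        self.eremetic_url = eremetic_url.rstrip('/')
        self.image = image
        self.command = command
        self.cpu = cpu
        self.mem = mem
        self.volumes = volumes or []

    def execute(self, context):
        # POST the job creation request (payload field names approximate)
        response = requests.post(self.eremetic_url + '/task', json={
            'docker_image': self.image,
            'command': self.command,
            'task_cpus': self.cpu,
            'task_mem': self.mem,
            'volumes': self.volumes,
        })
        response.raise_for_status()
        task_id = response.json()
        # Poll the task status until it reaches a terminal Mesos state
        while True:
            task = requests.get('%s/task/%s'
                                % (self.eremetic_url, task_id)).json()
            state = task['status'][-1]['status']
            if state == 'TASK_FINISHED':
                return
            if state in ('TASK_FAILED', 'TASK_KILLED', 'TASK_LOST'):
                raise AirflowException(
                    'Eremetic task %s ended in state %s' % (task_id, state))
            time.sleep(30)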

Here is an excerpt from a small test pipeline I wrote, to give you a sense
of what the whole thing looks like:

from datetime import datetime, timedelta

from airflow import DAG
# our in-house operator; the module path here is illustrative
from eremetic_operator import EremeticOperator

dag = DAG(
    dag_id='some_dag',
    start_date=datetime(2017, 7, 1),
    schedule_interval=timedelta(days=1),
    # default_args are applied to every operator in this DAG
    default_args={
        'eremetic_url': 'https://url_to_eremetic/',
        'image': 'some_registry/some_image_with_spark',
        'volumes': [
            {
                'container_path': '/path/to/relevant/container/stuff',
                'host_path': '/path/to/relevant/host/stuff'
            }
        ]
    })

...

# resources are requested per task: 4 CPUs and 4096 MB of memory
some_spark_job = EremeticOperator(
    task_id='some_spark_job',
    command='/path/to/relevant/container/stuff/bin/spark-submit '
            'run-example SparkPi 10',
    cpu=4,
    mem=4096,
    dag=dag)
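
Note that the entries in default_args (eremetic_url, image, volumes) are
applied as defaults to every operator in the DAG, while cpu and mem are set
per task, which is exactly the per-task resource control we were looking
for.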


Re: Per-task resources with Mesos

Posted by Maxime Beauchemin <ma...@gmail.com>.
At Airbnb, using the Celery executor, we use queues to route tasks to
machines provisioned in specific ways, and we use the cgroup feature to
constrain resource utilization as we fire up tasks. That requires running
the worker service as root, since root is needed to impersonate users and
manage cgroups.
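
To make this concrete, routing a task to a queue is just a keyword argument
on the operator, and a worker subscribes to queues at startup (e.g.
"airflow worker -q heavy" on the beefier machines). A minimal sketch, with
illustrative task and queue names:

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(dag_id='queue_example', start_date=datetime(2017, 8, 1))

# picked up by workers listening on the default queue
light = BashOperator(
    task_id='light',
    bash_command='echo cheap work',
    dag=dag)

# routed only to workers subscribed to the 'heavy' queue, i.e. the
# machines provisioned with more CPU and memory
heavy = BashOperator(
    task_id='heavy',
    bash_command='echo expensive work',
    queue='heavy',
    dag=dag)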

In the context of Mesos things may be different, as you may want to do this
at a different layer. I'd read through the MesosExecutor to see whether it
does any of this already, or to figure out where you might be able to hook
things in.

Note that (from memory) the MesosExecutor relies on pickling to get
serialized DAGs [through the database] to the Mesos slots, and chances are
high that we'll deprecate that feature in the future. By then we'll
probably have a "DagFetcher" abstraction that allows fetching the DAG
definition in another way, on the fly.

Max


Re: Per-task resources with Mesos

Posted by Victor Monteiro <vi...@ubee.in>.
Hi Stefano, have you read about queues? Airflow has this concept, and I
think you can decide which queue a task goes to. By doing this and
integrating it with Mesos, I believe you can set up a Mesos cluster with
more resources that takes tasks from a queue dedicated to heavy
computations.

Maybe this can solve your problem (not sure) :D
