Posted to dev@airflow.apache.org by Kyle Hamlin <ha...@gmail.com> on 2018/04/27 22:32:28 UTC

Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes

Hi all,

If I have a Kubernetes cluster running in DCOC and a Dask cluster running
in that same Kubernetes cluster, is it possible, and does it make sense, to
use the KubernetesExecutor to launch tasks into the Dask cluster (these are
ML jobs with sklearn)? I feel like there is a bit of inception going on here
in my mind, and I just want to make sure a setup like this makes sense.
Thanks in advance for anyone's input!

Re: Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes

Posted by Daniel Imberman <da...@gmail.com>.
@Kyle didn't see your middle message there:

You could certainly have k8s scale a Dask cluster (I think k8s can
autoscale based on CPU and memory usage). In that case, yeah, I'd say making
a DaskOperator would probably be the most straightforward way to go. You
can use almost every operator with the KubernetesExecutor, so you'd still
have the benefit of the executor elsewhere, but for this task you'd
basically be launching a pod just to monitor the Dask task and then die.
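
For illustration, here is a rough sketch of the "submit, then monitor until
done" logic such a DaskOperator might wrap. The names run_on_cluster and
train are hypothetical and not part of Airflow or Dask. The only thing the
sketch relies on is a client whose submit(fn, *args) returns a future with
done(), cancel() and result(), which is the shape dask.distributed.Client
offers. A ThreadPoolExecutor stands in for the Dask client here so the
example runs without a cluster.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_on_cluster(client, fn, *args, poll_interval=0.05, timeout=10.0):
    """Submit fn to the cluster and block until it finishes.

    `client` is anything exposing submit(fn, *args) -> future with
    done(), cancel() and result(); dask.distributed.Client has this shape.
    """
    future = client.submit(fn, *args)
    deadline = time.monotonic() + timeout
    # Poll instead of blocking outright, so a real operator could log
    # progress or give up after a deadline.
    while not future.done():
        if time.monotonic() > deadline:
            future.cancel()
            raise TimeoutError("task did not finish within %.1fs" % timeout)
        time.sleep(poll_interval)
    # result() re-raises any exception raised by fn, which is what would
    # make the surrounding Airflow task fail.
    return future.result()


def train(n):
    # Hypothetical stand-in for an sklearn fit that returns a score.
    return sum(i * i for i in range(n))


# Stand-in for a Dask client connected to the scheduler (assumption).
with ThreadPoolExecutor(max_workers=1) as client:
    score = run_on_cluster(client, train, 4)
```

In an actual operator this call would sit inside execute(), and the pod
launched by the executor would do nothing but run this loop.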


Re: Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes

Posted by Daniel Imberman <da...@gmail.com>.
@Kyle so what I'm trying to understand is why you would want to run a
static Dask cluster when you can launch Dask containers/pods using the
executor?

Seems like there are a few possible options:

1. Add the Dask pip modules to the Airflow Docker image and reference that
image in the executor_config whenever you need to launch a Dask task. This
would allow you to launch Dask jobs whenever you want in an elastic manner.
2. If there are benefits to keeping the static Dask cluster, then writing a
DaskOperator would be pretty straightforward. You could use the
DaskExecutor as a scaffold and basically write an operator that sends a
request to the Dask cluster and then monitors the job until the task is
finished. You could also check out the KubernetesPodOperator to see how
that would look.
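
A rough sketch of option 1, assuming the dictionary layout that the
KubernetesExecutor accepted for executor_config in the Airflow 1.10 era
(later releases changed this format, so check the docs for your version).
The helper name dask_task_executor_config and the image name are
hypothetical.

```python
def dask_task_executor_config(image, request_cpu=None, request_memory=None):
    """Build a per-task executor_config for the KubernetesExecutor.

    Only keys that are actually set are included, so unspecified
    resources fall back to the executor's defaults.
    """
    settings = {"image": image}
    if request_cpu is not None:
        settings["request_cpu"] = request_cpu
    if request_memory is not None:
        settings["request_memory"] = request_memory
    return {"KubernetesExecutor": settings}


config = dask_task_executor_config(
    "my-registry/airflow-dask:latest",  # hypothetical image with Dask installed
    request_cpu="2",
    request_memory="4Gi",
)
```

The resulting dict would be passed as the executor_config argument of
whichever operator runs the Dask code, so only that task's pod uses the
heavier image.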




Re: Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes

Posted by Kyle Hamlin <ha...@gmail.com>.
Hi Fokko,

So it's always been my intention to use the KubernetesExecutor. What I'm
trying to figure out is how to pair the KubernetesExecutor with a
Dask cluster, since Dask clusters have many optimizations for ML-type tasks.



-- 
Kyle Hamlin

Re: Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes

Posted by "Driesprong, Fokko" <fo...@driesprong.frl>.
Also, one of the main benefits of the KubernetesExecutor is having a Docker
image that contains all the dependencies that you need for your job.
Personally, I would switch to the KubernetesExecutor once it leaves the
experimental stage.

Cheers, Fokko


Re: Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes

Posted by Kyle Hamlin <ha...@gmail.com>.
I don't have a Dask cluster yet, but I'm interested in taking advantage of
it for ML tasks. My use case would be bursting a lot of ML jobs into a
Dask cluster all at once.
From what I understand, Dask clusters utilize caching to help speed up jobs,
so I don't know if it makes sense to launch a Dask cluster for every single
ML job. Conceivably, I could just have a single Dask worker running 24/7,
and when it's time to burst, k8s could autoscale the Dask workers as more ML
jobs are launched into the Dask cluster?
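
To make the burst pattern concrete, here is a small sketch of fanning many
jobs out to one long-lived cluster and collecting the results. The names
burst and fit_and_score are hypothetical, and a ThreadPoolExecutor stands in
for a Dask client, on the assumption that the real client exposes the same
submit(fn, arg) -> future interface. How many workers serve the burst would
be decided by whatever autoscaling sits behind the cluster, not by this
code.

```python
from concurrent.futures import ThreadPoolExecutor


def burst(client, fn, param_grid):
    """Submit one job per parameter set, then collect results in order."""
    # All jobs are queued up front; the cluster decides how many run at once.
    futures = [client.submit(fn, params) for params in param_grid]
    return [future.result() for future in futures]


def fit_and_score(params):
    # Hypothetical stand-in for fitting and scoring an sklearn model.
    return params["alpha"] * 10


grid = [{"alpha": a} for a in (1, 2, 3)]

# Stand-in for a client connected to the shared Dask cluster (assumption).
with ThreadPoolExecutor(max_workers=2) as client:
    scores = burst(client, fit_and_score, grid)
```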



-- 
Kyle Hamlin

Re: Use KubernetesExecutor to launch tasks into a Dask cluster in Kubernetes

Posted by Daniel Imberman <da...@gmail.com>.
Hi Kyle,

So you have a static Dask cluster running in your k8s cluster? Is there any
reason you wouldn't just launch the Dask cluster for the job you're running
and then tear it down? I feel like with k8s the elasticity is one of the
main benefits.
