Posted to dev@aurora.apache.org by Mauricio Garavaglia <ma...@gmail.com> on 2016/01/12 04:46:24 UTC

Parameterize each Job Instance.

Hi guys,

We are using the docker rbd volume plugin
<https://ceph.com/planet/getting-started-with-the-docker-rbd-volume-plugin> to
provide persistent storage to the Aurora jobs that run in the containers.
Something like:

p = [Parameter(name='volume', value='my-ceph-volume:/foo'), ...]
jobs = [Service(..., container = Container(docker = Docker(..., parameters = p)))]

But in the case of jobs with multiple instances it's required to start each
container using different volumes, in our case different ceph images. This
could be achieved by deploying, for example, 10 instances and then updating
each one independently to use the appropriate volume. Of course this is
quite inconvenient and error-prone, and it adds a lot of logic and state
outside Aurora.

We were wondering whether it would make sense to have a way to parameterize
the task instances, similar to the way port mapping is done, for example.
In the job definition we would have something like:

params = [
  Parameter(name = 'volume',
            value = 'service-{{instanceParameters.volume}}:/foo')
]
...
jobs = [
  Service(
    name = 'logstash',
    ...
    instanceParameters = { "volume" : ["foo", "bar", "zaa"]},
    instances = 3,
    container = Container(
      docker = Docker(
        image = 'image',
        parameters = params
      )
    )
  )
]


Something like that would create 3 instances of the task, each one
running in a container that uses the volume foo, bar, or zaa respectively.
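
To make the intended expansion concrete, here is a rough sketch of the
substitution we have in mind (plain Python, illustrative only; the dict
shapes are not any existing Aurora type):

# Illustrative only: the per-instance substitution we'd expect for instances 0..2.
volumes = ['foo', 'bar', 'zaa']
per_instance_params = {
    instance: [{'name': 'volume', 'value': 'service-%s:/foo' % volume}]
    for instance, volume in enumerate(volumes)
}
print(per_instance_params)
# {0: [{'name': 'volume', 'value': 'service-foo:/foo'}],
#  1: [{'name': 'volume', 'value': 'service-bar:/foo'}],
#  2: [{'name': 'volume', 'value': 'service-zaa:/foo'}]}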

Does it make sense? I'd be glad to work on it, but I want to validate the
idea with you first and hear your comments on the API/implementation.

Thanks


Mauricio

Re: Parameterize each Job Instance.

Posted by Bill Farner <wf...@apache.org>.
Semi off-the-cuff thought, but one option is to re-define the instance ID
-> TaskConfig association in JobConfiguration:

struct JobConfiguration {
  ...
  6: TaskConfig taskConfig
  /**
   * The number of instances in the job. Generated instance IDs for tasks
   * will be in the range [0, instances).
   */
  8: i32 instanceCount
}

Some prior art you could draw from is JobUpdateInstructions, which models a
heterogeneous set of tasks (while supporting normalization):

struct JobUpdateInstructions {
  /** Actual InstanceId -> TaskConfig mapping when the update was requested. */
  1: set<InstanceTaskConfig> initialState

  /** Desired configuration when the update completes. */
  2: InstanceTaskConfig desiredState
  ...
}

struct InstanceTaskConfig {
  /** A TaskConfig associated with instances. */
  1: TaskConfig task

  /** Instances associated with the TaskConfig. */
  2: set<Range> instances
}

So you could imagine JobConfiguration containing a set<InstanceTaskConfig> as
the eventual replacement for the taskConfig and instanceCount fields.

If we proceed this way, it suggests that we should change
JobUpdateInstructions.desiredState to also be set<InstanceTaskConfig> for
parity.
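
To sketch how the volume use case above could map onto that shape (plain
Python standing in for the thrift-generated types; field names here are
illustrative, not the real api.thrift fields):

# Illustrative only: per-instance docker parameters normalized into
# (task config, instance range) pairs - the role set<InstanceTaskConfig> would play.
base_task = {'name': 'logstash', 'docker_image': 'image'}
instance_volumes = ['foo', 'bar', 'zaa']

instance_task_configs = []
for instance_id, volume in enumerate(instance_volumes):
    task = dict(base_task,
                docker_parameters=[{'name': 'volume', 'value': 'service-%s:/foo' % volume}])
    # Each distinct config carries the instance range it applies to; instances that
    # share an identical config could collapse into one entry with a wider range.
    instance_task_configs.append({'task': task, 'instances': [(instance_id, instance_id)]})

print(len(instance_task_configs))  # 3 entries, one per distinct per-instance config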


On Tue, Jan 12, 2016 at 8:10 PM, Mauricio Garavaglia <mauriciogaravaglia@gmail.com> wrote:

> Thanks for the input guys! I was wondering if you have any thoughts about
> how the API should look like.
>

Re: Parameterize each Job Instance.

Posted by Mauricio Garavaglia <ma...@gmail.com>.
Thanks for the input guys! I was wondering if you have any thoughts about
what the API should look like.


Re: Parameterize each Job Instance.

Posted by John Sirois <jo...@gmail.com>.
On Mon, Jan 11, 2016 at 11:02 PM, John Sirois <jo...@gmail.com> wrote:

>
>
> On Mon, Jan 11, 2016 at 11:00 PM, Bill Farner <wf...@apache.org> wrote:
>
>> In the log, tasks are denormalized anyhow:
>>
>> https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L43-L45
>
>
> Right - but now we'd be making that denormalization systemically
> in-effective.  IIUC its values-equals based denorm,  I'd think we'd need
> diffing in a cluster using, for example, ceph + docker ~exclusively.
>

I was being generally confusing here. To be more precise, the issue I'm
concerned about is the newish log snapshot deduping feature [1] being
foiled by all TaskConfigs for a job's tasks now being unique via
`ExecutorConfig.data` [2].
This is an optimization concern only, and IIUC it only becomes a concern
in very large clusters, as evidenced by the fact that the log dedup feature
came late in Twitter's use of Aurora.

This could definitely be worked out.

[1]
https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L196-L208
[2]
https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/api.thrift#L167
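
As a rough illustration of what gets lost (plain Python, not the actual
storage code): the snapshot dedup relies on value equality, so identical
TaskConfigs collapse to a single stored copy, while per-instance values make
every config unique again:

# Sketch of value-equality deduplication (illustrative only).
def dedup(task_configs):
    unique = []    # each distinct config stored once
    index_of = {}  # hashable form of a config -> position in `unique`
    refs = []      # one small index per task instead of a full config
    for config in task_configs:
        key = tuple(sorted(config.items()))
        if key not in index_of:
            index_of[key] = len(unique)
            unique.append(config)
        refs.append(index_of[key])
    return unique, refs

# 1000 identical instances compress to a single stored config...
same = [{'image': 'image', 'volume': 'ceph:/foo'}] * 1000
print(len(dedup(same)[0]))  # 1

# ...but giving each instance its own volume makes every config unique again.
per_instance = [{'image': 'image', 'volume': 'ceph-%d:/foo' % i} for i in range(1000)]
print(len(dedup(per_instance)[0]))  # 1000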


Re: Parameterize each Job Instance.

Posted by John Sirois <jo...@gmail.com>.
On Mon, Jan 11, 2016 at 11:00 PM, Bill Farner <wf...@apache.org> wrote:

> In the log, tasks are denormalized anyhow:
>
> https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L43-L45


Right - but now we'd be making that denormalization systemically
ineffective. IIUC it's value-equality based denorm; I'd think we'd need
diffing in a cluster using, for example, ceph + docker ~exclusively.



Re: Parameterize each Job Instance.

Posted by Bill Farner <wf...@apache.org>.
In the log, tasks are denormalized anyhow:
https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L43-L45



On Mon, Jan 11, 2016 at 9:54 PM, John Sirois <jo...@conductant.com> wrote:

> On Mon, Jan 11, 2016 at 10:40 PM, Bill Farner <wf...@apache.org> wrote:
>
> > Funny, that's actually how the scheduler API originally worked.  I think
> > this is worth exploring, and would indeed completely sidestep the
> paradigm
> > shift i mentioned above.
> >
>
> I think the crux might be handling a structural diff of the thrift for the
> Tasks to keep the log dedupe optimizations in-play for the most part; ie
> store Task0 in-full, and Task1-N as thrift struct diffs against 0.  Maybe
> something simpler like a binary diff would be enough too.
>

Re: Parameterize each Job Instance.

Posted by John Sirois <jo...@conductant.com>.
On Mon, Jan 11, 2016 at 10:40 PM, Bill Farner <wf...@apache.org> wrote:

> Funny, that's actually how the scheduler API originally worked.  I think
> this is worth exploring, and would indeed completely sidestep the paradigm
> shift i mentioned above.
>

I think the crux might be handling a structural diff of the thrift for the
Tasks to keep the log dedupe optimizations in play for the most part; i.e.
store Task0 in full, and Task1-N as thrift struct diffs against 0. Maybe
something simpler like a binary diff would be enough too.
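
A toy version of that idea (plain Python dicts rather than thrift structs,
illustrative only):

# Sketch: store task 0 in full, tasks 1..N as field-level diffs against it.
def diff_against_base(base, config):
    return {k: v for k, v in config.items() if base.get(k) != v}

def apply_diff(base, delta):
    merged = dict(base)
    merged.update(delta)
    return merged

configs = [{'image': 'image', 'cpu': 1.0, 'volume': 'ceph-%d:/foo' % i} for i in range(3)]
base = configs[0]
deltas = [diff_against_base(base, c) for c in configs[1:]]
# Only the per-instance volume needs to be stored for instances 1..N.
assert all(apply_diff(base, d) == c for d, c in zip(deltas, configs[1:]))
print(deltas)  # [{'volume': 'ceph-1:/foo'}, {'volume': 'ceph-2:/foo'}]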





-- 
John Sirois
303-512-3301

Re: Parameterize each Job Instance.

Posted by Bill Farner <wf...@apache.org>.
Funny, that's actually how the scheduler API originally worked. I think
this is worth exploring, and it would indeed completely sidestep the paradigm
shift I mentioned above.

On Mon, Jan 11, 2016 at 9:20 PM, John Sirois <jo...@conductant.com> wrote:

> On Mon, Jan 11, 2016 at 10:10 PM, Bill Farner <wf...@apache.org> wrote:
>
> > There's a chicken and egg problem though. That variable will only be
> filled
> > in on the executor, when we're already running in the docker environment.
> > In this case, the parameter is used to *define* the docker environment.
> >
>
> So, from a naive standpoint, the fact that Job is exploded into Tasks by
> the scheduler but that explosion is not exposed to the client seems to be
> the impedance mismatch here.
> I have not thought through this much at all, but say that fundamentally the
> scheduler took a Job that was a list of Tasks - possibly heterogeneous.
> The current Job expands to homogeneous Tasks could be just a standard
> convenience.
>
> In that sort of world, the customized params could be injected client side
> to form a list of heterogeneous tasks and the Scheduler could stay dumb -
> at least wrt Task parameterization.
>

Re: Parameterize each Job Instance.

Posted by John Sirois <jo...@conductant.com>.
On Mon, Jan 11, 2016 at 10:10 PM, Bill Farner <wf...@apache.org> wrote:

> There's a chicken and egg problem though. That variable will only be filled
> in on the executor, when we're already running in the docker environment.
> In this case, the parameter is used to *define* the docker environment.
>

So, from a naive standpoint, the fact that a Job is exploded into Tasks by
the scheduler, but that explosion is not exposed to the client, seems to be
the impedance mismatch here.
I have not thought through this much at all, but say that, fundamentally, the
scheduler took a Job that was a list of Tasks - possibly heterogeneous.
The current Job-expands-to-homogeneous-Tasks behavior could then be just a
standard convenience.

In that sort of world, the customized params could be injected client-side
to form a list of heterogeneous tasks, and the Scheduler could stay dumb -
at least wrt Task parameterization.
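
A minimal sketch of that client-side expansion (hypothetical helper, plain
Python string formatting in place of pystachio, not an existing Aurora API):

# Hypothetical client-side expansion; names and shapes are illustrative only.
def expand_job(base_task, instance_parameters, instances):
    tasks = []
    for i in range(instances):
        bindings = {name: values[i] for name, values in instance_parameters.items()}
        task = dict(base_task)
        # Substitute per-instance values into the docker parameters before submission.
        task['docker_parameters'] = [
            {'name': p['name'], 'value': p['value'].format(**bindings)}
            for p in base_task['docker_parameters']
        ]
        tasks.append(task)
    return tasks  # the scheduler would simply be handed this heterogeneous list

job = {
    'name': 'logstash',
    'docker_parameters': [{'name': 'volume', 'value': 'service-{volume}:/foo'}],
}
print(expand_job(job, {'volume': ['foo', 'bar', 'zaa']}, instances=3))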





-- 
John Sirois
303-512-3301

Re: Parameterize each Job Instance.

Posted by Bill Farner <wf...@apache.org>.
There's a chicken and egg problem though. That variable will only be filled
in on the executor, when we're already running in the docker environment.
In this case, the parameter is used to *define* the docker environment.

On Mon, Jan 11, 2016 at 9:07 PM, benley@gmail.com <be...@gmail.com> wrote:

> As a starting point, you might be able to cook up something involving
> {{mesos.instance}} as a lookup key to a pystachio list.  You do have a
> unique integer task number per instance to work with.
>
> cf.
>
> http://aurora.apache.org/documentation/latest/configuration-reference/#template-namespaces
>
> On Mon, Jan 11, 2016 at 8:05 PM Bill Farner <wf...@apache.org> wrote:
>
> > I agree that this appears necessary when parameters are needed to define
> > the runtime environment of the task (in this case, setting up the docker
> > container).
> >
> > What's particularly interesting here is that this would call for the
> > scheduler to fill in the parameter values prior to launching each task.
> > Using pystachio variables for this is certainly the most natural in the
> > DSL, but becomes a paradigm shift since the scheduler is currently
> ignorant
> > of pystachio.
> >
> > Possibly only worth mentioning for shock value, but in the DSL this
> starts
> > to look like lambdas pretty quickly.
> >
> > On Mon, Jan 11, 2016 at 7:46 PM, Mauricio Garavaglia <
> > mauriciogaravaglia@gmail.com> wrote:
> >
> > > Hi guys,
> > >
> > > We are using the docker rbd volume plugin
> > > <
> >
> https://ceph.com/planet/getting-started-with-the-docker-rbd-volume-plugin>
> > > to
> > > provide persistent storage to the aurora jobs that runs in the
> > containers.
> > > Something like:
> > >
> > > p = [Parameter(name='volume', value='my-ceph-volume:/foo'), ...]
> > > jobs = [ Service(..., container = Container(docker = Docker(...,
> > parameters
> > > = p)))]
> > >
> > > But in the case of jobs with multiple instances it's required to start
> > each
> > > container using different volumes, in our case different ceph images.
> > This
> > > could be achieved by deploying, for example, 10 instances and then
> update
> > > each one independently to use the appropiate volume. Of course this is
> > > quite inconvenient, error prone, and adds a lot of logic and state
> > outside
> > > aurora.
> > >
> > > We where thinking if it would make sense to have a way to parameterize
> > the
> > > task instances, in a similar way that is done with portmapping for
> > example.
> > > In the job definition have something like
> > >
> > > params = [
> > >   Parameter( name='volume',
> > > value='service-{{instanceParameters.volume}}:/foo' )
> > > ]
> > > ...
> > > jobs = [
> > >   Service(
> > >     name = 'logstash',
> > >     ...
> > >     instanceParameters = { "volume" : ["foo", "bar", "zaa"]},
> > >     instances = 3,
> > >     container = Container(
> > >       docker = Docker(
> > >         image = 'image',
> > >         parameters = params
> > >       )
> > >     )
> > >   )
> > > ]
> > >
> > >
> > > Something like that, it would create 3 instances of the tasks, each one
> > > running in a container that uses the volumes foo, bar, and zaa.
> > >
> > > Does it make sense? I'd be glad to work on it but I want to validate
> the
> > > idea with you first and hear comments about the api/implementation.
> > >
> > > Thanks
> > >
> > >
> > > Mauricio
> > >
> >
>

Re: Parameterize each Job Instance.

Posted by "benley@gmail.com" <be...@gmail.com>.
As a starting point, you might be able to cook up something involving
{{mesos.instance}} as a lookup key into a pystachio list. You do have a
unique integer task number per instance to work with.

cf.
http://aurora.apache.org/documentation/latest/configuration-reference/#template-namespaces
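
Roughly what I have in mind, untested - I'm assuming pystachio will resolve
the nested {{mesos.instance}} binding inside the index expression, and note
that the value is only bound by the executor at runtime, so it can feed
things the task does inside the container but not the Docker parameters of
the container itself:

volumes = ['foo', 'bar', 'zaa']

run = Process(
  name = 'run',
  # {{mesos.instance}} is filled in by the executor, so this lookup only
  # helps for values consumed by the running task, e.g. a mount the task
  # performs itself.
  cmdline = 'run_service --volume=service-{{volumes[{{mesos.instance}}]}}:/foo'
)

task = Task(
  name = 'logstash',
  processes = [run],
  resources = Resources(cpu = 1, ram = 128*MB, disk = 128*MB)
)

jobs = [
  Service(
    cluster = 'devcluster',   # placeholder values for illustration
    role = 'www-data',
    environment = 'prod',
    name = 'logstash',
    instances = 3,
    task = task
  ).bind(volumes = volumes)
]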

On Mon, Jan 11, 2016 at 8:05 PM Bill Farner <wf...@apache.org> wrote:

> I agree that this appears necessary when parameters are needed to define
> the runtime environment of the task (in this case, setting up the docker
> container).
>
> What's particularly interesting here is that this would call for the
> scheduler to fill in the parameter values prior to launching each task.
> Using pystachio variables for this is certainly the most natural in the
> DSL, but becomes a paradigm shift since the scheduler is currently ignorant
> of pystachio.
>
> Possibly only worth mentioning for shock value, but in the DSL this starts
> to look like lambdas pretty quickly.
>
> On Mon, Jan 11, 2016 at 7:46 PM, Mauricio Garavaglia <
> mauriciogaravaglia@gmail.com> wrote:
>
> > Hi guys,
> >
> > We are using the docker rbd volume plugin
> > <
> https://ceph.com/planet/getting-started-with-the-docker-rbd-volume-plugin>
> > to
> > provide persistent storage to the aurora jobs that runs in the
> containers.
> > Something like:
> >
> > p = [Parameter(name='volume', value='my-ceph-volume:/foo'), ...]
> > jobs = [ Service(..., container = Container(docker = Docker(...,
> parameters
> > = p)))]
> >
> > But in the case of jobs with multiple instances it's required to start
> each
> > container using different volumes, in our case different ceph images.
> This
> > could be achieved by deploying, for example, 10 instances and then update
> > each one independently to use the appropiate volume. Of course this is
> > quite inconvenient, error prone, and adds a lot of logic and state
> outside
> > aurora.
> >
> > We where thinking if it would make sense to have a way to parameterize
> the
> > task instances, in a similar way that is done with portmapping for
> example.
> > In the job definition have something like
> >
> > params = [
> >   Parameter( name='volume',
> > value='service-{{instanceParameters.volume}}:/foo' )
> > ]
> > ...
> > jobs = [
> >   Service(
> >     name = 'logstash',
> >     ...
> >     instanceParameters = { "volume" : ["foo", "bar", "zaa"]},
> >     instances = 3,
> >     container = Container(
> >       docker = Docker(
> >         image = 'image',
> >         parameters = params
> >       )
> >     )
> >   )
> > ]
> >
> >
> > Something like that, it would create 3 instances of the tasks, each one
> > running in a container that uses the volumes foo, bar, and zaa.
> >
> > Does it make sense? I'd be glad to work on it but I want to validate the
> > idea with you first and hear comments about the api/implementation.
> >
> > Thanks
> >
> >
> > Mauricio
> >
>

Re: Parameterize each Job Instance.

Posted by Bill Farner <wf...@apache.org>.
I agree that this appears necessary when parameters are needed to define
the runtime environment of the task (in this case, setting up the docker
container).

What's particularly interesting here is that this would call for the
scheduler to fill in the parameter values prior to launching each task.
Using pystachio variables for this is certainly the most natural in the
DSL, but becomes a paradigm shift since the scheduler is currently ignorant
of pystachio.

Possibly only worth mentioning for shock value, but in the DSL this starts
to look like lambdas pretty quickly.
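
To make that concrete, here is a very rough sketch of the substitution the
scheduler would have to perform per instance before handing the task to
Mesos. It is in Python purely for brevity - the scheduler is Java, none of
this exists today, and the placeholder just reuses Mauricio's proposed
{{instanceParameters.volume}} syntax:

# Hypothetical scheduler-side binding, purely illustrative.
def bind_docker_parameters(parameters, instance_id, instance_values):
    """Replace the per-instance placeholder in docker parameter values with
    the value chosen for this instance id."""
    value = instance_values[instance_id]
    return [
        {'name': p['name'],
         'value': p['value'].replace('{{instanceParameters.volume}}', value)}
        for p in parameters
    ]

# For instance 1 of 3, '{{instanceParameters.volume}}' becomes 'bar' before
# the docker container is defined and launched.
bound = bind_docker_parameters(
    [{'name': 'volume', 'value': 'service-{{instanceParameters.volume}}:/foo'}],
    instance_id = 1,
    instance_values = ['foo', 'bar', 'zaa'])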

On Mon, Jan 11, 2016 at 7:46 PM, Mauricio Garavaglia <
mauriciogaravaglia@gmail.com> wrote:

> Hi guys,
>
> We are using the docker rbd volume plugin
> <https://ceph.com/planet/getting-started-with-the-docker-rbd-volume-plugin>
> to
> provide persistent storage to the aurora jobs that runs in the containers.
> Something like:
>
> p = [Parameter(name='volume', value='my-ceph-volume:/foo'), ...]
> jobs = [ Service(..., container = Container(docker = Docker(..., parameters
> = p)))]
>
> But in the case of jobs with multiple instances it's required to start each
> container using different volumes, in our case different ceph images. This
> could be achieved by deploying, for example, 10 instances and then update
> each one independently to use the appropiate volume. Of course this is
> quite inconvenient, error prone, and adds a lot of logic and state outside
> aurora.
>
> We where thinking if it would make sense to have a way to parameterize the
> task instances, in a similar way that is done with portmapping for example.
> In the job definition have something like
>
> params = [
>   Parameter( name='volume',
> value='service-{{instanceParameters.volume}}:/foo' )
> ]
> ...
> jobs = [
>   Service(
>     name = 'logstash',
>     ...
>     instanceParameters = { "volume" : ["foo", "bar", "zaa"]},
>     instances = 3,
>     container = Container(
>       docker = Docker(
>         image = 'image',
>         parameters = params
>       )
>     )
>   )
> ]
>
>
> Something like that, it would create 3 instances of the tasks, each one
> running in a container that uses the volumes foo, bar, and zaa.
>
> Does it make sense? I'd be glad to work on it but I want to validate the
> idea with you first and hear comments about the api/implementation.
>
> Thanks
>
>
> Mauricio
>