Posted to dev@aurora.apache.org by Igor Morozov <ig...@gmail.com> on 2016/06/16 20:28:47 UTC

Few things we would like to support in aurora scheduler

Hi aurora people,

I would like to start a discussion around a few things we would like to see
supported in the aurora scheduler. It is based on our experience of integrating
aurora into Uber's infrastructure, and I believe all the items I'm going to
talk about will benefit the community and people running aurora clusters.

1. We support multiple aurora clusters in different failure domains and we
run services in those domains. The upgrade workflow for those services
includes rolling out the same version of a service's software to all aurora
clusters concurrently while monitoring health status and other service
vitals, such as error logs, service stats, and the health of
downstream/upstream services. That means we occasionally need to manually
trigger a rollback if things go south, and roll back all the update jobs in
all aurora clusters for that particular service. So here are the problems we
discovered so far with this approach:

       - We don't have an easy way to assign a common unique identifier to
all the JobUpdates in the different aurora clusters in order to reconcile them
later into a single meta update job, so to speak. Instead we need to
generate that ID ourselves and keep it in every aurora JobUpdate's
metadata (JobUpdateRequest.taskConfig). Then, in order to get the status of the
upgrade workflow running in different data centers, we have to query all
recent jobs and, based on their metadata content, try to filter in the ones
that we think belong to the currently running upgrade for the service.

We propose to change:
struct JobUpdateRequest {
  /** Desired TaskConfig to apply. */
  1: TaskConfig taskConfig

  /** Desired number of instances of the task config. */
  2: i32 instanceCount

  /** Update settings and limits. */
  3: JobUpdateSettings settings

  /** Optional job update key's id; if not specified, aurora will generate one. */
  4: optional string id
}

There is potentially another, much more involved solution of supporting
user-defined metadata, mentioned in this ticket:
https://issues.apache.org/jira/browse/AURORA-1711
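
To make the intent concrete, here is a rough sketch of how our cross-cluster
rollout tooling could stamp the same id into every cluster's update
(make_client, new_task_config, instance_count and update_settings are
placeholders for our own tooling, not Aurora APIs; the "id" field is the one
proposed above):

# Sketch only: assumes the proposed optional "id" field exists and that
# make_client(cluster) returns a Thrift client for that cluster's scheduler.
import uuid

rollout_id = str(uuid.uuid4())  # one id shared by the JobUpdate in every cluster

for cluster in ("dc1", "dc2", "dc3"):  # example cluster names
    client = make_client(cluster)
    request = JobUpdateRequest(
        taskConfig=new_task_config,
        instanceCount=instance_count,
        settings=update_settings,
        id=rollout_id,  # proposed field: identical across clusters
    )
    client.startJobUpdate(request, "rollout %s" % rollout_id)

With that in place, reconciling the per-cluster updates later is just a lookup
by the shared id instead of scanning metadata.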


    -  All that brings us to the second problem we had to deal with during
the upgrade:
We don't have a good way to manually trigger a job update rollback in
aurora. The use case is again the same: while running multiple update jobs
in different aurora clusters, we have a real production requirement to start
rolling back update jobs if things are misbehaving, and the nature of the
misbehavior can potentially be very complex. Currently we abort the job
update and start a new one that essentially rolls the cluster forward to a
previously run version of the software.

We propose a new convenience API to roll back a running or complete
JobUpdate:

  /** Rollback job update. */
  Response rollbackJobUpdate(
      /** The update to rollback. */
      1: JobUpdateKey key,
      /** A user-specified message to include with the induced job update
          state change. */
      3: string message)
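
As a sketch of how our deployment workflow might use this
(rollout_is_healthy, clients, update_keys and rollout_id below are
placeholders for our own orchestration code, not Aurora APIs):

# Sketch only: roll back the same logical rollout in every cluster once our
# monitoring flags a problem. clients maps cluster -> Thrift client and
# update_keys maps cluster -> the JobUpdateKey that was started there.
if not rollout_is_healthy():
    for cluster, client in clients.items():
        client.rollbackJobUpdate(
            update_keys[cluster],
            "rolling back rollout %s: failed health checks" % rollout_id)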

2. The next problem is related to the way we collect service cluster
status. I couldn't find a way to quickly get the latest statuses for all
instances/shards of a job in one query. Instead we query all task statuses
for a job, then manually iterate through all the statuses and filter out the
latest one per instance id. For services with lots of churn on task
statuses, that means huge blobs of thrift transferred every time we
issue a query. I was thinking of adding something along these lines:
struct TaskQuery {
  // TODO(maxim): Remove in 0.7.0. (AURORA-749)
  8: Identity owner
  14: string role
  9: string environment
  2: string jobName
  4: set<string> taskIds
  5: set<ScheduleStatus> statuses
  7: set<i32> instanceIds
  10: set<string> slaveHosts
  11: set<JobKey> jobKeys
  12: i32 offset
  13: i32 limit
  15: i32 limit_per_instance
}

but I'm less certain about the API here, so any help would be welcome.
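
For context, this is roughly what we do on the client side today, and what
limit_per_instance would let the scheduler do for us (sketch only; tasks
stands for the list of ScheduledTask structs returned by getTasksStatus for
the job):

# Sketch of the current client-side reduction: keep only the most recent task
# per instance id, using the last TaskEvent timestamp as "most recent".
latest_by_instance = {}
for task in tasks:
    instance_id = task.assignedTask.instanceId
    last_event_ts = task.taskEvents[-1].timestamp if task.taskEvents else 0
    kept = latest_by_instance.get(instance_id)
    if kept is None or last_event_ts > kept[0]:
        latest_by_instance[instance_id] = (last_event_ts, task)

With a server-side limit_per_instance=1 this whole loop, and most of the
transferred payload, would go away.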

All the changes we propose would be backward compatible.

-- 
-Igor

Re: Few things we would like to support in aurora scheduler

Posted by Igor Morozov <ig...@gmail.com>.
>
>
>> +1 to the idea, but there is ambiguity in what rollback means when you pass
>> a JobUpdateKey.
>>
>> Example:
>>
>> *undoJobUpdate* (it would replay the previousState instructions from the
>> given job update)
>> *rollbackToJobUpdate* (you'd pass the JobUpdateKey and it would replay the
>> instructions from that job update)
>>
>> Correct me if I'm missing something here, but if we assume the semantics of
>> startJobUpdate are defined as moving job instances from initial state A to
>> desired state B, then rollbackJobUpdate would be the complete inverse of
>> that operation.
>
>
Sorry, so that would mean the "undoJobUpdate" semantic.




-- 
-Igor

Re: Few things we would like to support in aurora scheduler

Posted by Igor Morozov <ig...@gmail.com>.
On Thu, Jun 16, 2016 at 7:26 PM, David McLaughlin <dm...@apache.org>
wrote:

> On Thu, Jun 16, 2016 at 1:28 PM, Igor Morozov <ig...@gmail.com> wrote:
>
> > [...]
> > There is potentially another, much more involved solution of supporting
> > user-defined metadata, mentioned in this ticket:
> > https://issues.apache.org/jira/browse/AURORA-1711
>
>
>
> I actually think the linked ticket is less involved? It has no impact on
> logic, etc. So the work involved is just updating the Thrift object and
> then writing in the metadata to the storage layer. But I'm fine with either
> (or both!) approaches.
>
I agree that both approaches would work. I guess the more important part for
us is the ability to query JobUpdates by their metadata or a combination of
metadata fields.



>
>
> >
> >
> >
> > [...]
> >
> > We propose a new convenience API to roll back a running or complete
> > JobUpdate: [...]
>
>
>
>
> +1 to the idea, but there is ambiguity in what rollback means when you pass
> a JobUpdateKey.
>
> Example:
>
> *undoJobUpdate* (it would replay the previousState instructions from the
> given job update)
> *rollbackToJobUpdate* (you'd pass the JobUpdateKey and it would replay the
> instructions from that job update)
>
> Correct me if I'm missing something here, but if we assume the semantics of
> startJobUpdate are defined as moving job instances from initial state A to
> desired state B, then rollbackJobUpdate would be the complete inverse of
> that operation.


>



-- 
-Igor

Re: Few things we would like to support in aurora scheduler

Posted by David McLaughlin <dm...@apache.org>.
On Thu, Jun 16, 2016 at 1:28 PM, Igor Morozov <ig...@gmail.com> wrote:

> [...]
> There is potentially another, much more involved solution of supporting
> user-defined metadata, mentioned in this ticket:
> https://issues.apache.org/jira/browse/AURORA-1711



I actually think the linked ticket is less involved? It has no impact on
logic, etc. So the work involved is just updating the Thrift object and
then writing in the metadata to the storage layer. But I'm fine with either
(or both!) approaches.



>
>
>
> [...]
>
> We propose a new convenience API to roll back a running or complete
> JobUpdate: [...]




+1 to the idea, but there is ambiguity in what rollback means when you pass
a JobUpdateKey.

Example:

*undoJobUpdate* (it would replay the previousState instructions from the
given job update)
*rollbackToJobUpdate* (you'd pass the JobUpdateKey and it would replay the
instructions from that job update)





Re: Few things we would like to support in aurora scheduler

Posted by Igor Morozov <ig...@gmail.com>.
I created two tickets to track the discussion:

https://issues.apache.org/jira/browse/AURORA-1721
https://issues.apache.org/jira/browse/AURORA-1722

I'm willing to work on the rollback and potentially (depending on the outcome
of the discussion) on adding the TaskQuery flag.

Thanks,
-Igor



-- 
-Igor

Re: Few things we would like to support in aurora scheduler

Posted by "Erb, Stephan" <St...@blue-yonder.com>.
>> The next problem is related to the way we collect service cluster
>> status. I couldn't find a way to quickly get the latest statuses for all
>> instances/shards of a job in one query. Instead we query all task statuses
>> for a job, then manually iterate through all the statuses and filter out the
>> latest one per instance id. For services with lots of churn on
>> task statuses, that means huge blobs of thrift transferred every time we
>> issue a query. I was thinking of adding something along these lines:
>
>
>Does a TaskQuery filtering by job key and ACTIVE_STATES solve this?  Still
>includes the TaskConfig, but it's a single query, and probably rarely
>exceeds 1 MB in response payload.

We have a related problem, where we are interested in the status of the last executed cron job. Unfortunately, ACTIVE_STATES don’t help here. One potential solution I have thought about was a flag in TaskQuery for enabling server-side sorting of tasks by their latest event time. We could then query the status of the latest run by using this flag in combination with limit=1. This could also be combined with the limit_per_instance flag to cover the use case mentioned here.
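
A sketch of what such a query might look like (sortByLatestEventTime is a
made-up flag that does not exist in TaskQuery today, shown purely to
illustrate the proposal; client and cron_job_key are placeholders):

# Hypothetical flag: server-side sort by the latest TaskEvent time,
# combined with limit=1 to get the status of the most recent cron run.
query = TaskQuery(
    jobKeys={cron_job_key},
    sortByLatestEventTime=True,  # proposed, not part of the current API
    limit=1,
)
response = client.getTasksStatus(query)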






Re: Few things we would like to support in aurora scheduler

Posted by Igor Morozov <ig...@gmail.com>.
On Thu, Jun 16, 2016 at 5:20 PM, Bill Farner <wf...@apache.org> wrote:

> >
> > We don't have an easy way to assign a common unique identifier to
> > all the JobUpdates in the different aurora clusters in order to reconcile
> > them later into a single meta update job, so to speak. Instead we need to
> > generate that ID ourselves and keep it in every aurora JobUpdate's
> > metadata (JobUpdateRequest.taskConfig). Then, in order to get the status of
> > the upgrade workflow running in different data centers, we have to query all
> > recent jobs and, based on their metadata content, try to filter in the ones
> > that we think belong to the currently running upgrade for the service.
>
>
> Can you elaborate on the shortcoming of using TaskConfig.metadata?  From a
> quick read, it seems like your proposal does with an explicit field what
> you can accomplish with the more versatile metadata field.  For example,
> you could store a git commit SHA in TaskConfig.metadata, and identify the
> commit in use by each instance as well as track the revision changes when a
> job is updated.
>
Sure, we could use TaskConfig's metadata. In fact, that's what we do now, but
it's not pretty to query for the JobUpdate we need in this case. We query some
reasonable window of job updates, iterate through their metadata to select
the one we want with the most recent updated_at timestamp, and have to pay the
transfer penalty.
The reason I'm so fixated on the size of aurora responses is that our
deployment system is built in such a way that it pulls the state of a service
from different sources periodically, which, while heavily cached, is still
causing us lots of traffic to aurora.
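
To illustrate the workaround (recent_updates, with its .metadata dict,
.updated_at_ms field, and the rollout_id key, is a placeholder for our own
wrapper around the job update query results, not an Aurora type):

# Sketch of the client-side filtering: from a window of recent updates, pick
# the one stamped with our rollout id that changed most recently.
matching = [u for u in recent_updates
            if u.metadata.get("rollout_id") == rollout_id]
latest = max(matching, key=lambda u: u.updated_at_ms) if matching else None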

> However, I feel like I may be missing some context as "query all recent
> jobs" sounds like a broader query scope than I would expect.
>
> > We propose a new convenience API to roll back a running or complete
> > JobUpdate:
> >   /** Rollback job update. */
> >   Response rollbackJobUpdate(
> >       /** The update to rollback. */
> >       1: JobUpdateKey key,
> >       /** A user-specified message to include with the induced job update
> >           state change. */
> >       3: string message)
>
>
> I think this is a great idea!  It's something I've thought about for a
> while, but haven't really had the personal need.
>
> > The next problem is related to the way we collect service cluster
> > status. I couldn't find a way to quickly get the latest statuses for all
> > instances/shards of a job in one query. Instead we query all task statuses
> > for a job, then manually iterate through all the statuses and filter out the
> > latest one per instance id. For services with lots of churn on
> > task statuses, that means huge blobs of thrift transferred every time we
> > issue a query. I was thinking of adding something along these lines:
>
>
> Does a TaskQuery filtering by job key and ACTIVE_STATES solve this?  Still
> includes the TaskConfig, but it's a single query, and probably rarely
> exceeds 1 MB in response payload.
>
Yes, but we also need the error statuses: a user can quickly examine the state
of a service and go to the failed instances to get their errors (we pull all
error logs from mesos's sandbox directly), hence we have to query everything.

>



-- 
-Igor

Re: Few things we would like to support in aurora scheduler

Posted by Bill Farner <wf...@apache.org>.
>
> We don't have an easy way to assign a common unique identifier to
> all the JobUpdates in the different aurora clusters in order to reconcile
> them later into a single meta update job, so to speak. Instead we need to
> generate that ID ourselves and keep it in every aurora JobUpdate's
> metadata (JobUpdateRequest.taskConfig). Then, in order to get the status of
> the upgrade workflow running in different data centers, we have to query all
> recent jobs and, based on their metadata content, try to filter in the ones
> that we think belong to the currently running upgrade for the service.


Can you elaborate on the shortcoming of using TaskConfig.metadata?  From a
quick read, it seems like your proposal does with an explicit field what
you can accomplish with the more versatile metadata field.  For example,
you could store a git commit SHA in TaskConfig.metadata, and identify the
commit in use by each instance as well as track the revision changes when a
job is updated.
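
For instance, roughly (sketch only; Metadata is the key/value struct from
api.thrift, and task_config, commit_sha and the "git_sha" key name here are
arbitrary placeholders):

# Stamp the revision into the task config when building the update request;
# it can then be read back from task and job update queries.
task_config.metadata = {Metadata(key="git_sha", value=commit_sha)}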

However, I feel like I may be missing some context as "query all recent
jobs" sounds like a broader query scope than I would expect.

> We propose a new convenience API to roll back a running or complete
> JobUpdate:
>   /** Rollback job update. */
>   Response rollbackJobUpdate(
>       /** The update to rollback. */
>       1: JobUpdateKey key,
>       /** A user-specified message to include with the induced job update
>           state change. */
>       3: string message)


I think this is a great idea!  It's something I've thought about for a
while, but haven't really had the personal need.

> The next problem is related to the way we collect service cluster
> status. I couldn't find a way to quickly get the latest statuses for all
> instances/shards of a job in one query. Instead we query all task statuses
> for a job, then manually iterate through all the statuses and filter out the
> latest one per instance id. For services with lots of churn on
> task statuses, that means huge blobs of thrift transferred every time we
> issue a query. I was thinking of adding something along these lines:


Does a TaskQuery filtering by job key and ACTIVE_STATES solve this?  Still
includes the TaskConfig, but it's a single query, and probably rarely
exceeds 1 MB in response payload.
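
Roughly, as a sketch (client here stands for a Thrift client against the
scheduler's read API; job_key is the JobKey of the service):

# One query for the job, restricted to the ACTIVE_STATES constant from
# api.thrift; the response carries the ScheduledTask list for live instances.
query = TaskQuery(jobKeys={job_key}, statuses=ACTIVE_STATES)
response = client.getTasksStatus(query)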

