Posted to dev@aurora.apache.org by Maxim Khutornenko <ma...@apache.org> on 2016/07/01 16:15:48 UTC

Re: [PROPOSAL] Job as a first-class citizen

Thanks for the feedback! I will follow up with an itemized epic to
track this refactoring work.

On Wed, Jun 29, 2016 at 2:29 PM, Jake Farrell <jf...@apache.org> wrote:
> huge +1, socket activation is our exact use case for this type of action
> also
>
> -Jake
>
> On Wed, Jun 29, 2016 at 5:18 PM, Erb, Stephan <St...@blue-yonder.com>
> wrote:
>
>> I recently thought about the same idea. Our use case would be to scale
>> a job to 0 instances. While this sounds useless at first, it can be quite
>> powerful when trying to implement a feature like socket activation.
>>
>> ________________________________________
>> From: Maxim Khutornenko <ma...@apache.org>
>> Sent: Wednesday, June 29, 2016 22:43
>> To: dev@aurora.apache.org
>> Subject: [PROPOSAL] Job as a first-class citizen
>>
>> TL;DR - I am proposing we store and maintain job-level data
>> (JobConfiguration [1]) instead of relying on storing everything in a
>> TaskConfig [2].
>>
>>
>> Aurora storage currently does not have a concept of a "job" when it
>> comes to services and adhoc jobs. Instead, it relies on a collection
>> of TaskConfigs that represent a view of what the job state is. This is
>> in stark contrast to cron jobs, which are already represented by the
>> JobConfiguration struct.
>>
>> This lack of representation limits our ability to deliver richer
>> features and may result in suboptimal design and storage utilization.
>> Specifically, the following is currently impossible:
>>
>> - storing normalized job-level data without repeating it in every task
>> (e.g. contactEmail, isService);
>>
>> - maintaining job-level data that may be different for every instance
>> (SLA requirements, topology specs for stateful services, etc.);
>>
>> - knowing what the job instance count is without pulling all ACTIVE
>> tasks and iterating over them.
>>
>> To address the above, I propose we start treating an Aurora job as a
>> tangible entity in the storage and specifically use JobConfiguration
>> wherever applicable. As a welcome side effect, this will let us:
>>
>> - allow instantaneous job updates when job-level fields are updated
>> (e.g. those that don't require instance restarts);
>> - finally get rid of the deprecated Identity struct [3];
>> - reduce or completely eliminate DB garbage collection of abandoned job
>> keys [4].
>>
>> Any thoughts, suggestions, objections?
>>
>> Thanks,
>> Maxim
>>
>>
>> [1] -
>> https://github.com/apache/aurora/blob/4e28b9c8b29b66f2f10b0a6cafdec1f8e2c1bd7b/api/src/main/thrift/org/apache/aurora/gen/api.thrift#L316-L338
>>
>> [2] -
>> https://github.com/apache/aurora/blob/4e28b9c8b29b66f2f10b0a6cafdec1f8e2c1bd7b/api/src/main/thrift/org/apache/aurora/gen/api.thrift#L240-L284
>>
>> [3] - https://issues.apache.org/jira/browse/AURORA-84
>>
>> [4] - RowGarbageCollector:
>>
>> https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/storage/db/RowGarbageCollector.java
>>
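
To make the proposal above concrete, here is a minimal sketch of storing
job-level data once per job rather than repeating it in every task. It is
written in Python for brevity and is not Aurora's code: JobStore, its
methods, and the reduced field sets are hypothetical, and only the
contactEmail, isService and instance count ideas come from the proposal.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class TaskConfig:
    """Per-task settings only (hypothetical, heavily simplified)."""
    num_cpus: float
    ram_mb: int


@dataclass(frozen=True)
class JobConfiguration:
    """Job-level record stored once per job (simplified field set)."""
    key: str
    contact_email: str
    is_service: bool
    instance_count: int
    task_config: TaskConfig


class JobStore:
    """Hypothetical in-memory store keyed by job key."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobConfiguration] = {}

    def save(self, job: JobConfiguration) -> None:
        self._jobs[job.key] = job

    def get(self, key: str) -> JobConfiguration:
        return self._jobs[key]

    def instance_count(self, key: str) -> int:
        # Read directly from the job record; no scan over active tasks.
        return self._jobs[key].instance_count

    def update_contact_email(self, key: str, email: str) -> None:
        # A job-level field changes in one place and touches no task,
        # so no instance restart is implied.
        self._jobs[key] = replace(self._jobs[key], contact_email=email)
```

With a layout like this, the instance count is a single lookup and a
contactEmail change rewrites one record, which is the kind of
"instantaneous" job-level update the proposal describes.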

Re: [PROPOSAL] Job as a first-class citizen

Posted by Maxim Khutornenko <ma...@apache.org>.
I have updated the summary
<https://docs.google.com/document/d/1myYX3yuofGr8JIzud98xXd5mqgpZ8q_RqKBpSff4-WE>
with a minor but important change. Instead of relying on TaskHistoryPruner
to remove JobConfigurations from the storage, the cleanup is now going to
happen inside a TaskStateChange event listener when all job instances reach
terminal status. As before, feedback is highly appreciated!
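
The cleanup described above could look roughly like the sketch below. It is
a simplified Python illustration, not the scheduler's implementation: the
JobCleanupListener class, its method names, the dictionary-based storage
and the exact set of terminal states are all assumptions made for the
example.

```python
from typing import Dict

# Assumed set of terminal task states for this sketch.
TERMINAL_STATES = frozenset({"FINISHED", "FAILED", "KILLED", "LOST"})


class JobCleanupListener:
    """Hypothetical listener that drops a job's configuration once every
    known instance of that job has reached a terminal state."""

    def __init__(self, job_configs: Dict[str, dict],
                 task_states: Dict[str, Dict[int, str]]) -> None:
        # job key -> stored job configuration (shared with the caller)
        self.job_configs = job_configs
        # job key -> {instance id -> last known state}; assumed to be
        # seeded with every instance of every stored job
        self.task_states = task_states

    def on_task_state_change(self, job_key: str, instance_id: int,
                             new_state: str) -> bool:
        """Record the new state; return True if the job was cleaned up."""
        states = self.task_states.setdefault(job_key, {})
        states[instance_id] = new_state
        if all(state in TERMINAL_STATES for state in states.values()):
            self.job_configs.pop(job_key, None)
            del self.task_states[job_key]
            return True
        return False
```

Because the check runs on each state change, the configuration is removed
as soon as the last instance terminates and does not wait for a separate
pruning pass.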

On Tue, Jul 26, 2016 at 4:55 PM, Maxim Khutornenko <ma...@apache.org> wrote:

> I felt this change is large enough to warrant a brief design summary.
> Please take a look at this document
> <https://docs.google.com/document/d/1myYX3yuofGr8JIzud98xXd5mqgpZ8q_RqKBpSff4-WE> and
> leave your feedback as applicable.

Re: [PROPOSAL] Job as a first-class citizen

Posted by Maxim Khutornenko <ma...@apache.org>.
I felt this change is large enough to warrant a brief design summary.
Please take a look at this document
<https://docs.google.com/document/d/1myYX3yuofGr8JIzud98xXd5mqgpZ8q_RqKBpSff4-WE> and
leave your feedback as applicable.
