Posted to dev@aurora.apache.org by "Oliver, James" <Ja...@pegs.com> on 2014/07/09 19:24:39 UTC

Sharded Service Coordination

Good morning,

My company is in the process of adopting Apache's open source stack, and I've been tasked with building Aurora jobs to deploy a few popular open source technologies on Mesos. Aurora is an elegant scheduler and in our estimation will meet our company's needs. However, we are struggling to meet some of the configuration requirements of some of the tools we wish to deploy.

Scenario: When a distributed service is deployed, we need to programmatically determine all hosts selected for deployment and their reserved ports in order to properly configure the service. We've solved this in a few not-so-elegant ways:

 1.  We wrote a Process that publishes host/port information to a distributed file system, blocks until {{instances}} entries have been written, reads the information back, and finally configures the service. This works, but IMO it is not an elegant solution.
 2.  Next, we designed a REST API for service registration (keyed on the Aurora job key) and published this information to our ZooKeeper ensemble. This solution removes the dependency on a pre-configured distributed file system, but some overhead remains: multiple API instances are necessary so as not to introduce a single point of failure, and Aurora jobs now require some initial configuration to be able to communicate with the service. Communication with this API is also non-trivial, because the REST service cannot block until all information has been communicated; that would be problematic given the request/response nature of HTTP.
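For what it's worth, the registration half of approach (2) can be sketched in a few lines. The znode layout and payload fields below are made up for illustration (only the job-key components are real Aurora concepts); a watcher that blocks until {{instances}} children appear under the job path would complete the barrier:

```python
import json

def registration_znode(cluster, role, env, job, instance, host, ports):
    # Build a znode path keyed on the Aurora job key plus the instance id.
    # The path layout and payload fields are illustrative, not an Aurora API.
    path = "/coordination/%s/%s/%s/%s/%d" % (cluster, role, env, job, instance)
    payload = json.dumps({"host": host, "ports": ports}, sort_keys=True)
    return path, payload

path, payload = registration_znode(
    "devcluster", "www-data", "prod", "hello", 0,
    "slave-01.example.com", {"http": 31337})
```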

At this point, we realized that an even better solution might be to communicate directly with the Aurora scheduler to get this data. Seeing as Aurora configs are just Python, we could probably implement it in a reusable fashion…but I'm curious whether anyone has already gone down this path?
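To make the idea concrete, here is a rough sketch of the post-processing side, assuming a scheduler query returns task records resembling Aurora's Thrift schema. The field names ("assignedTask", "slaveHost", "assignedPorts") are my assumption from that schema and the query itself is omitted, so treat this as a sketch rather than a working client:

```python
def hosts_and_ports(tasks):
    # Pull (host, port-map) pairs out of task records shaped like the
    # scheduler's task-status response. Field names are assumptions based on
    # Aurora's Thrift schema; check them against the version you run.
    return [(t["assignedTask"]["slaveHost"], t["assignedTask"]["assignedPorts"])
            for t in tasks]

sample = [
    {"assignedTask": {"slaveHost": "slave-01", "assignedPorts": {"http": 31001}}},
    {"assignedTask": {"slaveHost": "slave-02", "assignedPorts": {"http": 31004}}},
]
pairs = hosts_and_ports(sample)
```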

Thank you,
James O

Re: Sharded Service Coordination

Posted by Bill Farner <wf...@apache.org>.
Unfortunately the mesos log API does not currently support reading the log
from non-leading replicas.  Depending on where we go with the storage
overhaul, it's possible we could enable reads from non-leading schedulers,
however.

That said, querying this from the leading scheduler is relatively cheap.
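For reference, a minimal sketch of picking the leading scheduler's endpoint out of its ZooKeeper entry. Aurora schedulers announce themselves using Twitter's ServerSet convention; the JSON shape below is assumed from that convention and should be verified in your deployment:

```python
import json

def scheduler_endpoint(znode_data):
    # Parse a ServerSet member znode and return the scheduler's host/port.
    # Field names follow the ServerSet JSON convention; verify them locally.
    member = json.loads(znode_data)
    ep = member["serviceEndpoint"]
    return ep["host"], ep["port"]

data = '{"serviceEndpoint": {"host": "scheduler-01", "port": 8081}, "status": "ALIVE"}'
host, port = scheduler_endpoint(data)
```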

-=Bill



Re: Sharded Service Coordination

Posted by Bhuvan Arumugam <bh...@apache.org>.
Bill, I'm hoping we could also pull information like host, port,
process, state, etc from WAL replica logs stored in scheduler. No?

-- 
Regards,
Bhuvan Arumugam
www.livecipher.com

Re: Sharded Service Coordination

Posted by Bill Farner <wf...@apache.org>.
Tracking this at https://issues.apache.org/jira/browse/AURORA-587

-=Bill



Re: Sharded Service Coordination

Posted by Joe Stein <jo...@stealth.ly>.
+1 for adding it to Aurora, looking forward to using it, thanks!!!!

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/



Re: Sharded Service Coordination

Posted by "Oliver, James" <Ja...@pegs.com>.
*Nudge nudge*

We would love to take advantage of this tool if you folks would be so kind!

Thanks,
James




Re: Sharded Service Coordination

Posted by Bill Farner <wf...@apache.org>.
FWIW at Twitter we do something that sounds like a mix between (1) and (2).
We consume the (otherwise unused) Announcer configuration parameter [1] to
Job, and extend the executor to publish all allocated {{thermos.port[x]}}s
to ZooKeeper.  This is something we would love to open source, so please
nudge us to do so if you would like this feature!

[1]
https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/config/schema/base.py#L69

-=Bill
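To make [1] concrete, a hypothetical .aurora fragment consuming the Announcer parameter might look like the following. The `announce` field and `primary_port` name come from the linked schema, but the publishing behavior is the unreleased executor extension described above, and `hello_task` is assumed to be defined elsewhere in the file, so treat this as a sketch:

```python
# Hypothetical .aurora fragment: attach the Announcer stanza to a Job so an
# extended executor (as described above) would publish each instance's
# allocated ports to ZooKeeper. Field names follow the linked schema.
jobs = [
  Service(
    cluster = 'devcluster',
    role = 'www-data',
    environment = 'prod',
    name = 'hello',
    task = hello_task,   # assumed to be defined elsewhere in the config
    instances = 3,
    announce = Announcer(primary_port = 'http'),
  )
]
```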

