Posted to user@mesos.apache.org by Ankur Chauhan <an...@malloc64.com> on 2014/01/13 06:14:21 UTC

Re: Porting an app

Thanks everyone for all the help.
Marathon does seem like a good framework, but my use case requires the app
to evaluate its own health and scale up based on internal load stats (SLA
requirements), and I don't know if Marathon supports that. This is the main
reason why I am looking at building out my own scheduler/executor. I will
give it another go with Vinod's comments and have a look at the Hadoop
scheduler.

Just a recommendation to any Mesos experts out there: it would be super
helpful if there was a complete mock app with annotated code somewhere.
Another good addition to the website would be a FAQ page.

I am still pretty n00b as far as Mesos is concerned, so pardon any stupid
comments/suggestions/questions.

-- Ankur


On Fri, Dec 27, 2013 at 10:16 AM, Abhishek Parolkar
<ab...@parolkar.com>wrote:

> @Ankur,
>   In case Marathon looks like direction you want to go with, I have a
> small demo in here if that helps
> http://www.youtube.com/watch?v=2YWVGMuMTrg
>
>   -parolkar
>
>
> On Sat, Dec 28, 2013 at 2:10 AM, Vinod Kone <vi...@gmail.com> wrote:
>
>>> I can't really find an example that is an end-to-end use case. By that I
>>> mean, I would like to know how to put the scheduler and the executor in the
>>> correct places. Right now I have a single jar which can be run from the
>>> command line: java -jar target/collector.jar and that would take care of
>>> everything.
>>>
>> This collector.jar can act as both scheduler and executor, presumably
>> based on command line flags? If yes, that's certainly doable. Typically the
>> scheduler and executor are split into separate jars. This makes it easy to
>> decouple the upgrade of the scheduler and executor.
>>
>>
>>> My current train of thought is that the webapp jar would stay somewhere
>>> on an S3 url and the "CollectorScheduler" would "somehow" tell a mesos
>>> slave to run the "CollectorExecutor" which in turn fetch the jar from S3
>>> and run it.
>>>
>>>
>> Yes, you are on the right track. The Mesos slave can download the jar for you
>> as long as it can be accessed via http://, https://, ftp://, hdfs://,
>> etc. Here is how you do it:
>>
>> When you launch a task from the scheduler via 'launchTasks()' you give it
>> a 'vector<TaskInfo>' as one of the arguments. Since you are using a custom
>> executor you should set 'TaskInfo.ExecutorInfo' (see mesos.proto) to point
>> to your executor. To specify the S3 URL you would set
>> 'TaskInfo.ExecutorInfo.CommandInfo.URI.value'. To tell the slave the command to
>> launch the executor after it downloads it, you would set
>> 'TaskInfo.ExecutorInfo.CommandInfo.value'.
>>
>> You can find some examples here:
>>
>> Hadoop scheduler <https://github.com/mesos/hadoop/blob/master/src/main/java/org/apache/hadoop/mapred/ResourcePolicy.java>
>>
>> Example Java scheduler <https://github.com/apache/mesos/blob/master/src/examples/java/TestFramework.java>
>>
>> Hope that helps. Let us know if you have additional questions.
>>
>> Vinod
>>
>
>
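In Java, the ExecutorInfo/CommandInfo wiring Vinod describes might look roughly like this with the Mesos protobuf bindings. This is only a sketch: the executor ID, the command line, and the resource amount are placeholder values, not anything from this thread; check mesos.proto for the authoritative field definitions.

```java
import org.apache.mesos.Protos;

public final class CollectorTaskFactory {
    /** Builds a TaskInfo whose custom executor is fetched from a remote URL, then launched. */
    public static Protos.TaskInfo newTask(Protos.Offer offer, String taskId, String jarUrl) {
        // The slave downloads every CommandInfo.URI into the executor's sandbox,
        // then runs CommandInfo.value to start the executor.
        Protos.CommandInfo command = Protos.CommandInfo.newBuilder()
            .addUris(Protos.CommandInfo.URI.newBuilder().setValue(jarUrl))
            .setValue("java -jar collector-executor.jar")
            .build();

        Protos.ExecutorInfo executor = Protos.ExecutorInfo.newBuilder()
            .setExecutorId(Protos.ExecutorID.newBuilder().setValue("collector-executor"))
            .setCommand(command)
            .build();

        return Protos.TaskInfo.newBuilder()
            .setName("collector-task")
            .setTaskId(Protos.TaskID.newBuilder().setValue(taskId))
            .setSlaveId(offer.getSlaveId())
            .setExecutor(executor)
            .addResources(Protos.Resource.newBuilder()
                .setName("cpus")
                .setType(Protos.Value.Type.SCALAR)
                .setScalar(Protos.Value.Scalar.newBuilder().setValue(1.0)))
            .build();
    }
}
```

The resulting TaskInfo would go into the list passed to the scheduler driver's launchTasks() call.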

Re: Porting an app

Posted by Benjamin Mahler <be...@gmail.com>.
On Sun, Jan 12, 2014 at 9:14 PM, Ankur Chauhan <an...@malloc64.com> wrote:

> Thanks everyone for all the help.
> Marathon does seem like a good framework, but my use case requires the app
> to evaluate its own health and scale up based on internal load stats (SLA
> requirements), and I don't know if Marathon supports that. This is the main
> reason why I am looking at building out my own scheduler/executor. I will
> give it another go with Vinod's comments and have a look at the Hadoop
> scheduler.
>
> Just a recommendation to any Mesos experts out there: it would be super
> helpful if there was a complete mock app with annotated code somewhere.
> Another good addition to the website would be a FAQ page.
>
> I am still pretty n00b as far as mesos is concerned, so, pardon any stupid
> comments/suggestions/questions.
>

Apologies are not necessary! Questions like these are great for creating a
public discussion for others to learn from, especially given the current
lack of a FAQ document. Please reach out with further questions if you have
any.


> -- Ankur

Re: Porting an app

Posted by Dave Lester <da...@ischool.berkeley.edu>.
On Sun, Jan 12, 2014 at 9:14 PM, Ankur Chauhan <an...@malloc64.com> wrote:

> Another good addition to the website would be a FAQ page.
>

Just to follow-up, I've created a JIRA issue to track the creation of a FAQ
page for the project documentation and website:
https://issues.apache.org/jira/browse/MESOS-915

I encourage everyone to suggest questions and answers in the comments.

Thanks!

Re: Porting an app

Posted by Tobias Knaup <to...@knaup.me>.
Sounds like an exciting project! Looking forward to hearing how it turns
out.


On Mon, Jan 13, 2014 at 5:15 PM, Ankur Chauhan <an...@malloc64.com> wrote:

> Hi Tobias,
>
> Thanks for your reply, and the mesos-jetty project looks interesting. Let
> me describe my target app that should let you kind of get an idea about the
> use case and other scale up factors that I am talking about.

Re: Porting an app

Posted by Ankur Chauhan <an...@malloc64.com>.
Hi Tobias, 

Thanks for your reply; the mesos-jetty project looks interesting. Let me describe my target app, which should give you an idea of the use case and the scale-up factors I am talking about.
1. The target app is either a simple standalone Java Netty-based web server or a Jetty web app.
2. They all listen on one or more ports and are always exposed to the outside world via some loadbalancer (currently we use ELB).
3. Metrics are currently collected using the yammer-metrics library and published to Graphite. This allows us to monitor some very application-specific metrics.
4. The metrics may be things like memory load, average latency of writes to a database, end-to-end latency, CPU load, request rate, data-structure-specific load factors, etc. These metrics are very specific and contribute heavily to determining the core cluster size, the scale-up size, and the cool-down period.
5. We currently set up scale-up/down policies based on high/low watermarks of the metrics that we collect via Amazon's CloudWatch, and dynamically adjust the size of the cluster.
6. One very important thing to keep in mind here is that some applications require us to maintain a small amount of state. It is not catastrophic to lose this data (such as in the case of a node loss), but it is very helpful if we can do a graceful shutdown/restart/scale-down of the instances themselves. Think of this as running through a commit log before actually killing an app. So the life-cycle of a node is important.
7. Finally, deployments: we have a set of scripts that do a push-button deployment, i.e. v123 -> v124 -> v125 is generally one button click each. The requirement here is that we can either stand up a new cluster and do a red-black deployment at the loadbalancer level, or do a rolling deployment. This is probably out of scope for what Mesos wants to get involved with, but having start -> running -> shutdown -> killed lifecycle support would be a killer feature.
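The high/low-watermark policy with a cool-down period described above could be sketched like this. The thresholds, the cool-down length, and the class itself are made-up illustrations, not anything from our actual setup:

```java
// Hypothetical watermark-based scaling decision with a cool-down period.
// All numbers and names here are illustrative assumptions.
public final class WatermarkScaler {
    private final double highWatermark;
    private final double lowWatermark;
    private final long cooldownMillis;
    private long lastScaleAt = Long.MIN_VALUE; // sentinel: no scaling action yet

    public WatermarkScaler(double highWatermark, double lowWatermark, long cooldownMillis) {
        this.highWatermark = highWatermark;
        this.lowWatermark = lowWatermark;
        this.cooldownMillis = cooldownMillis;
    }

    /** Returns +1 to scale up, -1 to scale down, 0 to hold. */
    public int decide(double load, long nowMillis) {
        // Suppress decisions while still cooling down from the last action.
        if (lastScaleAt != Long.MIN_VALUE && nowMillis - lastScaleAt < cooldownMillis) {
            return 0;
        }
        if (load > highWatermark) { lastScaleAt = nowMillis; return 1; }
        if (load < lowWatermark)  { lastScaleAt = nowMillis; return -1; }
        return 0;
    }
}
```

A scheduler could feed this with whichever metric (request rate, end-to-end latency, etc.) drives the cluster size.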
I hope this gives you an idea of what I am looking for as far as the requirements of the framework are concerned. As a suggestion, Marathon could also think about supporting something like buildpacks (https://devcenter.heroku.com/articles/buildpacks), and using a simple shell-based interface to determine scale-up requirements. Let me elaborate on that a bit. If we can assume that every app is a tgz file and has

./install.sh
./startup.sh
./status.sh
./shutdown.sh
./metrics.sh


then we can build a pretty robust interface for deploying apps into containers with Marathon. For example, in the case of a Jetty deployment we would have web-jetty.tgz: the Marathon executor first executes ./install.sh and checks the exit status; if it's okay, it then goes on to execute ./startup.sh and monitors the app using ./status.sh (executed periodically). metrics.sh can be used to give back key,value pairs of "monitored" metrics that the scheduler can use to determine the current number of application instances needed. All of these are simple bash scripts, which makes them easy to test locally as well as on a full Mesos cluster.
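The executor loop just described could be sketched in Java along these lines. The script names are the ones proposed above; the class, directory layout, and error handling are purely illustrative assumptions, not an existing Marathon or Mesos API:

```java
// Hypothetical driver for the shell-based app lifecycle described above.
public final class AppLifecycle {
    private final String appDir; // directory the app tgz was unpacked into

    public AppLifecycle(String appDir) {
        this.appDir = appDir;
    }

    /** Runs one lifecycle script via /bin/sh; returns its exit code, or -1 if it cannot start. */
    public int run(String script) {
        try {
            Process p = new ProcessBuilder("/bin/sh", appDir + "/" + script)
                    .inheritIO()
                    .start();
            return p.waitFor();
        } catch (Exception e) {
            return -1;
        }
    }

    /** install.sh then startup.sh; false if either step fails. */
    public boolean start() {
        return run("install.sh") == 0 && run("startup.sh") == 0;
    }

    /** status.sh exit code 0 is taken to mean "healthy"; poll this periodically. */
    public boolean healthy() {
        return run("status.sh") == 0;
    }
}
```

A real executor would run healthy() on a timer and translate the result into TASK_RUNNING/TASK_FAILED status updates; metrics.sh output could be captured the same way and reported back to the scheduler.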

All this is pretty fresh out of my head as I am writing up and looking back at our deployment strategies, so I may not have been super clear about some things or may have oversimplified some critical points. Please let me know what you think about this. I would love to contribute to Marathon/Mesos, although I am pretty green as far as Scala is concerned; Java is my strong suit.

-- Ankur Chauhan 
achauhan@brightcove.com



On Jan 13, 2014, 14:56:42, Tobias Knaup <to...@knaup.me> wrote: 
Hey Ankur, your question is super timely, I've been working on a demo framework that shows exactly what you're trying to do with Jetty. The code is still a little rough and there are some hardcoded paths etc. but since you asked I just published it: https://github.com/guenter/jetty-mesos 
I'm also the main author of Marathon and auto scaling has been on my mind. The main question is what an interface for reporting load stats would look like. Curious what you think!





Re: Porting an app

Posted by Tobias Knaup <to...@knaup.me>.
Hey Ankur, your question is super timely, I've been working on a demo
framework that shows exactly what you're trying to do with Jetty. The code
is still a little rough and there are some hardcoded paths etc. but since
you asked I just published it: https://github.com/guenter/jetty-mesos
I'm also the main author of Marathon and auto scaling has been on my mind.
The main question is what an interface for reporting load stats would look
like. Curious what you think!
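One possible shape for the load-stats reporting interface asked about here could be as below. This is only a sketch for discussion; it is not an existing Marathon API, and the method and metric names are made up:

```java
import java.util.Map;

// Hypothetical interface an app could expose so a framework can poll it
// for autoscaling decisions. Names are illustrative only.
public interface LoadStatsReporter {
    /** Current metric values by name, e.g. "requests_per_sec" -> 120.0. */
    Map<String, Double> currentLoad();

    /** Number of instances the app believes it needs, derived from its own metrics. */
    int desiredInstances();
}
```

The second method keeps the SLA logic inside the app (as Ankur wants), while the first lets the framework log or sanity-check the raw numbers.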



On Sun, Jan 12, 2014 at 9:14 PM, Ankur Chauhan <an...@malloc64.com> wrote:

> Thanks everyone for all the help.
> Marathon does seem like a good framework, but my use case requires the app
> to evaluate its own health and scale up based on internal load stats (SLA
> requirements), and I don't know if Marathon supports that. This is the main
> reason why I am looking at building out my own scheduler/executor. I will
> give it another go with Vinod's comments and have a look at the Hadoop
> scheduler.
>
> Just a recommendation to any Mesos experts out there: it would be super
> helpful if there was a complete mock app with annotated code somewhere.
> Another good addition to the website would be a FAQ page.
>
> I am still pretty n00b as far as mesos is concerned, so, pardon any stupid
> comments/suggestions/questions.
>
> -- Ankur
>