Posted to dev@openwhisk.apache.org by Tyson Norris <tn...@adobe.com.INVALID> on 2017/07/01 15:24:53 UTC

Improving support for UI driven use cases

Hi -
I have added a wiki page to propose an approach to better support UI driven use cases:
https://cwiki.apache.org/confluence/display/OPENWHISK/UI+Driven+Use+Cases
(Thanks for letting me into the wiki Felix!)

Dragos and I have been discussing and working on this for some time, and there are several use cases we are trying to solve that would benefit. I expect other users of OpenWhisk will appreciate this as well, especially if they are considering integrating OpenWhisk APIs into UI workflows where concurrent user load can be significant.

In general, the proposal is to allow concurrent activation processing within containers, for the benefit of improved system throughput that is less tied to concurrent user load, so that OpenWhisk HTTP triggers can be used for UI use cases. In those cases a user is waiting for a response, and a response that times out is simply an error with no guarantee of later processing required - which is notably different from event driven use cases.
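To make the throughput point concrete, here is a rough back-of-the-envelope sketch (Python, purely illustrative - the per-container concurrency limit of 100 is a made-up number, not something the proposal fixes):

    import math

    def containers_needed(concurrent_requests, per_container_limit):
        # How many warm containers a burst of concurrent requests occupies.
        return math.ceil(concurrent_requests / per_container_limit)

    # 500 users waiting on an HTTP-triggered action at the same moment:
    print(containers_needed(500, 1))    # today: 1 activation per container -> 500
    print(containers_needed(500, 100))  # proposed: 100 concurrent activations -> 5

With per-container concurrency, the number of containers needed tracks the action's actual resource needs rather than the number of users currently waiting.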

Given that commenting on the wiki is problematic (I think people need to request individual access before commenting is allowed?), I guess email discussion would be the best way to provide feedback.

Thanks for your feedback!
Tyson

Re: Improving support for UI driven use cases

Posted by Nate DAmico <ka...@apache.org>.
Exciting to see use cases pushing the limits of openwhisk and the proposal
for UI and lower latency use cases.

There are a lot of points going on in this thread, and I assume more will be
fleshed out here as well as on the wiki page. I wanted to add one more
consideration the group should take into account - sorry if I'm muddying the
waters some.

With the group efforts to port openwhisk to kubernetes, this low
latency/scale-out use case would seem to be affected partially or greatly
depending on how the implementation approach comes out.  Anything that
deals with spin up/scale-out/scale-down and generally with scheduling of
containers would be affected when running in kube or the like.  In some
cases the underlying orchestrator would provide some "batteries included"
that openwhisk could possibly leverage or get "for free", such as
kubernetes pod scaling using cpu or other whisk-provided metrics for
managing the pool of request-handling containers:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale
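For reference, the scaling rule that page describes boils down to roughly the following (a sketch with hypothetical numbers; "in-flight activations per pod" is just one example of a whisk-provided custom metric):

    import math

    def desired_replicas(current_replicas, current_metric, target_metric):
        # Kubernetes HPA rule (roughly): scale the pod count in proportion
        # to the ratio of the observed metric to its target value.
        return math.ceil(current_replicas * current_metric / target_metric)

    # e.g. 4 pods averaging 180 in-flight activations each, target 100 -> 8 pods
    print(desired_replicas(4, 180, 100))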

Anyways, didn't want to muddy the waters; I figured that since both kube
support and this are in the proposal stage, talk of making parts of
openwhisk more pluggable for kube support could have an effect on this
approach.  It would be great if, when both this and kube support are in,
one wouldn't have to run "native whisk" in order to take advantage of this
proposal, but could run it on the "kubernetes whisk" as well - though I
understand sometimes things happen in multiple stages of design and
implementation.

Nate


On Thu, Jul 6, 2017 at 1:00 PM, Dascalita Dragos <dd...@gmail.com> wrote:

> > The prototype PR from Tyson was based on a fixed capacity of concurrent
> activations per container. From that, I presume once the limit is reached,
> the load balancer would roll over to allocate a new container.
> +1. This is indeed the intent in this proposal.
>
> > a much higher level of complexity and traditional behavior than what you
> described.
>
> Thanks for bringing up this point as well. IMO this also makes sense, and
> it's not in conflict with the current proposal, but rather an addition to
> it, for a later time.  if OW doesn't monitor the activity at the action
> container level, then it's going to be hard to ensure reliable resource
> allocation across action containers. Based on my tests Memory is the only
> parameter we can correctly allocate for Docker containers. For CPU, unless
> "--cpu-set" is used, CPU is a shared resource across actions; one action
> may impact other actions. For network, unless we implement custom network
> drivers for containers, the bandwidth is shared between actions; one action
> can congest the network, impacting other actions as well . Disk I/O, same
> problem.
>
> So my point is that without monitoring, resource isolation (beyond memory)
> remains theoretical at this point.
>
> In an ideal picture OW would monitor closely any available parameters when
> invoking actions, through Tracing, monitoring containers, etc, anything
> that's available. Then through machine learning OW can learn what's a
> normal "SLA" for an action, maybe by simply learning the normal
> distribution of response times, if CPU and other parameters are too much to
> analyze. Then if the action doesn't behave normally for an Nth percentile,
> take 2 courses of action:
> 1) observe if the action has been impacted by other actions, and
> re-schedule it on other VMs if that's the case. Today OW tries to achieve
> some isolation through load balancer and invoker settings, but the rules
> are not dynamic.
> 2) otherwise, notify the developer that an anomaly is happening for one of
> the actions
>
> These examples are out of the scope for the current proposal. I only shared
> them so that we don't take monitoring out of the picture later. It's worth
> a separate conversation on this DL, and it's not as pressing as the
> performance topic is right now.
>
> Dragos
>
>
> On Thu, Jul 6, 2017 at 4:40 AM Michael M Behrendt <
> Michaelbehrendt@de.ibm.com> wrote:
>
> > thx for clarifying, very helpful. The approach you described could be
> > really interesting. I was thrown off by Dragos' comment saying:
> >
> >  "What stops Openwhisk to be smart in observing the response times, CPU
> > consumption memory consumption of the running containers ? Doing so it
> > could learn automatically how many concurrent requests 1 action can
> > handle."
> >
> > ...which in my mind would have implied a much higher level of complexity
> > and traditional behavior than what you described.
> >
> > Dragos,
> > did I misinterpret you?
> >
> >
> >
> > Thanks & best regards
> > Michael
> >
> >
> >
> >
> > From:   Rodric Rabbah <ro...@gmail.com>
> > To:     dev@openwhisk.apache.org
> > Date:   07/06/2017 01:04 PM
> > Subject:        Re: Improving support for UI driven use cases
> >
> >
> >
> > The prototype PR from Tyson was based on a fixed capacity of concurrent
> > activations per container. From that, I presume once the limit is
> reached,
> > the load balancer would roll over to allocate a new container.
> >
> > -r
> >
> > > On Jul 6, 2017, at 6:09 AM, Michael M Behrendt
> > <Mi...@de.ibm.com> wrote:
> > >
> > > Hi Michael,
> > >
> > > thx for checking. I wasn't referring to adding/removing VMs, but rather
> > > activation containers. In today's model that is done intrinsically,
> > while
> > > I *think* in what Dragos described, the containers would have to be
> > > monitored somehow so this new component can decide (based on
> > > cpu/mem/io/etc load within the containers) when to add/remove
> > containers.
> > >
> > >
> > > Thanks & best regards
> > > Michael
> >
> >
> >
> >
> >
> >
>

Re: Improving support for UI driven use cases

Posted by Dascalita Dragos <dd...@gmail.com>.
> The prototype PR from Tyson was based on a fixed capacity of concurrent
activations per container. From that, I presume once the limit is reached,
the load balancer would roll over to allocate a new container.
+1. This is indeed the intent in this proposal.
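A minimal sketch of that rollover behavior, for illustration only - this is not the prototype's actual code, and the limit of 4 is arbitrary:

    class Container:
        def __init__(self, limit):
            self.limit = limit      # fixed capacity of concurrent activations
            self.in_flight = 0

    def schedule(pool, limit):
        # Reuse a container with a free slot; once every container is at its
        # limit, roll over and allocate a new one.
        for c in pool:
            if c.in_flight < c.limit:
                c.in_flight += 1
                return c
        c = Container(limit)
        c.in_flight = 1
        pool.append(c)
        return c

    pool = []
    for _ in range(10):
        schedule(pool, limit=4)
    print(len(pool))   # 10 concurrent activations with limit 4 -> 3 containers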

> a much higher level of complexity and traditional behavior than what you
described.

Thanks for bringing up this point as well. IMO this also makes sense, and
it's not in conflict with the current proposal, but rather an addition to
it, for a later time.  If OW doesn't monitor the activity at the action
container level, then it's going to be hard to ensure reliable resource
allocation across action containers. Based on my tests, memory is the only
parameter we can correctly allocate for Docker containers. For CPU, unless
"--cpu-set" is used, CPU is a shared resource across actions; one action
may impact other actions. For network, unless we implement custom network
drivers for containers, the bandwidth is shared between actions; one action
can congest the network, impacting other actions as well. Disk I/O, same
problem.

So my point is that without monitoring, resource isolation (beyond memory)
remains theoretical at this point.

In an ideal picture OW would closely monitor any available parameters when
invoking actions - through tracing, container monitoring, etc, anything
that's available. Then through machine learning OW could learn what a
normal "SLA" for an action is, maybe by simply learning the normal
distribution of response times, if CPU and other parameters are too much to
analyze. Then, if the action doesn't behave normally for an Nth percentile,
take 2 courses of action:
1) observe if the action has been impacted by other actions, and
re-schedule it on other VMs if that's the case. Today OW tries to achieve
some isolation through load balancer and invoker settings, but the rules
are not dynamic.
2) otherwise, notify the developer that an anomaly is happening for one of
the actions
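To illustrate the response-time idea above - a rough sketch only; the 95th percentile, the 20-sample minimum and the latencies are made-up numbers, and the same check could use CPU or memory instead:

    from statistics import quantiles

    def is_anomalous(history_ms, latest_ms, pct=95):
        # Flag an activation whose latency falls outside the action's
        # learned Nth-percentile baseline.
        if len(history_ms) < 20:          # not enough data to learn a baseline
            return False
        threshold = quantiles(history_ms, n=100)[pct - 1]
        return latest_ms > threshold

    baseline = [40, 42, 45, 41, 39, 44, 43, 46, 40, 42,
                41, 45, 44, 43, 42, 40, 41, 39, 44, 43]
    print(is_anomalous(baseline, 42))    # normal -> False
    print(is_anomalous(baseline, 400))   # outlier -> True: re-schedule or notify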

These examples are out of the scope for the current proposal. I only shared
them so that we don't take monitoring out of the picture later. It's worth
a separate conversation on this DL, and it's not as pressing as the
performance topic is right now.

Dragos


On Thu, Jul 6, 2017 at 4:40 AM Michael M Behrendt <
Michaelbehrendt@de.ibm.com> wrote:

> thx for clarifying, very helpful. The approach you described could be
> really interesting. I was thrown off by Dragos' comment saying:
>
>  "What stops Openwhisk to be smart in observing the response times, CPU
> consumption memory consumption of the running containers ? Doing so it
> could learn automatically how many concurrent requests 1 action can
> handle."
>
> ...which in my mind would have implied a much higher level of complexity
> and traditional behavior than what you described.
>
> Dragos,
> did I misinterpret you?
>
>
>
> Thanks & best regards
> Michael
>
>
>
>
> From:   Rodric Rabbah <ro...@gmail.com>
> To:     dev@openwhisk.apache.org
> Date:   07/06/2017 01:04 PM
> Subject:        Re: Improving support for UI driven use cases
>
>
>
> The prototype PR from Tyson was based on a fixed capacity of concurrent
> activations per container. From that, I presume once the limit is reached,
> the load balancer would roll over to allocate a new container.
>
> -r
>
> > On Jul 6, 2017, at 6:09 AM, Michael M Behrendt
> <Mi...@de.ibm.com> wrote:
> >
> > Hi Michael,
> >
> > thx for checking. I wasn't referring to adding/removing VMs, but rather
> > activation containers. In today's model that is done intrinsically,
> while
> > I *think* in what Dragos described, the containers would have to be
> > monitored somehow so this new component can decide (based on
> > cpu/mem/io/etc load within the containers) when to add/remove
> containers.
> >
> >
> > Thanks & best regards
> > Michael
>
>
>
>
>
>

Re: Improving support for UI driven use cases

Posted by Michael M Behrendt <Mi...@de.ibm.com>.
thx for clarifying, very helpful. The approach you described could be 
really interesting. I was thrown off by Dragos' comment saying:

 "What stops Openwhisk to be smart in observing the response times, CPU 
consumption memory consumption of the running containers ? Doing so it 
could learn automatically how many concurrent requests 1 action can 
handle."

...which in my mind would have implied a much higher level of complexity 
and traditional behavior than what you described.

Dragos,
did I misinterpret you?



Thanks & best regards
Michael




From:   Rodric Rabbah <ro...@gmail.com>
To:     dev@openwhisk.apache.org
Date:   07/06/2017 01:04 PM
Subject:        Re: Improving support for UI driven use cases



The prototype PR from Tyson was based on a fixed capacity of concurrent 
activations per container. From that, I presume once the limit is reached, 
the load balancer would roll over to allocate a new container.

-r

> On Jul 6, 2017, at 6:09 AM, Michael M Behrendt 
<Mi...@de.ibm.com> wrote:
> 
> Hi Michael,
> 
> thx for checking. I wasn't referring to adding/removing VMs, but rather 
> activation containers. In today's model that is done intrinsically, 
while 
> I *think* in what Dragos described, the containers would have to be 
> monitored somehow so this new component can decide (based on 
> cpu/mem/io/etc load within the containers) when to add/remove 
containers.
> 
> 
> Thanks & best regards
> Michael






Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
The prototype PR from Tyson was based on a fixed capacity of concurrent activations per container. From that, I presume once the limit is reached, the load balancer would roll over to allocate a new container.

-r

> On Jul 6, 2017, at 6:09 AM, Michael M Behrendt <Mi...@de.ibm.com> wrote:
> 
> Hi Michael,
> 
> thx for checking. I wasn't referring to adding/removing VMs, but rather 
> activation containers. In today's model that is done intrinsically, while 
> I *think* in what Dragos described, the containers would have to be 
> monitored somehow so this new component can decide (based on 
> cpu/mem/io/etc load within the containers) when to add/remove containers.
> 
> 
> Thanks & best regards
> Michael

Re: Improving support for UI driven use cases

Posted by Michael M Behrendt <Mi...@de.ibm.com>.
Hi Michael,

thx for checking. I wasn't referring to adding/removing VMs, but rather 
activation containers. In today's model that is done intrinsically, while 
I *think* in what Dragos described, the containers would have to be 
monitored somehow so this new component can decide (based on 
cpu/mem/io/etc load within the containers) when to add/remove containers.


Thanks & best regards
Michael

---------------------------
IBM Distinguished Engineer
Chief Architect, Serverless / FaaS & OpenWhisk
Mobile: +49-170-7993527
michaelbehrendt@de.ibm.com |  @michael_beh

IBM Deutschland Research & Development GmbH / Vorsitzender des 
Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp 
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, 
HRB 243294 



From:   Michael Marth <mm...@adobe.com.INVALID>
To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
Date:   07/05/2017 08:28 PM
Subject:        Re: Improving support for UI driven use cases



Hi Michael,

To make sure we mean the same thing with the word “autoscaling” in the 
context of this thread and in the context of OpenWhisk: I refer to the 
(automated) increase/decrease of the VMs that run the action containers.
Is that what you also refer to?

If so, then the proposal at hand is orthogonal to autoscaling. At its core 
it is about increasing the density of executing actions within one 
container and in that sense independent of how many containers, VMs, etc 
there are in the system or how the system is shrunk/grown.

In practical terms there is still a connection between proposal and 
scaling the VMs: if the density of executing actions is increased by 
orders of magnitude then the topic of scaling the VMs becomes a much less 
pressing topic (at least for the types of workload I described 
previously). But this practical consideration should not be mistaken for 
this being a discussion of autoscaling.

Please let me know if I misunderstood your use of the term autoscaling or 
if the above does not explain well.

Thanks!
Michael 




On 05/07/17 16:57, "Michael M Behrendt" <Mi...@de.ibm.com> 
wrote:

>
>
>Hi Michael/Rodric,
>
>I'm struggling to understand how a separate invoker pool helps us 
avoiding
>to implement traditional autoscaling if we process multiple activations 
as
>threads within a shared process. Can you pls elaborate / provide an
>example?
>
>Sent from my iPhone
>
>> On 5. Jul 2017, at 16:53, Michael Marth <mm...@adobe.com.INVALID> 
wrote:
>>
>> Michael B,
>> Re your question: exactly what Rodric said :)
>>
>>
>>
>>> On 05/07/17 12:32, "Rodric Rabbah" <ro...@gmail.com> wrote:
>>>
>>> The issue at hand is precisely because there isn't any autoscaling of
>capacity (N invokers provide M containers per invoker). Once all those
>slots are consumed any new requests are queued - as previously discussed.
>>>
>>> Adding more density per vm is one way of providing additional capacity
>over finite resources. This is the essence of the initial proposal.
>>>
>>> As noted in previous discussions on this topic, this should be viewed 
as
>managing a different resource pool (and not the same pool of containers 
as
>ephemeral actions). Once you buy into that, generalization to other
>resource pools becomes natural.
>>>
>>> Going further, serverless becomes the new PaaS.
>>>
>>> -r
>>>
>>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
><Mi...@de.ibm.com> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>>
>>>> I might not yet have fully gotten my head around Steve's proposal --
>what
>>>> are your thoughts on how this would help avoiding the 
reimplementation
>of
>>>> an autoscaling / feedback loop mechanism, as we know it from more
>>>> traditional runtime platforms?
>>>>
>>>>
>>>> Thanks & best regards
>>>> Michael
>>>>
>>>>
>>>>
>>>> From:   Michael Marth <mm...@adobe.com.INVALID>
>>>> To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
>>>> Date:   07/05/2017 11:25 AM
>>>> Subject:        Re: Improving support for UI driven use cases
>>>>
>>>>
>>>>
>>>> Hi Michael,
>>>>
>>>> Totally agree with your statement
>>>> "value prop of serverless is that folks don't have to care about 
that"
>>>>
>>>> Again, the proposal at hand does not intend to change that at all. On
>the
>>>> contrary - in our mind it's a requirement that the developer should 
not
>
>>>> change or that internals of the execution engines get exposed.
>>>>
>>>> I find Stephen's comment about generalising the runtime behaviour 
very
>>>> exciting. It could open the door to very different types of workloads
>>>> (like training Tensorflow or running Spark jobs), but with the same
>value
>>>> prop: users do not have to care about the managing resources/servers.
>And
>>>> for providers of OW systems all the OW goodies would still apply 
(e.g.
>>>> running untrusted code). Moreover, if we split the Invoker into
>different
>>>> specialised Invokers then those different specialised workloads could
>live
>>>> independently from each other (in terms of code as well as resource
>>>> allocation in deployments).
>>>> You can probably tell I am really excited about Stephen's idea :) I
>think
>>>> it would be a great step forward in increasing the use cases for OW.
>>>>
>>>> Cheers
>>>> Michael
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com>
>>>> wrote:
>>>>
>>>>> Hi Dragos,
>>>>>
>>>>>> What stops
>>>>>> Openwhisk to be smart in observing the response times, CPU
>consumption,
>>>>>> memory consumption of the running containers ?
>>>>>
>>>>> What are your thoughts on how this approach would be different from
>the
>>>> many IaaS- and PaaS-centric autoscaling solutions that have been 
built
>>>> over the last years? All of them require relatively complex policies
>(eg
>>>> scale based on cpu or mem utilization, end-user response time, etc.?
>What
>>>> are the thresholds for when to add/remove capacity?), and a value 
prop
>of
>>>> serverless is that folks don't have to care about that.
>>>>>
>>>>> we should discuss more during the call, but wanted to get this out 
as
>>>> food for thought.
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> 
wrote:
>>>>>
>>>>>>> How could a developer understand how many requests per container 
to
>>>> set
>>>>>>
>>>>>> James, this is a good point, along with the other points in your
>email.
>>>>>>
>>>>>> I think the developer doesn't need to know this info actually. What
>>>> stops
>>>>>> Openwhisk to be smart in observing the response times, CPU
>consumption,
>>>>>> memory consumption of the running containers ? Doing so it could
>learn
>>>>>> automatically how many concurrent requests 1 action can handle. It
>>>> might be
>>>>>> easier to solve this problem efficiently, instead of the other
>problem
>>>>>> which pushes the entire system to its limits when a couple of 
actions
>
>>>> get a
>>>>>> lot of traffic.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas 
<jt...@gmail.com>
>>>> wrote:
>>>>>>>
>>>>>>> +1 on Markus' points about "crash safety" and "scaling". I can
>>>> understand
>>>>>>> the reasons behind exploring this change but from a developer
>>>> experience
>>>>>>> point of view this introduces a large amount of complexity to
>the
>>>>>>> programming model.
>>>>>>>
>>>>>>> If I have a concurrent container serving 100 requests and one of 
the
>>>>>>> requests triggers a fatal error how does that affect the other
>>>> requests?
>>>>>>> Tearing down the entire runtime environment will destroy all those
>>>>>>> requests.
>>>>>>>
>>>>>>> How could a developer understand how many requests per container 
to
>>>> set
>>>>>>> without a manual trial and error process? It also means you have 
to
>>>> start
>>>>>>> considering things like race conditions or other challenges of
>>>> concurrent
>>>>>>> code execution. This makes debugging and monitoring also more
>>>> challenging.
>>>>>>>
>>>>>>> Looking at the other serverless providers, I've not seen this
>feature
>>>>>>> requested before. Developers generally ask AWS to raise the
>concurrent
>>>>>>> invocations limit for their application. This keeps the platform
>doing
>>>> the
>>>>>>> hard task of managing resources and being efficient and allows 
them
>to
>>>> use
>>>>>>> the same programming model.
>>>>>>>
>>>>>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com>
>>>> wrote:
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>> To Rodric's points I think there are two topics to speak about and
>>>> discuss:
>>>>>>>>
>>>>>>>> 1. The programming model: The current model encourages users to
>break
>>>>>>>> their actions apart in "functions" that take payload and return
>>>> payload.
>>>>>>>> Having a deployment model outlined could as noted encourage users
>to
>>>> use
>>>>>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual 
webserver
>
>>>> based
>>>>>>>> applications. The current model is nice in that it solves a lot 
of
>>>>>>> problems
>>>>>>>> for the customer in terms of scalability and "crash safeness".
>>>>>>>>
>>>>>>>> 2. Raw throughput of our deployment model: Setting the concerns
>aside
>>>> I
>>>>>>>> think it is valid to explore concurrent invocations of actions on
>the
>>>>>>> same
>>>>>>>> container. This does not necessarily mean that users start to
>deploy
>>>>>>>> monolithic apps as noted above, but it certainly could. Keeping 
our
>>>>>>>> JSON-in/JSON-out at least for now though, could encourage users 
to
>>>>>>> continue
>>>>>>>> to think in functions. Having a toggle per action which is 
disabled
>
>>>> by
>>>>>>>> default might be a good way to start here, since many users might
>>>> need to
>>>>>>>> change action code to support that notion and for some 
applications
>
>>>> it
>>>>>>>> might not be valid at all. I think it was also already noted, 
that
>>>> this
>>>>>>>> imposes some of the "old-fashioned" problems on the user, like: 
How
>
>>>> many
>>>>>>>> concurrent requests will my action be able to handle? That kinda
>>>> defeats
>>>>>>>> the seamless-scalability point of serverless.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Markus
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> James Thomas
>>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>





Re: Improving support for UI driven use cases

Posted by Michael Marth <mm...@adobe.com.INVALID>.
Hi Michael,

To make sure we mean the same thing with the word “autoscaling” in the context of this thread and in the context of OpenWhisk: I refer to the (automated) increase/decrease of the VMs that run the action containers.
Is that what you also refer to?

If so, then the proposal at hand is orthogonal to autoscaling. At its core it is about increasing the density of executing actions within one container and in that sense independent of how many containers, VMs, etc there are in the system or how the system is shrunk/grown.

In practical terms there is still a connection between proposal and scaling the VMs: if the density of executing actions is increased by orders of magnitude then the topic of scaling the VMs becomes a much less pressing topic (at least for the types of workload I described previously). But this practical consideration should not be mistaken for this being a discussion of autoscaling.

Please let me know if I misunderstood your use of the term autoscaling or if the above does not explain well.

Thanks!
Michael 




On 05/07/17 16:57, "Michael M Behrendt" <Mi...@de.ibm.com> wrote:

>
>
>Hi Michael/Rodric,
>
>I'm struggling to understand how a separate invoker pool helps us avoiding
>to implement traditional autoscaling if we process multiple activations as
>threads within a shared process. Can you pls elaborate / provide an
>example?
>
>Sent from my iPhone
>
>> On 5. Jul 2017, at 16:53, Michael Marth <mm...@adobe.com.INVALID> wrote:
>>
>> Michael B,
>> Re your question: exactly what Rodric said :)
>>
>>
>>
>>> On 05/07/17 12:32, "Rodric Rabbah" <ro...@gmail.com> wrote:
>>>
>>> The issue at hand is precisely because there isn't any autoscaling of
>capacity (N invokers provide M containers per invoker). Once all those
>slots are consumed any new requests are queued - as previously discussed.
>>>
>>> Adding more density per vm is one way of providing additional capacity
>over finite resources. This is the essence of the initial proposal.
>>>
>>> As noted in previous discussions on this topic, this should be viewed as
>managing a different resource pool (and not the same pool of containers as
>ephemeral actions). Once you buy into that, generalization to other
>resource pools becomes natural.
>>>
>>> Going further, serverless becomes the new PaaS.
>>>
>>> -r
>>>
>>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
><Mi...@de.ibm.com> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>>
>>>> I might not yet have fully gotten my head around Steve's proposal --
>what
>>>> are your thoughts on how this would help avoiding the reimplementation
>of
>>>> an autoscaling / feedback loop mechanism, as we know it from more
>>>> traditional runtime platforms?
>>>>
>>>>
>>>> Thanks & best regards
>>>> Michael
>>>>
>>>>
>>>>
>>>> From:   Michael Marth <mm...@adobe.com.INVALID>
>>>> To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
>>>> Date:   07/05/2017 11:25 AM
>>>> Subject:        Re: Improving support for UI driven use cases
>>>>
>>>>
>>>>
>>>> Hi Michael,
>>>>
>>>> Totally agree with your statement
>>>> "value prop of serverless is that folks don't have to care about that"
>>>>
>>>> Again, the proposal at hand does not intend to change that at all. On
>the
>>>> contrary - in our mind it's a requirement that the developer should not
>
>>>> change or that internals of the execution engines get exposed.
>>>>
>>>> I find Stephen's comment about generalising the runtime behaviour very
>>>> exciting. It could open the door to very different types of workloads
>>>> (like training Tensorflow or running Spark jobs), but with the same
>value
>>>> prop: users do not have to care about the managing resources/servers.
>And
>>>> for providers of OW systems all the OW goodies would still apply (e.g.
>>>> running untrusted code). Moreover, if we split the Invoker into
>different
>>>> specialised Invokers then those different specialised workloads could
>live
>>>> independently from each other (in terms of code as well as resource
>>>> allocation in deployments).
>>>> You can probably tell I am really excited about Stephen's idea :) I
>think
>>>> it would be a great step forward in increasing the use cases for OW.
>>>>
>>>> Cheers
>>>> Michael
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com>
>>>> wrote:
>>>>
>>>>> Hi Dragos,
>>>>>
>>>>>> What stops
>>>>>> Openwhisk to be smart in observing the response times, CPU
>consumption,
>>>>>> memory consumption of the running containers ?
>>>>>
>>>>> What are your thoughts on how this approach would be different from
>the
>>>> many IaaS- and PaaS-centric autoscaling solutions that have been built
>>>> over the last years? All of them require relatively complex policies
>(eg
>>>> scale based on cpu or mem utilization, end-user response time, etc.?
>What
>>>> are the thresholds for when to add/remove capacity?), and a value prop
>of
>>>> serverless is that folks don't have to care about that.
>>>>>
>>>>> we should discuss more during the call, but wanted to get this out as
>>>> food for thought.
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:
>>>>>
>>>>>>> How could a developer understand how many requests per container to
>>>> set
>>>>>>
>>>>>> James, this is a good point, along with the other points in your
>email.
>>>>>>
>>>>>> I think the developer doesn't need to know this info actually. What
>>>> stops
>>>>>> Openwhisk to be smart in observing the response times, CPU
>consumption,
>>>>>> memory consumption of the running containers ? Doing so it could
>learn
>>>>>> automatically how many concurrent requests 1 action can handle. It
>>>> might be
>>>>>> easier to solve this problem efficiently, instead of the other
>problem
>>>>>> which pushes the entire system to its limits when a couple of actions
>
>>>> get a
>>>>>> lot of traffic.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com>
>>>> wrote:
>>>>>>>
>>>>>>> +1 on Markus' points about "crash safety" and "scaling". I can
>>>> understand
>>>>>>> the reasons behind exploring this change but from a developer
>>>> experience
>>>>>>> point of view this introduces a large amount of complexity to
>the
>>>>>>> programming model.
>>>>>>>
>>>>>>> If I have a concurrent container serving 100 requests and one of the
>>>>>>> requests triggers a fatal error how does that affect the other
>>>> requests?
>>>>>>> Tearing down the entire runtime environment will destroy all those
>>>>>>> requests.
>>>>>>>
>>>>>>> How could a developer understand how many requests per container to
>>>> set
>>>>>>> without a manual trial and error process? It also means you have to
>>>> start
>>>>>>> considering things like race conditions or other challenges of
>>>> concurrent
>>>>>>> code execution. This makes debugging and monitoring also more
>>>> challenging.
>>>>>>>
>>>>>>> Looking at the other serverless providers, I've not seen this
>feature
>>>>>>> requested before. Developers generally ask AWS to raise the
>concurrent
>>>>>>> invocations limit for their application. This keeps the platform
>doing
>>>> the
>>>>>>> hard task of managing resources and being efficient and allows them
>to
>>>> use
>>>>>>> the same programming model.
>>>>>>>
>>>>>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com>
>>>> wrote:
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>> To Rodric's points I think there are two topics to speak about and
>>>> discuss:
>>>>>>>>
>>>>>>>> 1. The programming model: The current model encourages users to
>break
>>>>>>>> their actions apart in "functions" that take payload and return
>>>> payload.
>>>>>>>> Having a deployment model outlined could as noted encourage users
>to
>>>> use
>>>>>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver
>
>>>> based
>>>>>>>> applications. The current model is nice in that it solves a lot of
>>>>>>> problems
>>>>>>>> for the customer in terms of scalability and "crash safeness".
>>>>>>>>
>>>>>>>> 2. Raw throughput of our deployment model: Setting the concerns
>aside
>>>> I
>>>>>>>> think it is valid to explore concurrent invocations of actions on
>the
>>>>>>> same
>>>>>>>> container. This does not necessarily mean that users start to
>deploy
>>>>>>>> monolithic apps as noted above, but it certainly could. Keeping our
>>>>>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>>>>>> continue
>>>>>>>> to think in functions. Having a toggle per action which is disabled
>
>>>> by
>>>>>>>> default might be a good way to start here, since many users might
>>>> need to
>>>>>>>> change action code to support that notion and for some applications
>
>>>> it
>>>>>>>> might not be valid at all. I think it was also already noted, that
>>>> this
>>>>>>>> imposes some of the "old-fashioned" problems on the user, like: How
>
>>>> many
>>>>>>>> concurrent requests will my action be able to handle? That kinda
>>>> defeats
>>>>>>>> the seamless-scalability point of serverless.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Markus
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> James Thomas
>>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>

Re: Improving support for UI driven use cases

Posted by Michael Marth <mm...@adobe.com.INVALID>.
Hi Alex,

That is a very interesting question.
If the programming model and the guarantees exposed to developers include guarantees on the amount of memory, then I still see two options (rough sketch below):
- reserve the full capacity (i.e. the current model)
- or “overbook” a container. (not exactly the spirit of the current proposal but would lead to similar results)
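A tiny back-of-the-envelope comparison of the two options - the 256 MB limit and the concurrency of 50 are made-up numbers:

    action_memory_mb = 256   # memory limit the developer asked for
    concurrency = 50         # concurrent activations allowed in one container

    # Option 1: reserve the full capacity per activation (today's guarantee)
    reserved_mb = action_memory_mb * concurrency    # 12800 MB per container

    # Option 2: "overbook" - concurrent activations share one allotment
    overbooked_mb = action_memory_mb                # 256 MB per container

    print(reserved_mb, overbooked_mb)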
This leads into a more product-management-like discussion of whether asking developers to specify the amount of RAM they desire is a good thing in the first place. In the spirit of “devs shall not care about infra” it might be preferable to not even make devs think about that, and just execute the code with just enough RAM (or whatever resources are needed).
You could look at the fact that some serverless providers expose RAM, etc. to developers as actually breaking the abstraction and working against the core value prop.
TBH I am not sure there is a “right” way to look at this topic. It might depend on the circumstances of the OW deployment.

Michael






On 05/07/17 17:45, "Alex Glikson" <GL...@il.ibm.com> wrote:

>Once different 'flavors' of pools/invokers are supported, one could 
>implement whatever policy for resource allocation and/or isolation and/or 
>load balancing they want in an invoker (or group of invokers) - without 
>necessarily affecting the 'core' of OpenWhisk, as long as the programming 
>model remains the same.
>However, with containers handling multiple requests, I am not sure that 
>the latter will be still true -- in particular, whether the developer can 
>still assume dedicated resource allocation per action invocation 
>(primarily memory), or we would also need to surface heterogeneous 
>'flavors' of resources allocated for an action (which might be perceived 
>as a natural and good thing - or maybe the opposite, given that we are 
>trying to make the developer unaware of infrastructure).
>
>Regards,
>Alex
>
>
>
>
>From:   "Michael M Behrendt" <Mi...@de.ibm.com>
>To:     dev@openwhisk.apache.org
>Date:   05/07/2017 05:58 PM
>Subject:        Re: Improving support for UI driven use cases
>
>
>
>
>
>Hi Michael/Rodric,
>
>I'm struggling to understand how a separate invoker pool helps us avoiding
>to implement traditional autoscaling if we process multiple activations as
>threads within a shared process. Can you pls elaborate / provide an
>example?
>
>Sent from my iPhone
>
>> On 5. Jul 2017, at 16:53, Michael Marth <mm...@adobe.com.INVALID> 
>wrote:
>>
>> Michael B,
>> Re your question: exactly what Rodric said :)
>>
>>
>>
>>> On 05/07/17 12:32, "Rodric Rabbah" <ro...@gmail.com> wrote:
>>>
>>> The issue at hand is precisely because there isn't any autoscaling of
>capacity (N invokers provide M containers per invoker). Once all those
>slots are consumed any new requests are queued - as previously discussed.
>>>
>>> Adding more density per vm is one way of providing additional capacity
>over finite resources. This is the essence of the initial proposal.
>>>
>>> As noted in previous discussions on this topic, this should be viewed 
>as
>managing a different resource pool (and not the same pool of containers as
>ephemeral actions). Once you buy into that, generalization to other
>resource pools becomes natural.
>>>
>>> Going further, serverless becomes the new PaaS.
>>>
>>> -r
>>>
>>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
><Mi...@de.ibm.com> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>>
>>>> I might not yet have fully gotten my head around Steve's proposal --
>what
>>>> are your thoughts on how this would help avoiding the reimplementation
>of
>>>> an autoscaling / feedback loop mechanism, as we know it from more
>>>> traditional runtime platforms?
>>>>
>>>>
>>>> Thanks & best regards
>>>> Michael
>>>>
>>>>
>>>>
>>>> From:   Michael Marth <mm...@adobe.com.INVALID>
>>>> To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
>>>> Date:   07/05/2017 11:25 AM
>>>> Subject:        Re: Improving support for UI driven use cases
>>>>
>>>>
>>>>
>>>> Hi Michael,
>>>>
>>>> Totally agree with your statement
>>>> "value prop of serverless is that folks don't have to care about that"
>>>>
>>>> Again, the proposal at hand does not intend to change that at all. On
>the
>>>> contrary - in our mind it's a requirement that the developer should 
>not
>
>>>> change or that internals of the execution engines get exposed.
>>>>
>>>> I find Stephen's comment about generalising the runtime behaviour very
>>>> exciting. It could open the door to very different types of workloads
>>>> (like training Tensorflow or running Spark jobs), but with the same
>value
>>>> prop: users do not have to care about the managing resources/servers.
>And
>>>> for providers of OW systems all the OW goodies would still apply (e.g.
>>>> running untrusted code). Moreover, if we split the Invoker into
>different
>>>> specialised Invokers then those different specialised workloads could
>live
>>>> independently from each other (in terms of code as well as resource
>>>> allocation in deployments).
>>>> You can probably tell I am really excited about Stephen's idea :) I
>think
>>>> it would be a great step forward in increasing the use cases for OW.
>>>>
>>>> Cheers
>>>> Michael
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com>
>>>> wrote:
>>>>
>>>>> Hi Dragos,
>>>>>
>>>>>> What stops
>>>>>> Openwhisk to be smart in observing the response times, CPU
>consumption,
>>>>>> memory consumption of the running containers ?
>>>>>
>>>>> What are your thoughts on how this approach would be different from
>the
>>>> many IaaS- and PaaS-centric autoscaling solutions that have been built
>>>> over the last years? All of them require relatively complex policies
>(eg
>>>> scale based on cpu or mem utilization, end-user response time, etc.?
>What
>>>> are the thresholds for when to add/remove capacity?), and a value prop
>of
>>>> serverless is that folks don't have to care about that.
>>>>>
>>>>> we should discuss more during the call, but wanted to get this out as
>>>> food for thought.
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> 
>wrote:
>>>>>
>>>>>>> How could a developer understand how many requests per container to
>>>> set
>>>>>>
>>>>>> James, this is a good point, along with the other points in your
>email.
>>>>>>
>>>>>> I think the developer doesn't need to know this info actually. What
>>>> stops
>>>>>> Openwhisk to be smart in observing the response times, CPU
>consumption,
>>>>>> memory consumption of the running containers ? Doing so it could
>learn
>>>>>> automatically how many concurrent requests 1 action can handle. It
>>>> might be
>>>>>> easier to solve this problem efficiently, instead of the other
>problem
>>>>>> which pushes the entire system to its limits when a couple of 
>actions
>
>>>> get a
>>>>>> lot of traffic.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com>
>>>> wrote:
>>>>>>>
>>>>>>> +1 on Markus' points about "crash safety" and "scaling". I can
>>>> understand
>>>>>>> the reasons behind exploring this change but from a developer
>>>> experience
>>>>>>> point of view this introduces a large amount of complexity to
>the
>>>>>>> programming model.
>>>>>>>
>>>>>>> If I have a concurrent container serving 100 requests and one of 
>the
>>>>>>> requests triggers a fatal error how does that affect the other
>>>> requests?
>>>>>>> Tearing down the entire runtime environment will destroy all those
>>>>>>> requests.
>>>>>>>
>>>>>>> How could a developer understand how many requests per container to
>>>> set
>>>>>>> without a manual trial and error process? It also means you have to
>>>> start
>>>>>>> considering things like race conditions or other challenges of
>>>> concurrent
>>>>>>> code execution. This makes debugging and monitoring also more
>>>> challenging.
>>>>>>>
>>>>>>> Looking at the other serverless providers, I've not seen this
>feature
>>>>>>> requested before. Developers generally ask AWS to raise the
>concurrent
>>>>>>> invocations limit for their application. This keeps the platform
>doing
>>>> the
>>>>>>> hard task of managing resources and being efficient and allows them
>to
>>>> use
>>>>>>> the same programming model.
>>>>>>>
>>>>>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com>
>>>> wrote:
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>> To Rodric's points I think there are two topics to speak about and
>>>> discuss:
>>>>>>>>
>>>>>>>> 1. The programming model: The current model encourages users to
>break
>>>>>>>> their actions apart in "functions" that take payload and return
>>>> payload.
>>>>>>>> Having a deployment model outlined could as noted encourage users
>to
>>>> use
>>>>>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual 
>webserver
>
>>>> based
>>>>>>>> applications. The current model is nice in that it solves a lot of
>>>>>>> problems
>>>>>>>> for the customer in terms of scalability and "crash safeness".
>>>>>>>>
>>>>>>>> 2. Raw throughput of our deployment model: Setting the concerns
>aside
>>>> I
>>>>>>>> think it is valid to explore concurrent invocations of actions on
>the
>>>>>>> same
>>>>>>>> container. This does not necessarily mean that users start to
>deploy
>>>>>>>> monolithic apps as noted above, but it certainly could. Keeping 
>our
>>>>>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>>>>>> continue
>>>>>>>> to think in functions. Having a toggle per action which is 
>disabled
>
>>>> by
>>>>>>>> default might be a good way to start here, since many users might
>>>> need to
>>>>>>>> change action code to support that notion and for some 
>applications
>
>>>> it
>>>>>>>> might not be valid at all. I think it was also already noted, that
>>>> this
>>>>>>>> imposes some of the "old-fashioned" problems on the user, like: 
>How
>
>>>> many
>>>>>>>> concurrent requests will my action be able to handle? That kinda
>>>> defeats
>>>>>>>> the seamless-scalability point of serverless.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Markus
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> James Thomas
>>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>
>
>
>

Re: Improving support for UI driven use cases

Posted by Alex Glikson <GL...@il.ibm.com>.
Once different 'flavors' of pools/invokers are supported, one could 
implement whatever policy for resource allocation and/or isolation and/or 
load balancing they want in an invoker (or group of invokers) - without 
necessarily affecting the 'core' of OpenWhisk, as long as the programming 
model remains the same.
However, with containers handling multiple requests, I am not sure that 
the latter will still be true -- in particular, whether the developer can 
still assume dedicated resource allocation per action invocation 
(primarily memory), or we would also need to surface heterogeneous 
'flavors' of resources allocated for an action (which might be perceived 
as a natural and good thing - or maybe the opposite, given that we are 
trying to make the developer unaware of infrastructure).

Regards,
Alex




From:   "Michael M Behrendt" <Mi...@de.ibm.com>
To:     dev@openwhisk.apache.org
Date:   05/07/2017 05:58 PM
Subject:        Re: Improving support for UI driven use cases





Hi Michael/Rodric,

I'm struggling to understand how a separate invoker pool helps us avoiding
to implement traditional autoscaling if we process multiple activations as
threads within a shared process. Can you pls elaborate / provide an
example?

Sent from my iPhone

> On 5. Jul 2017, at 16:53, Michael Marth <mm...@adobe.com.INVALID> 
wrote:
>
> Michael B,
> Re your question: exactly what Rodric said :)
>
>
>
>> On 05/07/17 12:32, "Rodric Rabbah" <ro...@gmail.com> wrote:
>>
>> The issue at hand is precisely because there isn't any autoscaling of
capacity (N invokers provide M containers per invoker). Once all those
slots are consumed any new requests are queued - as previously discussed.
>>
>> Adding more density per vm is one way of providing additional capacity
over finite resources. This is the essence of the initial proposal.
>>
>> As noted in previous discussions on this topic, this should be viewed 
as
managing a different resource pool (and not the same pool of containers as
ephemeral actions). Once you buy into that, generalization to other
resource pools becomes natural.
>>
>> Going further, serverless becomes the new PaaS.
>>
>> -r
>>
>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
<Mi...@de.ibm.com> wrote:
>>>
>>> Hi Michael,
>>>
>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>
>>> I might not yet have fully gotten my head around Steve's proposal --
what
>>> are your thoughts on how this would help avoiding the reimplementation
of
>>> an autoscaling / feedback loop mechanism, as we know it from more
>>> traditional runtime platforms?
>>>
>>>
>>> Thanks & best regards
>>> Michael
>>>
>>>
>>>
>>> From:   Michael Marth <mm...@adobe.com.INVALID>
>>> To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
>>> Date:   07/05/2017 11:25 AM
>>> Subject:        Re: Improving support for UI driven use cases
>>>
>>>
>>>
>>> Hi Michael,
>>>
>>> Totally agree with your statement
>>> "value prop of serverless is that folks don't have to care about that"
>>>
>>> Again, the proposal at hand does not intend to change that at all. On
the
>>> contrary - in our mind it's a requirement that the developer should 
not

>>> change or that internals of the execution engines get exposed.
>>>
>>> I find Stephen's comment about generalising the runtime behaviour very
>>> exciting. It could open the door to very different types of workloads
>>> (like training Tensorflow or running Spark jobs), but with the same
value
>>> prop: users do not have to care about the managing resources/servers.
And
>>> for providers of OW systems all the OW goodies would still apply (e.g.
>>> running untrusted code). Moreover, if we split the Invoker into
different
>>> specialised Invokers then those different specialised workloads could
live
>>> independently from each other (in terms of code as well as resource
>>> allocation in deployments).
>>> You can probably tell I am really excited about Stephen's idea :) I
think
>>> it would be a great step forward in increasing the use cases for OW.
>>>
>>> Cheers
>>> Michael
>>>
>>>
>>>
>>>
>>>
>>> On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com>
>>> wrote:
>>>
>>>> Hi Dragos,
>>>>
>>>>> What stops
>>>>> Openwhisk to be smart in observing the response times, CPU
consumption,
>>>>> memory consumption of the running containers ?
>>>>
>>>> What are your thoughts on how this approach would be different from
the
>>> many IaaS- and PaaS-centric autoscaling solutions that have been built
>>> over the last years? All of them require relatively complex policies
(eg
>>> scale based on cpu or mem utilization, end-user response time, etc.?
What
>>> are the thresholds for when to add/remove capacity?), and a value prop
of
>>> serverless is that folks don't have to care about that.
>>>>
>>>> we should discuss more during the call, but wanted to get this out as
>>> food for thought.
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> 
wrote:
>>>>
>>>>>> How could a developer understand how many requests per container to
>>> set
>>>>>
>>>>> James, this is a good point, along with the other points in your
email.
>>>>>
>>>>> I think the developer doesn't need to know this info actually. What
>>> stops
>>>>> Openwhisk to be smart in observing the response times, CPU
consumption,
>>>>> memory consumption of the running containers ? Doing so it could
learn
>>>>> automatically how many concurrent requests 1 action can handle. It
>>> might be
>>>>> easier to solve this problem efficiently, instead of the other
problem
>>>>> which pushes the entire system to its limits when a couple of 
actions

>>> get a
>>>>> lot of traffic.
>>>>>
>>>>>
>>>>>
>>>>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com>
>>> wrote:
>>>>>>
>>>>>> +1 on Markus' points about "crash safety" and "scaling". I can
>>> understand
>>>>>> the reasons behind exploring this change but from a developer
>>> experience
>>>>>> point of view this introduces a large amount of complexity to
the
>>>>>> programming model.
>>>>>>
>>>>>> If I have a concurrent container serving 100 requests and one of 
the
>>>>>> requests triggers a fatal error how does that affect the other
>>> requests?
>>>>>> Tearing down the entire runtime environment will destroy all those
>>>>>> requests.
>>>>>>
>>>>>> How could a developer understand how many requests per container to
>>> set
>>>>>> without a manual trial and error process? It also means you have to
>>> start
>>>>>> considering things like race conditions or other challenges of
>>> concurrent
>>>>>> code execution. This makes debugging and monitoring also more
>>> challenging.
>>>>>>
>>>>>> Looking at the other serverless providers, I've not seen this
feature
>>>>>> requested before. Developers generally ask AWS to raise the
concurrent
>>>>>> invocations limit for their application. This keeps the platform
doing
>>> the
>>>>>> hard task of managing resources and being efficient and allows them
to
>>> use
>>>>>> the same programming model.
>>>>>>
>>>>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com>
>>> wrote:
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>
>>>>>>>
>>>>>> To Rodric's points I think there are two topics to speak about and
>>> discuss:
>>>>>>>
>>>>>>> 1. The programming model: The current model encourages users to
break
>>>>>>> their actions apart in "functions" that take payload and return
>>> payload.
>>>>>>> Having a deployment model outlined could as noted encourage users
to
>>> use
>>>>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual 
webserver

>>> based
>>>>>>> applications. The current model is nice in that it solves a lot of
>>>>>> problems
>>>>>>> for the customer in terms of scalability and "crash safeness".
>>>>>>>
>>>>>>> 2. Raw throughput of our deployment model: Setting the concerns
aside
>>> I
>>>>>>> think it is valid to explore concurrent invocations of actions on
the
>>>>>> same
>>>>>>> container. This does not necessarily mean that users start to
deploy
>>>>>>> monolithic apps as noted above, but it certainly could. Keeping 
our
>>>>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>>>>> continue
>>>>>>> to think in functions. Having a toggle per action which is 
disabled

>>> by
>>>>>>> default might be a good way to start here, since many users might
>>> need to
>>>>>>> change action code to support that notion and for some 
applications

>>> it
>>>>>>> might not be valid at all. I think it was also already noted, that
>>> this
>>>>>>> imposes some of the "old-fashioned" problems on the user, like: 
How

>>> many
>>>>>>> concurrent requests will my action be able to handle? That kinda
>>> defeats
>>>>>>> the seamless-scalability point of serverless.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Markus
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> James Thomas
>>>>>>
>>>>
>>>
>>>
>>>
>>>





Re: Improving support for UI driven use cases

Posted by Michael M Behrendt <Mi...@de.ibm.com>.

Hi Michael/Rodric,

I'm struggling to understand how a separate invoker pool helps us avoiding
to implement traditional autoscaling if we process multiple activations as
threads within a shared process. Can you pls elaborate / provide an
example?

Sent from my iPhone

> On 5. Jul 2017, at 16:53, Michael Marth <mm...@adobe.com.INVALID> wrote:
>
> Michael B,
> Re your question: exactly what Rodric said :)
>
>
>
>> On 05/07/17 12:32, "Rodric Rabbah" <ro...@gmail.com> wrote:
>>
>> The issue at hand is precisely because there isn't any autoscaling of
capacity (N invokers provide M containers per invoker). Once all those
slots are consumed any new requests are queued - as previously discussed.
>>
>> Adding more density per vm is one way of providing additional capacity
over finite resources. This is the essence of the initial proposal.
>>
>> As noted in previous discussions on this topic, this should be viewed as
managing a different resource pool (and not the same pool of containers as
ephemeral actions). Once you buy into that, generalization to other
resource pools becomes natural.
>>
>> Going further, serverless becomes the new PaaS.
>>
>> -r
>>
>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
<Mi...@de.ibm.com> wrote:
>>>
>>> Hi Michael,
>>>
>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>
>>> I might not yet have fully gotten my head around Steve's proposal --
what
>>> are your thoughts on how this would help avoiding the reimplementation
of
>>> an autoscaling / feedback loop mechanism, as we know it from more
>>> traditional runtime platforms?
>>>
>>>
>>> Thanks & best regards
>>> Michael
>>>
>>>
>>>
>>> From:   Michael Marth <mm...@adobe.com.INVALID>
>>> To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
>>> Date:   07/05/2017 11:25 AM
>>> Subject:        Re: Improving support for UI driven use cases
>>>
>>>
>>>
>>> Hi Michael,
>>>
>>> Totally agree with your statement
>>> "value prop of serverless is that folks don't have to care about that"
>>>
>>> Again, the proposal at hand does not intend to change that at all. On
the
>>> contrary - in our mind it's a requirement that the developer should not

>>> change or that internals of the execution engines get exposed.
>>>
>>> I find Stephen's comment about generalising the runtime behaviour very
>>> exciting. It could open the door to very different types of workloads
>>> (like training Tensorflow or running Spark jobs), but with the same
value
>>> prop: users do not have to care about the managing resources/servers.
And
>>> for providers of OW systems all the OW goodies would still apply (e.g.
>>> running untrusted code). Moreover, if we split the Invoker into
different
>>> specialised Invokers then those different specialised workloads could
live
>>> independently from each other (in terms of code as well as resource
>>> allocation in deployments).
>>> You can probably tell I am really excited about Stephen's idea :) I
think
>>> it would be a great step forward in increasing the use cases for OW.
>>>
>>> Cheers
>>> Michael
>>>
>>>
>>>
>>>
>>>
>>> On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com>
>>> wrote:
>>>
>>>> Hi Dragos,
>>>>
>>>>> What stops
>>>>> Openwhisk to be smart in observing the response times, CPU
consumption,
>>>>> memory consumption of the running containers ?
>>>>
>>>> What are your thoughts on how this approach would be different from
the
>>> many IaaS- and PaaS-centric autoscaling solutions that have been built
>>> over the last years? All of them require relatively complex policies
(eg
>>> scale based on cpu or mem utilization, end-user response time, etc.?
What
>>> are the thresholds for when to add/remove capacity?), and a value prop
of
>>> serverless is that folks don't have to care about that.
>>>>
>>>> we should discuss more during the call, but wanted to get this out as
>>> food for thought.
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:
>>>>
>>>>>> How could a developer understand how many requests per container to
>>> set
>>>>>
>>>>> James, this is a good point, along with the other points in your
email.
>>>>>
>>>>> I think the developer doesn't need to know this info actually. What
>>> stops
>>>>> Openwhisk to be smart in observing the response times, CPU
consumption,
>>>>> memory consumption of the running containers ? Doing so it could
learn
>>>>> automatically how many concurrent requests 1 action can handle. It
>>> might be
>>>>> easier to solve this problem efficiently, instead of the other
problem
>>>>> which pushes the entire system to its limits when a couple of actions

>>> get a
>>>>> lot of traffic.
>>>>>
>>>>>
>>>>>
>>>>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com>
>>> wrote:
>>>>>>
>>>>>> +1 on Markus' points about "crash safety" and "scaling". I can
>>> understand
>>>>>> the reasons behind exploring this change but from a developer
>>> experience
>>>>>> point of view this adds introduces a large amount of complexity to
the
>>>>>> programming model.
>>>>>>
>>>>>> If I have a concurrent container serving 100 requests and one of the
>>>>>> requests triggers a fatal error how does that affect the other
>>> requests?
>>>>>> Tearing down the entire runtime environment will destroy all those
>>>>>> requests.
>>>>>>
>>>>>> How could a developer understand how many requests per container to
>>> set
>>>>>> without a manual trial and error process? It also means you have to
>>> start
>>>>>> considering things like race conditions or other challenges of
>>> concurrent
>>>>>> code execution. This makes debugging and monitoring also more
>>> challenging.
>>>>>>
>>>>>> Looking at the other serverless providers, I've not seen this
featured
>>>>>> requested before. Developers generally ask AWS to raise the
concurrent
>>>>>> invocations limit for their application. This keeps the platform
doing
>>> the
>>>>>> hard task of managing resources and being efficient and allows them
to
>>> use
>>>>>> the same programming model.
>>>>>>
>>>>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com>
>>> wrote:
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>
>>>>>>>
>>>>>> To Rodric's points I think there are two topics to speak about and
>>> discuss:
>>>>>>>
>>>>>>> 1. The programming model: The current model encourages users to
break
>>>>>>> their actions apart in "functions" that take payload and return
>>> payload.
>>>>>>> Having a deployment model outlined could as noted encourage users
to
>>> use
>>>>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver

>>> based
>>>>>>> applications. The current model is nice in that it solves a lot of
>>>>>> problems
>>>>>>> for the customer in terms of scalability and "crash safeness".
>>>>>>>
>>>>>>> 2. Raw throughput of our deployment model: Setting the concerns
aside
>>> I
>>>>>>> think it is valid to explore concurrent invocations of actions on
the
>>>>>> same
>>>>>>> container. This does not necessarily mean that users start to
deploy
>>>>>>> monolithic apps as noted above, but it certainly could. Keeping our
>>>>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>>>>> continue
>>>>>>> to think in functions. Having a toggle per action which is disabled

>>> by
>>>>>>> default might be a good way to start here, since many users might
>>> need to
>>>>>>> change action code to support that notion and for some applications

>>> it
>>>>>>> might not be valid at all. I think it was also already noted, that
>>> this
>>>>>>> imposes some of the "old-fashioned" problems on the user, like: How

>>> many
>>>>>>> concurrent requests will my action be able to handle? That kinda
>>> defeats
>>>>>>> the seemless-scalability point of serverless.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Markus
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> James Thomas
>>>>>>
>>>>
>>>
>>>
>>>
>>>

Re: Improving support for UI driven use cases

Posted by Michael Marth <mm...@adobe.com.INVALID>.
Michael B,
Re your question: exactly what Rodric said :)



On 05/07/17 12:32, "Rodric Rabbah" <ro...@gmail.com> wrote:

>The issue at hand is precisely because there isn't any autoscaling of capacity (N invokers provide M containers per invoker). Once all those slots are consumed any new requests are queued - as previously discussed. 
>
>Adding more density per vm is one way of providing additional capacity over finite resources. This is the essence of the initial proposal.
>
>As noted in previous discussions on this topic, this should be viewed as managing a different resource pool (and not the same pool of containers as ephemeral actions). Once you buy into that, generalization to other resource pools becomes natural.
>
>Going further, serverless becomes the new PaaS. 
>
>-r
>
>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt <Mi...@de.ibm.com> wrote:
>> 
>> Hi Michael,
>> 
>> thanks for the feedback -- glad you like my stmt re value prop :-)
>> 
>> I might not yet have fully gotten my head around Steve's proposal -- what 
>> are your thoughts on how this would help avoiding the reimplementation of 
>> an autoscaling / feedback loop mechanism, as we know it from more 
>> traditional runtime platforms?
>> 
>> 
>> Thanks & best regards
>> Michael
>> 
>> 
>> 
>> From:   Michael Marth <mm...@adobe.com.INVALID>
>> To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
>> Date:   07/05/2017 11:25 AM
>> Subject:        Re: Improving support for UI driven use cases
>> 
>> 
>> 
>> Hi Michael,
>> 
>> Totally agree with your statement
>> "value prop of serverless is that folks don't have to care about that"
>> 
>> Again, the proposal at hand does not intend to change that at all. On the 
>> contrary - in our mind it's a requirement that the developer experience should 
>> not change and that internals of the execution engines do not get exposed.
>> 
>> I find Stephen's comment about generalising the runtime behaviour very 
>> exciting. It could open the door to very different types of workloads 
>> (like training Tensorflow or running Spark jobs), but with the same value 
>> prop: users do not have to care about the managing resources/servers. And 
>> for providers of OW systems all the OW goodies would still apply (e.g. 
>> running untrusted code). Moreover, if we split the Invoker into different 
>> specialised Invokers then those different specialised workloads could live 
>> independently from each other (in terms of code as well as resource 
>> allocation in deployments).
>> You can probably tell I am really excited about Stephen's idea :) I think 
>> it would be a great step forward in increasing the use cases for OW.
>> 
>> Cheers
>> Michael
>> 
>> 
>> 
>> 
>> 
>> On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com> 
>> wrote:
>> 
>>> Hi Dragos,
>>> 
>>>> What stops
>>>> Openwhisk to be smart in observing the response times, CPU consumption,
>>>> memory consumption of the running containers ? 
>>> 
>>> What are your thoughts on how this approach would be different from the 
>> many IaaS- and PaaS-centric autoscaling solutions that have been built 
>> over the last years? All of them require relatively complex policies (eg 
>> scale based on cpu or mem utilization, end-user response time, etc.? What 
>> are the thresholds for when to add/remove capacity?), and a value prop of 
>> serverless is that folks don't have to care about that.
>>> 
>>> we should discuss more during the call, but wanted to get this out as 
>> food for thought.
>>> 
>>> Sent from my iPhone
>>> 
>>> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:
>>> 
>>>>> How could a developer understand how many requests per container to 
>> set
>>>> 
>>>> James, this is a good point, along with the other points in your email.
>>>> 
>>>> I think the developer doesn't need to know this info actually. What 
>> stops
>>>> Openwhisk to be smart in observing the response times, CPU consumption,
>>>> memory consumption of the running containers ? Doing so it could learn
>>>> automatically how many concurrent requests 1 action can handle. It 
>> might be
>>>> easier to solve this problem efficiently, instead of the other problem
>>>> which pushes the entire system to its limits when a couple of actions 
>> get a
>>>> lot of traffic.
>>>> 
>>>> 
>>>> 
>>>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com> 
>> wrote:
>>>>> 
>>>>> +1 on Markus' points about "crash safety" and "scaling". I can 
>> understand
>>>>> the reasons behind exploring this change but from a developer 
>> experience
>>>>> point of view this adds introduces a large amount of complexity to the
>>>>> programming model.
>>>>> 
>>>>> If I have a concurrent container serving 100 requests and one of the
>>>>> requests triggers a fatal error how does that affect the other 
>> requests?
>>>>> Tearing down the entire runtime environment will destroy all those
>>>>> requests.
>>>>> 
>>>>> How could a developer understand how many requests per container to 
>> set
>>>>> without a manual trial and error process? It also means you have to 
>> start
>>>>> considering things like race conditions or other challenges of 
>> concurrent
>>>>> code execution. This makes debugging and monitoring also more 
>> challenging.
>>>>> 
>>>>> Looking at the other serverless providers, I've not seen this featured
>>>>> requested before. Developers generally ask AWS to raise the concurrent
>>>>> invocations limit for their application. This keeps the platform doing 
>> the
>>>>> hard task of managing resources and being efficient and allows them to 
>> use
>>>>> the same programming model.
>>>>> 
>>>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com> 
>> wrote:
>>>>>> 
>>>>>> ...
>>>>>> 
>>>>> 
>>>>>> 
>>>>> To Rodric's points I think there are two topics to speak about and 
>> discuss:
>>>>>> 
>>>>>> 1. The programming model: The current model encourages users to break
>>>>>> their actions apart in "functions" that take payload and return 
>> payload.
>>>>>> Having a deployment model outlined could as noted encourage users to 
>> use
>>>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver 
>> based
>>>>>> applications. The current model is nice in that it solves a lot of
>>>>> problems
>>>>>> for the customer in terms of scalability and "crash safeness".
>>>>>> 
>>>>>> 2. Raw throughput of our deployment model: Setting the concerns aside 
>> I
>>>>>> think it is valid to explore concurrent invocations of actions on the
>>>>> same
>>>>>> container. This does not necessarily mean that users start to deploy
>>>>>> monolithic apps as noted above, but it certainly could. Keeping our
>>>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>>>> continue
>>>>>> to think in functions. Having a toggle per action which is disabled 
>> by
>>>>>> default might be a good way to start here, since many users might 
>> need to
>>>>>> change action code to support that notion and for some applications 
>> it
>>>>>> might not be valid at all. I think it was also already noted, that 
>> this
>>>>>> imposes some of the "old-fashioned" problems on the user, like: How 
>> many
>>>>>> concurrent requests will my action be able to handle? That kinda 
>> defeats
>>>>>> the seemless-scalability point of serverless.
>>>>>> 
>>>>>> Cheers,
>>>>>> Markus
>>>>>> 
>>>>>> 
>>>>> --
>>>>> Regards,
>>>>> James Thomas
>>>>> 
>>> 
>> 
>> 
>> 
>> 

Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
The issue at hand is precisely because there isn't any autoscaling of capacity (N invokers provide M containers per invoker). Once all those slots are consumed any new requests are queued - as previously discussed. 

Adding more density per VM is one way of providing additional capacity over finite resources. This is the essence of the initial proposal.
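
To make that math concrete, a back-of-the-envelope sketch follows; the cluster sizes and the concurrency factor below are illustrative assumptions, not OpenWhisk defaults.

    // Illustrative numbers only, not OpenWhisk configuration values.
    val invokers = 10                        // N invokers in the cluster
    val containersPerInvoker = 16            // M containers per invoker
    val slotsToday = invokers * containersPerInvoker              // 160 activation slots
    val perContainerConcurrency = 100        // proposed per-container concurrency
    val slotsWithProposal = slotsToday * perContainerConcurrency  // 16,000 activation slots

Once those slots are consumed, additional activations queue behind them, which is the latency problem discussed earlier in the thread.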

As noted in previous discussions on this topic, this should be viewed as managing a different resource pool (and not the same pool of containers as ephemeral actions). Once you buy into that, generalization to other resource pools becomes natural.

Going further, serverless becomes the new PaaS. 

-r

> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt <Mi...@de.ibm.com> wrote:
> 
> Hi Michael,
> 
> thanks for the feedback -- glad you like my stmt re value prop :-)
> 
> I might not yet have fully gotten my head around Steve's proposal -- what 
> are your thoughts on how this would help avoiding the reimplementation of 
> an autoscaling / feedback loop mechanism, as we know it from more 
> traditional runtime platforms?
> 
> 
> Thanks & best regards
> Michael
> 
> 
> 
> From:   Michael Marth <mm...@adobe.com.INVALID>
> To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
> Date:   07/05/2017 11:25 AM
> Subject:        Re: Improving support for UI driven use cases
> 
> 
> 
> Hi Michael,
> 
> Totally agree with your statement
> "value prop of serverless is that folks don't have to care about that"
> 
> Again, the proposal at hand does not intend to change that at all. On the 
> contrary - in our mind it's a requirement that the developer experience should 
> not change and that internals of the execution engines do not get exposed.
> 
> I find Stephen's comment about generalising the runtime behaviour very 
> exciting. It could open the door to very different types of workloads 
> (like training Tensorflow or running Spark jobs), but with the same value 
> prop: users do not have to care about the managing resources/servers. And 
> for providers of OW systems all the OW goodies would still apply (e.g. 
> running untrusted code). Moreover, if we split the Invoker into different 
> specialised Invokers then those different specialised workloads could live 
> independently from each other (in terms of code as well as resource 
> allocation in deployments).
> You can probably tell I am really excited about Stephen's idea :) I think 
> it would be a great step forward in increasing the use cases for OW.
> 
> Cheers
> Michael
> 
> 
> 
> 
> 
> On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com> 
> wrote:
> 
>> Hi Dragos,
>> 
>>> What stops
>>> Openwhisk to be smart in observing the response times, CPU consumption,
>>> memory consumption of the running containers ? 
>> 
>> What are your thoughts on how this approach would be different from the 
> many IaaS- and PaaS-centric autoscaling solutions that have been built 
> over the last years? All of them require relatively complex policies (eg 
> scale based on cpu or mem utilization, end-user response time, etc.? What 
> are the thresholds for when to add/remove capacity?), and a value prop of 
> serverless is that folks don't have to care about that.
>> 
>> we should discuss more during the call, but wanted to get this out as 
> food for thought.
>> 
>> Sent from my iPhone
>> 
>> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:
>> 
>>>> How could a developer understand how many requests per container to 
> set
>>> 
>>> James, this is a good point, along with the other points in your email.
>>> 
>>> I think the developer doesn't need to know this info actually. What 
> stops
>>> Openwhisk to be smart in observing the response times, CPU consumption,
>>> memory consumption of the running containers ? Doing so it could learn
>>> automatically how many concurrent requests 1 action can handle. It 
> might be
>>> easier to solve this problem efficiently, instead of the other problem
>>> which pushes the entire system to its limits when a couple of actions 
> get a
>>> lot of traffic.
>>> 
>>> 
>>> 
>>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com> 
> wrote:
>>>> 
>>>> +1 on Markus' points about "crash safety" and "scaling". I can 
> understand
>>>> the reasons behind exploring this change but from a developer 
> experience
>>>> point of view this adds introduces a large amount of complexity to the
>>>> programming model.
>>>> 
>>>> If I have a concurrent container serving 100 requests and one of the
>>>> requests triggers a fatal error how does that affect the other 
> requests?
>>>> Tearing down the entire runtime environment will destroy all those
>>>> requests.
>>>> 
>>>> How could a developer understand how many requests per container to 
> set
>>>> without a manual trial and error process? It also means you have to 
> start
>>>> considering things like race conditions or other challenges of 
> concurrent
>>>> code execution. This makes debugging and monitoring also more 
> challenging.
>>>> 
>>>> Looking at the other serverless providers, I've not seen this featured
>>>> requested before. Developers generally ask AWS to raise the concurrent
>>>> invocations limit for their application. This keeps the platform doing 
> the
>>>> hard task of managing resources and being efficient and allows them to 
> use
>>>> the same programming model.
>>>> 
>>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com> 
> wrote:
>>>>> 
>>>>> ...
>>>>> 
>>>> 
>>>>> 
>>>> To Rodric's points I think there are two topics to speak about and 
> discuss:
>>>>> 
>>>>> 1. The programming model: The current model encourages users to break
>>>>> their actions apart in "functions" that take payload and return 
> payload.
>>>>> Having a deployment model outlined could as noted encourage users to 
> use
>>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver 
> based
>>>>> applications. The current model is nice in that it solves a lot of
>>>> problems
>>>>> for the customer in terms of scalability and "crash safeness".
>>>>> 
>>>>> 2. Raw throughput of our deployment model: Setting the concerns aside 
> I
>>>>> think it is valid to explore concurrent invocations of actions on the
>>>> same
>>>>> container. This does not necessarily mean that users start to deploy
>>>>> monolithic apps as noted above, but it certainly could. Keeping our
>>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>>> continue
>>>>> to think in functions. Having a toggle per action which is disabled 
> by
>>>>> default might be a good way to start here, since many users might 
> need to
>>>>> change action code to support that notion and for some applications 
> it
>>>>> might not be valid at all. I think it was also already noted, that 
> this
>>>>> imposes some of the "old-fashioned" problems on the user, like: How 
> many
>>>>> concurrent requests will my action be able to handle? That kinda 
> defeats
>>>>> the seemless-scalability point of serverless.
>>>>> 
>>>>> Cheers,
>>>>> Markus
>>>>> 
>>>>> 
>>>> --
>>>> Regards,
>>>> James Thomas
>>>> 
>> 
> 
> 
> 
> 

Re: Improving support for UI driven use cases

Posted by Michael M Behrendt <Mi...@de.ibm.com>.
Hi Michael,

thanks for the feedback -- glad you like my stmt re value prop :-)

I might not yet have fully gotten my head around Steve's proposal -- what 
are your thoughts on how this would help avoiding the reimplementation of 
an autoscaling / feedback loop mechanism, as we know it from more 
traditional runtime platforms?


Thanks & best regards
Michael



From:   Michael Marth <mm...@adobe.com.INVALID>
To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
Date:   07/05/2017 11:25 AM
Subject:        Re: Improving support for UI driven use cases



Hi Michael,

Totally agree with your statement
"value prop of serverless is that folks don't have to care about that"

Again, the proposal at hand does not intend to change that at all. On the 
contrary - in our mind it's a requirement that the developer experience should 
not change and that internals of the execution engines do not get exposed.

I find Stephen's comment about generalising the runtime behaviour very 
exciting. It could open the door to very different types of workloads 
(like training Tensorflow or running Spark jobs), but with the same value 
prop: users do not have to care about the managing resources/servers. And 
for providers of OW systems all the OW goodies would still apply (e.g. 
running untrusted code). Moreover, if we split the Invoker into different 
specialised Invokers then those different specialised workloads could live 
independently from each other (in terms of code as well as resource 
allocation in deployments).
You can probably tell I am really excited about Stephen's idea :) I think 
it would be a great step forward in increasing the use cases for OW.

Cheers
Michael





On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com> 
wrote:

>Hi Dragos,
>
>> What stops
>> Openwhisk to be smart in observing the response times, CPU consumption,
>> memory consumption of the running containers ? 
>
>What are your thoughts on how this approach would be different from the 
many IaaS- and PaaS-centric autoscaling solutions that have been built 
over the last years? All of them require relatively complex policies (eg 
scale based on cpu or mem utilization, end-user response time, etc.? What 
are the thresholds for when to add/remove capacity?), and a value prop of 
serverless is that folks don't have to care about that.
>
>we should discuss more during the call, but wanted to get this out as 
food for thought.
>
>Sent from my iPhone
>
>On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:
>
>>> How could a developer understand how many requests per container to 
set
>> 
>> James, this is a good point, along with the other points in your email.
>> 
>> I think the developer doesn't need to know this info actually. What 
stops
>> Openwhisk to be smart in observing the response times, CPU consumption,
>> memory consumption of the running containers ? Doing so it could learn
>> automatically how many concurrent requests 1 action can handle. It 
might be
>> easier to solve this problem efficiently, instead of the other problem
>> which pushes the entire system to its limits when a couple of actions 
get a
>> lot of traffic.
>> 
>> 
>> 
>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com> 
wrote:
>>> 
>>> +1 on Markus' points about "crash safety" and "scaling". I can 
understand
>>> the reasons behind exploring this change but from a developer 
experience
>>> point of view this adds introduces a large amount of complexity to the
>>> programming model.
>>> 
>>> If I have a concurrent container serving 100 requests and one of the
>>> requests triggers a fatal error how does that affect the other 
requests?
>>> Tearing down the entire runtime environment will destroy all those
>>> requests.
>>> 
>>> How could a developer understand how many requests per container to 
set
>>> without a manual trial and error process? It also means you have to 
start
>>> considering things like race conditions or other challenges of 
concurrent
>>> code execution. This makes debugging and monitoring also more 
challenging.
>>> 
>>> Looking at the other serverless providers, I've not seen this featured
>>> requested before. Developers generally ask AWS to raise the concurrent
>>> invocations limit for their application. This keeps the platform doing 
the
>>> hard task of managing resources and being efficient and allows them to 
use
>>> the same programming model.
>>> 
>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com> 
wrote:
>>>> 
>>>> ...
>>>> 
>>> 
>>>> 
>>> To Rodric's points I think there are two topics to speak about and 
discuss:
>>>> 
>>>> 1. The programming model: The current model encourages users to break
>>>> their actions apart in "functions" that take payload and return 
payload.
>>>> Having a deployment model outlined could as noted encourage users to 
use
>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver 
based
>>>> applications. The current model is nice in that it solves a lot of
>>> problems
>>>> for the customer in terms of scalability and "crash safeness".
>>>> 
>>>> 2. Raw throughput of our deployment model: Setting the concerns aside 
I
>>>> think it is valid to explore concurrent invocations of actions on the
>>> same
>>>> container. This does not necessarily mean that users start to deploy
>>>> monolithic apps as noted above, but it certainly could. Keeping our
>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>> continue
>>>> to think in functions. Having a toggle per action which is disabled 
by
>>>> default might be a good way to start here, since many users might 
need to
>>>> change action code to support that notion and for some applications 
it
>>>> might not be valid at all. I think it was also already noted, that 
this
>>>> imposes some of the "old-fashioned" problems on the user, like: How 
many
>>>> concurrent requests will my action be able to handle? That kinda 
defeats
>>>> the seemless-scalability point of serverless.
>>>> 
>>>> Cheers,
>>>> Markus
>>>> 
>>>> 
>>> --
>>> Regards,
>>> James Thomas
>>> 
>





Re: Improving support for UI driven use cases

Posted by Michael Marth <mm...@adobe.com.INVALID>.
Hi Michael,

Totally agree with your statement
“value prop of serverless is that folks don't have to care about that"

Again, the proposal at hand does not intend to change that at all. On the contrary - in our mind it’s a requirement that the developer experience should not change and that internals of the execution engines do not get exposed.

I find Stephen’s comment about generalising the runtime behaviour very exciting. It could open the door to very different types of workloads (like training Tensorflow or running Spark jobs), but with the same value prop: users do not have to care about managing resources/servers. And for providers of OW systems all the OW goodies would still apply (e.g. running untrusted code). Moreover, if we split the Invoker into different specialised Invokers then those different specialised workloads could live independently from each other (in terms of code as well as resource allocation in deployments).
You can probably tell I am really excited about Stephen's idea :) I think it would be a great step forward in increasing the use cases for OW.

Cheers
Michael





On 04/07/17 20:15, "Michael M Behrendt" <Mi...@de.ibm.com> wrote:

>Hi Dragos,
>
>> What stops
>> Openwhisk to be smart in observing the response times, CPU consumption,
>> memory consumption of the running containers ? 
>
>What are your thoughts on how this approach would be different from the many IaaS- and PaaS-centric autoscaling solutions that have been built over the last years? All of them require relatively complex policies (eg scale based on cpu or mem utilization, end-user response time, etc.? What are the thresholds for when to add/remove capacity?), and a value prop of serverless is that folks don't have to care about that.
>
>we should discuss more during the call, but wanted to get this out as food for thought.
>
>Sent from my iPhone
>
>On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:
>
>>> How could a developer understand how many requests per container to set
>> 
>> James, this is a good point, along with the other points in your email.
>> 
>> I think the developer doesn't need to know this info actually. What stops
>> Openwhisk to be smart in observing the response times, CPU consumption,
>> memory consumption of the running containers ? Doing so it could learn
>> automatically how many concurrent requests 1 action can handle. It might be
>> easier to solve this problem efficiently, instead of the other problem
>> which pushes the entire system to its limits when a couple of actions get a
>> lot of traffic.
>> 
>> 
>> 
>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com> wrote:
>>> 
>>> +1 on Markus' points about "crash safety" and "scaling". I can understand
>>> the reasons behind exploring this change but from a developer experience
>>> point of view this adds introduces a large amount of complexity to the
>>> programming model.
>>> 
>>> If I have a concurrent container serving 100 requests and one of the
>>> requests triggers a fatal error how does that affect the other requests?
>>> Tearing down the entire runtime environment will destroy all those
>>> requests.
>>> 
>>> How could a developer understand how many requests per container to set
>>> without a manual trial and error process? It also means you have to start
>>> considering things like race conditions or other challenges of concurrent
>>> code execution. This makes debugging and monitoring also more challenging.
>>> 
>>> Looking at the other serverless providers, I've not seen this featured
>>> requested before. Developers generally ask AWS to raise the concurrent
>>> invocations limit for their application. This keeps the platform doing the
>>> hard task of managing resources and being efficient and allows them to use
>>> the same programming model.
>>> 
>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com> wrote:
>>>> 
>>>> ...
>>>> 
>>> 
>>>> 
>>> To Rodric's points I think there are two topics to speak about and discuss:
>>>> 
>>>> 1. The programming model: The current model encourages users to break
>>>> their actions apart in "functions" that take payload and return payload.
>>>> Having a deployment model outlined could as noted encourage users to use
>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based
>>>> applications. The current model is nice in that it solves a lot of
>>> problems
>>>> for the customer in terms of scalability and "crash safeness".
>>>> 
>>>> 2. Raw throughput of our deployment model: Setting the concerns aside I
>>>> think it is valid to explore concurrent invocations of actions on the
>>> same
>>>> container. This does not necessarily mean that users start to deploy
>>>> monolithic apps as noted above, but it certainly could. Keeping our
>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>> continue
>>>> to think in functions. Having a toggle per action which is disabled by
>>>> default might be a good way to start here, since many users might need to
>>>> change action code to support that notion and for some applications it
>>>> might not be valid at all. I think it was also already noted, that this
>>>> imposes some of the "old-fashioned" problems on the user, like: How many
>>>> concurrent requests will my action be able to handle? That kinda defeats
>>>> the seemless-scalability point of serverless.
>>>> 
>>>> Cheers,
>>>> Markus
>>>> 
>>>> 
>>> --
>>> Regards,
>>> James Thomas
>>> 
>

Re: Improving support for UI driven use cases

Posted by Michael M Behrendt <Mi...@de.ibm.com>.

To Adrian's definition (which I also like) -- if the instance only lives
half a second, how does that fit with the autoscaling behavior you outlined
below, which I _think_ relies on multi-threaded long-running processes?

Sent from my iPhone

On 4. Jul 2017, at 23:18, Dascalita Dragos <dd...@gmail.com> wrote:

>> how this approach would be different from the many IaaS- and
PaaS-centric
>
> I like Adrian Cockcroft's response (
> https://twitter.com/intent/like?tweet_id=736553530689998848 ) to this:
> *"...If your PaaS can efficiently start instances in 20ms that run for
half
> a second, then call it serverless..."*
>
> I think none of us here imagines that we're building a PaaS experience
for
> developers, nor does the current proposal intends to suggest we should. I
> also assume that none of us imagines to run in production a scalable
system
> with millions of concurrent users with the current setup that demands an
> incredible amount of resources.
>
> Quoting Michael M, which said it nicely, the intent is to make  "*the
> current problem ... not be a problem in reality anymore (and simply
remain
> as a theoretical problem)*".
>
> I think we have a way to be pragmatic about the current limitations, make
> it so that developers don't suffer b/c of this, and buy us enough time to
> implement the better model that should be used for serverless, where
> monitoring, "crash safety", "scaling" , and all of the concerns listed
> previously in this thread are addressed better, but at the same time, the
> performance doesn't have to suffer so much. This is the intent of this
> proposal.
>
>
>
> On Tue, Jul 4, 2017 at 11:15 AM Michael M Behrendt <
> Michaelbehrendt@de.ibm.com> wrote:
>
>> Hi Dragos,
>>
>>> What stops
>>> Openwhisk to be smart in observing the response times, CPU consumption,
>>> memory consumption of the running containers ?
>>
>> What are your thoughts on how this approach would be different from the
>> many IaaS- and PaaS-centric autoscaling solutions that have been built
over
>> the last years? All of them require relatively complex policies (eg
scale
>> based on cpu or mem utilization, end-user response time, etc.? What are
the
>> thresholds for when to add/remove capacity?), and a value prop of
>> serverless is that folks don't have to care about that.
>>
>> we should discuss more during the call, but wanted to get this out as
food
>> for thought.
>>
>> Sent from my iPhone
>>
>> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:
>>
>>>> How could a developer understand how many requests per container to
set
>>>
>>> James, this is a good point, along with the other points in your email.
>>>
>>> I think the developer doesn't need to know this info actually. What
stops
>>> Openwhisk to be smart in observing the response times, CPU consumption,
>>> memory consumption of the running containers ? Doing so it could learn
>>> automatically how many concurrent requests 1 action can handle. It
might
>> be
>>> easier to solve this problem efficiently, instead of the other problem
>>> which pushes the entire system to its limits when a couple of actions
>> get a
>>> lot of traffic.
>>>
>>>
>>>
>>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com>
>> wrote:
>>>>
>>>> +1 on Markus' points about "crash safety" and "scaling". I can
>> understand
>>>> the reasons behind exploring this change but from a developer
experience
>>>> point of view this adds introduces a large amount of complexity to the
>>>> programming model.
>>>>
>>>> If I have a concurrent container serving 100 requests and one of the
>>>> requests triggers a fatal error how does that affect the other
requests?
>>>> Tearing down the entire runtime environment will destroy all those
>>>> requests.
>>>>
>>>> How could a developer understand how many requests per container to
set
>>>> without a manual trial and error process? It also means you have to
>> start
>>>> considering things like race conditions or other challenges of
>> concurrent
>>>> code execution. This makes debugging and monitoring also more
>> challenging.
>>>>
>>>> Looking at the other serverless providers, I've not seen this featured
>>>> requested before. Developers generally ask AWS to raise the concurrent
>>>> invocations limit for their application. This keeps the platform doing
>> the
>>>> hard task of managing resources and being efficient and allows them to
>> use
>>>> the same programming model.
>>>>
>>>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com>
wrote:
>>>>>
>>>>> ...
>>>>>
>>>>
>>>>>
>>>> To Rodric's points I think there are two topics to speak about and
>> discuss:
>>>>>
>>>>> 1. The programming model: The current model encourages users to break
>>>>> their actions apart in "functions" that take payload and return
>> payload.
>>>>> Having a deployment model outlined could as noted encourage users to
>> use
>>>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver
>> based
>>>>> applications. The current model is nice in that it solves a lot of
>>>> problems
>>>>> for the customer in terms of scalability and "crash safeness".
>>>>>
>>>>> 2. Raw throughput of our deployment model: Setting the concerns aside
I
>>>>> think it is valid to explore concurrent invocations of actions on the
>>>> same
>>>>> container. This does not necessarily mean that users start to deploy
>>>>> monolithic apps as noted above, but it certainly could. Keeping our
>>>>> JSON-in/JSON-out at least for now though, could encourage users to
>>>> continue
>>>>> to think in functions. Having a toggle per action which is disabled
by
>>>>> default might be a good way to start here, since many users might
need
>> to
>>>>> change action code to support that notion and for some applications
it
>>>>> might not be valid at all. I think it was also already noted, that
this
>>>>> imposes some of the "old-fashioned" problems on the user, like: How
>> many
>>>>> concurrent requests will my action be able to handle? That kinda
>> defeats
>>>>> the seemless-scalability point of serverless.
>>>>>
>>>>> Cheers,
>>>>> Markus
>>>>>
>>>>>
>>>> --
>>>> Regards,
>>>> James Thomas
>>>>
>>
>>

Re: Improving support for UI driven use cases

Posted by Dascalita Dragos <dd...@gmail.com>.
>  how this approach would be different from the many IaaS- and PaaS-centric

I like Adrian Cockcroft's response (
https://twitter.com/intent/like?tweet_id=736553530689998848 ) to this:
*"...If your PaaS can efficiently start instances in 20ms that run for half
a second, then call it serverless..."*

I think none of us here imagines that we're building a PaaS experience for
developers, nor does the current proposal intend to suggest we should. I
also assume that none of us imagines running in production a scalable system
with millions of concurrent users with the current setup that demands an
incredible amount of resources.

Quoting Michael M, who said it nicely, the intent is to make "*the
current problem ... not be a problem in reality anymore (and simply remain
as a theoretical problem)*".

I think we have a way to be pragmatic about the current limitations, make
it so that developers don't suffer b/c of this, and buy us enough time to
implement the better model that should be used for serverless, where
monitoring, "crash safety", "scaling" , and all of the concerns listed
previously in this thread are addressed better, but at the same time, the
performance doesn't have to suffer so much. This is the intent of this
proposal.



On Tue, Jul 4, 2017 at 11:15 AM Michael M Behrendt <
Michaelbehrendt@de.ibm.com> wrote:

> Hi Dragos,
>
> > What stops
> > Openwhisk to be smart in observing the response times, CPU consumption,
> > memory consumption of the running containers ?
>
> What are your thoughts on how this approach would be different from the
> many IaaS- and PaaS-centric autoscaling solutions that have been built over
> the last years? All of them require relatively complex policies (eg scale
> based on cpu or mem utilization, end-user response time, etc.? What are the
> thresholds for when to add/remove capacity?), and a value prop of
> serverless is that folks don't have to care about that.
>
> we should discuss more during the call, but wanted to get this out as food
> for thought.
>
> Sent from my iPhone
>
> On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:
>
> >> How could a developer understand how many requests per container to set
> >
> > James, this is a good point, along with the other points in your email.
> >
> > I think the developer doesn't need to know this info actually. What stops
> > Openwhisk to be smart in observing the response times, CPU consumption,
> > memory consumption of the running containers ? Doing so it could learn
> > automatically how many concurrent requests 1 action can handle. It might
> be
> > easier to solve this problem efficiently, instead of the other problem
> > which pushes the entire system to its limits when a couple of actions
> get a
> > lot of traffic.
> >
> >
> >
> >> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com>
> wrote:
> >>
> >> +1 on Markus' points about "crash safety" and "scaling". I can
> understand
> >> the reasons behind exploring this change but from a developer experience
> >> point of view this adds introduces a large amount of complexity to the
> >> programming model.
> >>
> >> If I have a concurrent container serving 100 requests and one of the
> >> requests triggers a fatal error how does that affect the other requests?
> >> Tearing down the entire runtime environment will destroy all those
> >> requests.
> >>
> >> How could a developer understand how many requests per container to set
> >> without a manual trial and error process? It also means you have to
> start
> >> considering things like race conditions or other challenges of
> concurrent
> >> code execution. This makes debugging and monitoring also more
> challenging.
> >>
> >> Looking at the other serverless providers, I've not seen this featured
> >> requested before. Developers generally ask AWS to raise the concurrent
> >> invocations limit for their application. This keeps the platform doing
> the
> >> hard task of managing resources and being efficient and allows them to
> use
> >> the same programming model.
> >>
> >>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com> wrote:
> >>>
> >>> ...
> >>>
> >>
> >>>
> >> To Rodric's points I think there are two topics to speak about and
> discuss:
> >>>
> >>> 1. The programming model: The current model encourages users to break
> >>> their actions apart in "functions" that take payload and return
> payload.
> >>> Having a deployment model outlined could as noted encourage users to
> use
> >>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver
> based
> >>> applications. The current model is nice in that it solves a lot of
> >> problems
> >>> for the customer in terms of scalability and "crash safeness".
> >>>
> >>> 2. Raw throughput of our deployment model: Setting the concerns aside I
> >>> think it is valid to explore concurrent invocations of actions on the
> >> same
> >>> container. This does not necessarily mean that users start to deploy
> >>> monolithic apps as noted above, but it certainly could. Keeping our
> >>> JSON-in/JSON-out at least for now though, could encourage users to
> >> continue
> >>> to think in functions. Having a toggle per action which is disabled by
> >>> default might be a good way to start here, since many users might need
> to
> >>> change action code to support that notion and for some applications it
> >>> might not be valid at all. I think it was also already noted, that this
> >>> imposes some of the "old-fashioned" problems on the user, like: How
> many
> >>> concurrent requests will my action be able to handle? That kinda
> defeats
> >>> the seemless-scalability point of serverless.
> >>>
> >>> Cheers,
> >>> Markus
> >>>
> >>>
> >> --
> >> Regards,
> >> James Thomas
> >>
>
>

Re: Improving support for UI driven use cases

Posted by Michael M Behrendt <Mi...@de.ibm.com>.
Hi Dragos,

> What stops
> Openwhisk to be smart in observing the response times, CPU consumption,
> memory consumption of the running containers ? 

What are your thoughts on how this approach would be different from the many IaaS- and PaaS-centric autoscaling solutions that have been built over the last years? All of them require relatively complex policies (eg scale based on cpu or mem utilization, end-user response time, etc.? What are the thresholds for when to add/remove capacity?), and a value prop of serverless is that folks don't have to care about that.
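
For readers following along, a minimal sketch of the kind of threshold-based policy being referred to is below; every name and threshold is an illustrative assumption rather than anything that exists in OpenWhisk, and the point is precisely that these are the knobs serverless users should not have to tune.

    // Hypothetical threshold-based autoscaling policy (illustrative only).
    final case class Metrics(cpuUtil: Double, memUtil: Double, p99LatencyMs: Double)

    def desiredCapacity(current: Int, m: Metrics): Int =
      if (m.cpuUtil > 0.75 || m.memUtil > 0.80 || m.p99LatencyMs > 500)
        current + 1                                  // scale out
      else if (m.cpuUtil < 0.25 && m.p99LatencyMs < 100 && current > 1)
        current - 1                                  // scale in
      else
        current                                      // hold steady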

we should discuss more during the call, but wanted to get this out as food for thought.

Sent from my iPhone

On 4. Jul 2017, at 18:50, Dascalita Dragos <dd...@gmail.com> wrote:

>> How could a developer understand how many requests per container to set
> 
> James, this is a good point, along with the other points in your email.
> 
> I think the developer doesn't need to know this info actually. What stops
> Openwhisk to be smart in observing the response times, CPU consumption,
> memory consumption of the running containers ? Doing so it could learn
> automatically how many concurrent requests 1 action can handle. It might be
> easier to solve this problem efficiently, instead of the other problem
> which pushes the entire system to its limits when a couple of actions get a
> lot of traffic.
> 
> 
> 
>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com> wrote:
>> 
>> +1 on Markus' points about "crash safety" and "scaling". I can understand
>> the reasons behind exploring this change but from a developer experience
>> point of view this adds introduces a large amount of complexity to the
>> programming model.
>> 
>> If I have a concurrent container serving 100 requests and one of the
>> requests triggers a fatal error how does that affect the other requests?
>> Tearing down the entire runtime environment will destroy all those
>> requests.
>> 
>> How could a developer understand how many requests per container to set
>> without a manual trial and error process? It also means you have to start
>> considering things like race conditions or other challenges of concurrent
>> code execution. This makes debugging and monitoring also more challenging.
>> 
>> Looking at the other serverless providers, I've not seen this featured
>> requested before. Developers generally ask AWS to raise the concurrent
>> invocations limit for their application. This keeps the platform doing the
>> hard task of managing resources and being efficient and allows them to use
>> the same programming model.
>> 
>>> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com> wrote:
>>> 
>>> ...
>>> 
>> 
>>> 
>> To Rodric's points I think there are two topics to speak about and discuss:
>>> 
>>> 1. The programming model: The current model encourages users to break
>>> their actions apart in "functions" that take payload and return payload.
>>> Having a deployment model outlined could as noted encourage users to use
>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based
>>> applications. The current model is nice in that it solves a lot of
>> problems
>>> for the customer in terms of scalability and "crash safeness".
>>> 
>>> 2. Raw throughput of our deployment model: Setting the concerns aside I
>>> think it is valid to explore concurrent invocations of actions on the
>> same
>>> container. This does not necessarily mean that users start to deploy
>>> monolithic apps as noted above, but it certainly could. Keeping our
>>> JSON-in/JSON-out at least for now though, could encourage users to
>> continue
>>> to think in functions. Having a toggle per action which is disabled by
>>> default might be a good way to start here, since many users might need to
>>> change action code to support that notion and for some applications it
>>> might not be valid at all. I think it was also already noted, that this
>>> imposes some of the "old-fashioned" problems on the user, like: How many
>>> concurrent requests will my action be able to handle? That kinda defeats
>>> the seemless-scalability point of serverless.
>>> 
>>> Cheers,
>>> Markus
>>> 
>>> 
>> --
>> Regards,
>> James Thomas
>> 


Re: Improving support for UI driven use cases

Posted by Dascalita Dragos <dd...@gmail.com>.
> How could a developer understand how many requests per container to set

James, this is a good point, along with the other points in your email.

I think the developer doesn't need to know this info actually. What stops
OpenWhisk from being smart in observing the response times, CPU consumption,
memory consumption of the running containers? Doing so it could learn
automatically how many concurrent requests 1 action can handle. It might be
easier to solve this problem efficiently, instead of the other problem
which pushes the entire system to its limits when a couple of actions get a
lot of traffic.
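
As a rough illustration of what such self-observation could look like (the names, thresholds, and the additive-increase/multiplicative-decrease strategy below are my assumptions, not part of the proposal):

    // Sketch: learn a per-action concurrency limit from observed behaviour.
    final case class Observed(latencyMs: Double, errors: Int)

    def adjustConcurrencyLimit(limit: Int, o: Observed, baselineLatencyMs: Double): Int =
      if (o.errors > 0 || o.latencyMs > 2 * baselineLatencyMs)
        math.max(1, limit / 2)   // container shows distress: back off quickly
      else
        limit + 1                // otherwise probe cautiously for more concurrency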



On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jt...@gmail.com> wrote:

> +1 on Markus' points about "crash safety" and "scaling". I can understand
> the reasons behind exploring this change but from a developer experience
> point of view this adds introduces a large amount of complexity to the
> programming model.
>
> If I have a concurrent container serving 100 requests and one of the
> requests triggers a fatal error how does that affect the other requests?
> Tearing down the entire runtime environment will destroy all those
> requests.
>
> How could a developer understand how many requests per container to set
> without a manual trial and error process? It also means you have to start
> considering things like race conditions or other challenges of concurrent
> code execution. This makes debugging and monitoring also more challenging.
>
> Looking at the other serverless providers, I've not seen this featured
> requested before. Developers generally ask AWS to raise the concurrent
> invocations limit for their application. This keeps the platform doing the
> hard task of managing resources and being efficient and allows them to use
> the same programming model.
>
> On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com> wrote:
>
> > ...
> >
>
> >
> To Rodric's points I think there are two topics to speak about and discuss:
> >
> > 1. The programming model: The current model encourages users to break
> > their actions apart in "functions" that take payload and return payload.
> > Having a deployment model outlined could as noted encourage users to use
> > OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based
> > applications. The current model is nice in that it solves a lot of
> problems
> > for the customer in terms of scalability and "crash safeness".
> >
> > 2. Raw throughput of our deployment model: Setting the concerns aside I
> > think it is valid to explore concurrent invocations of actions on the
> same
> > container. This does not necessarily mean that users start to deploy
> > monolithic apps as noted above, but it certainly could. Keeping our
> > JSON-in/JSON-out at least for now though, could encourage users to
> continue
> > to think in functions. Having a toggle per action which is disabled by
> > default might be a good way to start here, since many users might need to
> > change action code to support that notion and for some applications it
> > might not be valid at all. I think it was also already noted, that this
> > imposes some of the "old-fashioned" problems on the user, like: How many
> > concurrent requests will my action be able to handle? That kinda defeats
> > the seemless-scalability point of serverless.
> >
> > Cheers,
> > Markus
> >
> >
> --
> Regards,
> James Thomas
>

Re: Improving support for UI driven use cases

Posted by James Thomas <jt...@gmail.com>.
+1 on Markus' points about "crash safety" and "scaling". I can understand
the reasons behind exploring this change but from a developer experience
point of view this introduces a large amount of complexity to the
programming model.

If I have a concurrent container serving 100 requests and one of the
requests triggers a fatal error how does that affect the other requests?
Tearing down the entire runtime environment will destroy all those
requests.

How could a developer understand how many requests per container to set
without a manual trial and error process? It also means you have to start
considering things like race conditions or other challenges of concurrent
code execution. This makes debugging and monitoring also more challenging.

Looking at the other serverless providers, I've not seen this feature
requested before. Developers generally ask AWS to raise the concurrent
invocations limit for their application. This keeps the platform doing the
hard task of managing resources and being efficient and allows them to use
the same programming model.
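
As a hypothetical illustration of the race-condition point (this code is not taken from the proposal), consider an action that keeps per-container state:

    // With one activation per container the counter is effectively private;
    // with concurrent activations the unsynchronized update becomes a data race.
    object CounterAction {
      private var requestsSeen = 0
      def main(args: Map[String, Any]): Map[String, Any] = {
        requestsSeen += 1
        Map("requestsSeen" -> requestsSeen)
      }
    }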

On 2 July 2017 at 11:05, Markus Thömmes <ma...@me.com> wrote:

> ...
>

>
To Rodric's points I think there are two topics to speak about and discuss:
>
> 1. The programming model: The current model encourages users to break
> their actions apart in "functions" that take payload and return payload.
> Having a deployment model outlined could as noted encourage users to use
> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based
> applications. The current model is nice in that it solves a lot of problems
> for the customer in terms of scalability and "crash safeness".
>
> 2. Raw throughput of our deployment model: Setting the concerns aside I
> think it is valid to explore concurrent invocations of actions on the same
> container. This does not necessarily mean that users start to deploy
> monolithic apps as noted above, but it certainly could. Keeping our
> JSON-in/JSON-out at least for now though, could encourage users to continue
> to think in functions. Having a toggle per action which is disabled by
> default might be a good way to start here, since many users might need to
> change action code to support that notion and for some applications it
> might not be valid at all. I think it was also already noted, that this
> imposes some of the "old-fashioned" problems on the user, like: How many
> concurrent requests will my action be able to handle? That kinda defeats
> the seemless-scalability point of serverless.
>
> Cheers,
> Markus
>
>
-- 
Regards,
James Thomas

Re: Improving support for UI driven use cases

Posted by Dascalita Dragos <dd...@gmail.com>.
Michael, +1 to how you summarized the problem.

> I’d suggest that the first step is to support “multiple heterogeneous
resource pools”

I'd like to reinforce Stephen's idea on "multiple resource pools". We've
already been using this idea successfully in production systems in other
setups with Mesos, isolating Spark workloads that require state from
other stateless workloads or from GPU workloads. This idea would be a
perfect fit for OpenWhisk. It can also be extended beyond the Invokers, to
other cluster managers like Mesos, and Kube.
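
A minimal sketch of what such pluggable pools could look like in the load balancer follows; all type, trait, and annotation names are made up for illustration, assuming Stephen's definition of a pool as a set of invokers behind a load balancer.

    // Placeholder types for what the controller already knows.
    final case class ActionMetadata(name: String, annotations: Map[String, String])
    final case class InvokerInstance(id: Int)

    trait ResourcePool {
      def name: String
      def accepts(action: ActionMetadata): Boolean
      def schedule(action: ActionMetadata): InvokerInstance
    }

    // Example: a pool of invokers that allow concurrent activations per container.
    final class ConcurrentPool(invokers: Vector[InvokerInstance]) extends ResourcePool {
      val name = "concurrent"
      def accepts(a: ActionMetadata): Boolean =
        a.annotations.get("max-concurrency").exists(_.toInt > 1)
      def schedule(a: ActionMetadata): InvokerInstance =
        invokers(math.abs(a.name.hashCode) % invokers.size)
    }

    // The load balancer routes to the first pool that claims the action.
    def route(pools: Seq[ResourcePool], default: ResourcePool, a: ActionMetadata): InvokerInstance =
      pools.find(_.accepts(a)).getOrElse(default).schedule(a)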


On Tue, Jul 4, 2017 at 7:05 AM Stephen Fink <fi...@gmail.com> wrote:

> Hi all,
>
> I’ve been lurking a bit on this thread, but haven’t had time to fully
> digest all the issues.
>
> I’d suggest that the first step is to support “multiple heterogeneous
> resource pools”, where a resource pool is a set of invokers managed by a
> load balancer.  There are lots of reasons we may want to support invokers
> with different flavors:  long-running actions, invokers in a VPN, invokers
> with GPUs,  invokers with big memory, invokers which support concurrent
> execution, etc…  .      If we had a general way to plug in a new resource
> pool, folks could feel free to experiment with any new flavors they like
> without having to debate the implications on other flavors.
>
> I tend to doubt that there is a “one size fits all” solution here, so I’d
> suggest we bite the bullet and engineer for heterogeneity.
>
> SJF
>
>
> > On Jul 4, 2017, at 9:55 AM, Michael Marth <mm...@adobe.com.INVALID>
> wrote:
> >
> > Hi Jeremias, all,
> >
> > Tyson and Dragos are travelling this week, so that I don’t know by when
> they get to respond. I have worked with them on this topic, so let me jump
> in and comment until they are able to reply.
> >
> > From my POV having a call like you suggest is a really good idea. Let’s
> wait for Tyson & Dragos to chime in to find a date.
> >
> > As you mention the discussion so far was jumping across different
> topics, especially the use case, the problem to be solved and the proposed
> solution. In preparation of the call I think we can clarify use case and
> problem on the list. Here’s my view:
> >
> > Use Case
> >
> > For us the use case can be summarised with “dynamic, high performance
> websites/mobile apps”. This implies:
> > 1 High concurrency, i.e. Many requests coming in at the same time
> > 2 The code to be executed is the same code across these different
> requests (as opposed to a long tail distribution of many different actions
> being executed concurrently). In our case “many” would mean “hundreds” or a
> few thousand.
> > 3 The latency (time to start execution) matters, because human users are
> waiting for the response. Ideally, in these order of magnitudes of
> concurrent requests the latency should not change much.
> >
> > All 3 requirements need to be satisfied for this use case.
> > In the discussion so far it was mentioned that there are other use cases
> which might have similar requirements. That’s great and I do not want to
> rule them out, obviously. The above is just to make it clear from where we
> are coming from.
> >
> > At this point I would like to mention that it is my understanding that
> this use case is within OpenWhisk’s strike zone, i.e. Something that we all
> think is reasonable to support. Please speak up if you disagree.
> >
> > The Problem
> >
> > One can look at the problem in two ways:
> > Either you keep the resources of the OW system constant (i.e. No
> scaling). In that case latency increases very quickly as demonstrated by
> Tyson’s tests.
> > Or you increase the system’s capacity. In that case the amount of
> machines to satisfy this use case quickly becomes prohibitively expensive
> to run for the OW operator – where expensive is defined as “compared to
> traditional web servers” (in our case a standard Node.js server). Meaning,
> you need 100-1000 concurrent action containers to serve what can be served
> by 1 or 2 Node.js containers.
> >
> > Of course, the proposed solution is not a fundamental “fix” for the
> above. It would only move the needle ~2 orders of magnitude – so that the
> current problem would not be a problem in reality anymore (and simply
> remain as a theoretical problem). For me that would be good enough.
> >
> > The solution approach
> >
> > Would not like to comment on the proposed solution’s details (and leave
> that to Dragos and Tyson). However, it was mentioned that the approach
> would change the programming model for users:
> > Our mindset and approach was that we explicitly do not want  to change
> how OpenWhisk exposes itself to users. Meaning, users should still be able
> to use NPMs, etc  - i.e. This would be an internal implementation detail
> that is not visible for users. (we can make things more explicit to users
> and e.g. have them request a special concurrent runtime if we wish to do
> so – so far we tried to make it transparent to users, though).
> >
> > Many thanks
> > Michael
> >
> >
> >
> > On 03/07/17 14:48, "Jeremias Werner" <jeremias.werner@gmail.com> wrote:
> >
> > Hi
> >
> > Thanks for the write-up and the proposal. I think this is a nice idea and
> > sounds like a nice way of increasing throughput. Reading through the
> thread
> > it feels like there are different topics/problems mixed-up and the
> > discussion is becoming very complex already.
> >
> > Therefore I would like to suggest that we streamline the discussion a
> bit,
> > maybe in a zoom.us session where we first give Tyson and Dragos the
> chance
> > to walk through the proposal and clarify questions of the audience. Once
> we
> > are all on the same page we could think of a discussion about the
> benefits
> > (improved throughput, latency) vs. challenges (resource sharing, crash
> > model, container lifetime, programming model) on the core of the
> proposal:
> > running multiple activations in a single user container. Once we have a
> > common understanding on that part we could step-up in the architecture
> and
> > discuss what's needed on higher components like invoker/load-balancer to
> > get this integrated.
> >
> > (I said zoom.us session since I liked the one we had a few weeks ago. It
> > was efficient and interactive. If you like I could volunteer to setup the
> > session and/or writing the script/summary)
> >
> > what do you think?
> >
> > Many thanks in advance!
> >
> > Jeremias
> >
> >
> > On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <rodric@gmail.com<mailto:
> rodric@gmail.com>> wrote:
> >
> > You're discounting with event driven all use cases that are still latency
> > sensitive because they complete a response by call back or actuation at
> > completion. IoT, chatbots, notifications, all examples in addition to ui
> > which are latency sensitive and having uniform expectations on queuing
> time
> > is of value.
> >
> > -r
> >
>
>

Re: Improving support for UI driven use cases

Posted by Michael Marth <mm...@adobe.com.INVALID>.
I like that approach a lot!




On 04/07/17 16:05, "Stephen Fink" <fi...@gmail.com> wrote:

>Hi all,
>
>I’ve been lurking a bit on this thread, but haven’t had time to fully digest all the issues.
>
>I’d suggest that the first step is to support “multiple heterogeneous resource pools”, where a resource pool is a set of invokers managed by a load balancer.  There are lots of reasons we may want to support invokers with different flavors:  long-running actions, invokers in a VPN, invokers with GPUs,  invokers with big memory, invokers which support concurrent execution, etc…  .      If we had a general way to plug in a new resource pool, folks could feel free to experiment with any new flavors they like without having to debate the implications on other flavors.
>
>I tend to doubt that there is a “one size fits all” solution here, so I’d suggest we bite the bullet and engineer for heterogeneity.
>
>SJF
>
>
>> On Jul 4, 2017, at 9:55 AM, Michael Marth <mm...@adobe.com.INVALID> wrote:
>> 
>> Hi Jeremias, all,
>> 
>> Tyson and Dragos are travelling this week, so that I don’t know by when they get to respond. I have worked with them on this topic, so let me jump in and comment until they are able to reply.
>> 
>> From my POV having a call like you suggest is a really good idea. Let’s wait for Tyson & Dragos to chime in to find a date.
>> 
>> As you mention the discussion so far was jumping across different topics, especially the use case, the problem to be solved and the proposed solution. In preparation of the call I think we can clarify use case and problem on the list. Here’s my view:
>> 
>> Use Case
>> 
>> For us the use case can be summarised with “dynamic, high performance websites/mobile apps”. This implies:
>> 1 High concurrency, i.e. Many requests coming in at the same time
>> 2 The code to be executed is the same code across these different requests (as opposed to a long tail distribution of many different actions being executed concurrently). In our case “many” would mean “hundreds” or a few thousand.
>> 3 The latency (time to start execution) matters, because human users are waiting for the response. Ideally, at these orders of magnitude of concurrent requests the latency should not change much.
>> 
>> All 3 requirements need to be satisfied for this use case.
>> In the discussion so far it was mentioned that there are other use cases which might have similar requirements. That’s great and I do not want to rule them out, obviously. The above is just to make it clear from where we are coming from.
>> 
>> At this point I would like to mention that it is my understanding that this use case is within OpenWhisk’s strike zone, i.e. Something that we all think is reasonable to support. Please speak up if you disagree.
>> 
>> The Problem
>> 
>> One can look at the problem in two ways:
>> Either you keep the resources of the OW system constant (i.e. No scaling). In that case latency increases very quickly as demonstrated by Tyson’s tests.
>> Or you increase the system’s capacity. In that case the amount of machines to satisfy this use case quickly becomes prohibitively expensive to run for the OW operator – where expensive is defined as “compared to traditional web servers” (in our case a standard Node.js server). Meaning, you need 100-1000 concurrent action containers to serve what can be served by 1 or 2 Node.js containers.
>> 
>> Of course, the proposed solution is not a fundamental “fix” for the above. It would only move the needle ~2 orders of magnitude – so that the current problem would not be a problem in reality anymore (and simply remain as a theoretical problem). For me that would be good enough.
>> 
>> The solution approach
>> 
>> Would not like to comment on the proposed solution’s details (and leave that to Dragos and Tyson). However, it was mentioned that the approach would change the programming model for users:
>> Our mindset and approach was that we explicitly do not want to change how OpenWhisk exposes itself to users. Meaning, users should still be able to use NPMs, etc. - i.e. this would be an internal implementation detail that is not visible to users. (We can make things more explicit to users and e.g. have them request a special concurrent runtime if we wish to do so – so far we tried to make it transparent to users, though.)
>> 
>> Many thanks
>> Michael
>> 
>> 
>> 
>> On 03/07/17 14:48, "Jeremias Werner" <je...@gmail.com>> wrote:
>> 
>> Hi
>> 
>> Thanks for the write-up and the proposal. I think this is a nice idea and
>> sounds like a nice way of increasing throughput. Reading through the thread
>> it feels like there are different topics/problems mixed-up and the
>> discussion is becoming very complex already.
>> 
>> Therefore I would like to suggest that we streamline the discussion a bit,
>> maybe in a zoom.us session where we first give Tyson and Dragos the chance
>> to walk through the proposal and clarify questions of the audience. Once we
>> are all on the same page we could think of a discussion about the benefits
>> (improved throughput, latency) vs. challenges (resource sharing, crash
>> model, container lifetime, programming model) on the core of the proposal:
>> running multiple activations in a single user container. Once we have a
>> common understanding on that part we could step-up in the architecture and
>> discuss what's needed on higher components like invoker/load-balancer to
>> get this integrated.
>> 
>> (I said zoom.us session since I liked the one we had a few weeks ago. It
>> was efficient and interactive. If you like I could volunteer to setup the
>> session and/or writing the script/summary)
>> 
>> what do you think?
>> 
>> Many thanks in advance!
>> 
>> Jeremias
>> 
>> 
>> On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <ro...@gmail.com>> wrote:
>> 
>> You're discounting with event driven all use cases that are still latency
>> sensitive because they complete a response by call back or actuation at
>> completion. IoT, chatbots, notifications, all examples in addition to ui
>> which are latency sensitive and having uniform expectations on queuing time
>> is of value.
>> 
>> -r
>> 
>

Re: Improving support for UI driven use cases

Posted by Stephen Fink <fi...@gmail.com>.
Hi all,

I’ve been lurking a bit on this thread, but haven’t had time to fully digest all the issues.

I’d suggest that the first step is to support “multiple heterogeneous resource pools”, where a resource pool is a set of invokers managed by a load balancer.  There are lots of reasons we may want to support invokers with different flavors:  long-running actions, invokers in a VPN, invokers with GPUs,  invokers with big memory, invokers which support concurrent execution, etc…  .      If we had a general way to plug in a new resource pool, folks could feel free to experiment with any new flavors they like without having to debate the implications on other flavors.

I tend to doubt that there is a “one size fits all” solution here, so I’d suggest we bite the bullet and engineer for heterogeneity.
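
To illustrate (not to prescribe an implementation), a rough Scala sketch of such a pluggable pool abstraction might look like the following; every name here is hypothetical and not an existing OpenWhisk API. The idea is only that the controller matches an action's requirements against the capabilities each pool declares, and each pool manages its own set of invokers behind its own load balancer:

    import scala.concurrent.Future

    // Hypothetical summary of what an action needs from the platform.
    case class ResourceRequirements(
      memoryMB: Int,
      maxDurationSeconds: Int,
      needsGpu: Boolean = false,
      supportsConcurrency: Boolean = false)

    // Hypothetical SPI, one implementation per invoker "flavor":
    // long-running, VPN, GPU, big memory, concurrent execution, ...
    trait ResourcePool {
      def name: String
      // Can this pool run an action with these requirements?
      def accepts(req: ResourceRequirements): Boolean
      // Schedule the activation on one of this pool's invokers and
      // return the activation id.
      def schedule(actionName: String, payload: String): Future[String]
    }

    // Controller-side registry: route to the first pool whose declared
    // capabilities match; the ordering expresses operator preference.
    class PoolRegistry(pools: Seq[ResourcePool]) {
      def poolFor(req: ResourceRequirements): Option[ResourcePool] =
        pools.find(_.accepts(req))
    }

A concurrent-execution pool would then just be one more ResourcePool implementation, which could be experimented with in isolation from the other flavors.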

SJF


> On Jul 4, 2017, at 9:55 AM, Michael Marth <mm...@adobe.com.INVALID> wrote:
> 
> Hi Jeremias, all,
> 
> Tyson and Dragos are travelling this week, so that I don’t know by when they get to respond. I have worked with them on this topic, so let me jump in and comment until they are able to reply.
> 
> From my POV having a call like you suggest is a really good idea. Let’s wait for Tyson & Dragos to chime in to find a date.
> 
> As you mention the discussion so far was jumping across different topics, especially the use case, the problem to be solved and the proposed solution. In preparation of the call I think we can clarify use case and problem on the list. Here’s my view:
> 
> Use Case
> 
> For us the use case can be summarised with “dynamic, high performance websites/mobile apps”. This implies:
> 1 High concurrency, i.e. Many requests coming in at the same time
> 2 The code to be executed is the same code across these different requests (as opposed to a long tail distribution of many different actions being executed concurrently). In our case “many” would mean “hundreds” or a few thousand.
> 3 The latency (time to start execution) matters, because human users are waiting for the response. Ideally, at these orders of magnitude of concurrent requests the latency should not change much.
> 
> All 3 requirements need to be satisfied for this use case.
> In the discussion so far it was mentioned that there are other use cases which might have similar requirements. That’s great and I do not want to rule them out, obviously. The above is just to make it clear from where we are coming from.
> 
> At this point I would like to mention that it is my understanding that this use case is within OpenWhisk’s strike zone, i.e. Something that we all think is reasonable to support. Please speak up if you disagree.
> 
> The Problem
> 
> One can look at the problem in two ways:
> Either you keep the resources of the OW system constant (i.e. No scaling). In that case latency increases very quickly as demonstrated by Tyson’s tests.
> Or you increase the system’s capacity. In that case the amount of machines to satisfy this use case quickly becomes prohibitively expensive to run for the OW operator – where expensive is defined as “compared to traditional web servers” (in our case a standard Node.js server). Meaning, you need 100-1000 concurrent action containers to serve what can be served by 1 or 2 Node.js containers.
> 
> Of course, the proposed solution is not a fundamental “fix” for the above. It would only move the needle ~2 orders of magnitude – so that the current problem would not be a problem in reality anymore (and simply remain as a theoretical problem). For me that would be good enough.
> 
> The solution approach
> 
> Would not like to comment on the proposed solution’s details (and leave that to Dragos and Tyson). However, it was mentioned that the approach would change the programming model for users:
> Our mindset and approach was that we explicitly do not want to change how OpenWhisk exposes itself to users. Meaning, users should still be able to use NPMs, etc. - i.e. this would be an internal implementation detail that is not visible to users. (We can make things more explicit to users and e.g. have them request a special concurrent runtime if we wish to do so – so far we tried to make it transparent to users, though.)
> 
> Many thanks
> Michael
> 
> 
> 
> On 03/07/17 14:48, "Jeremias Werner" <je...@gmail.com>> wrote:
> 
> Hi
> 
> Thanks for the write-up and the proposal. I think this is a nice idea and
> sounds like a nice way of increasing throughput. Reading through the thread
> it feels like there are different topics/problems mixed-up and the
> discussion is becoming very complex already.
> 
> Therefore I would like to suggest that we streamline the discussion a bit,
> maybe in a zoom.us session where we first give Tyson and Dragos the chance
> to walk through the proposal and clarify questions of the audience. Once we
> are all on the same page we could think of a discussion about the benefits
> (improved throughput, latency) vs. challenges (resource sharing, crash
> model, container lifetime, programming model) on the core of the proposal:
> running multiple activations in a single user container. Once we have a
> common understanding on that part we could step-up in the architecture and
> discuss what's needed on higher components like invoker/load-balancer to
> get this integrated.
> 
> (I said zoom.us session since I liked the one we had a few weeks ago. It
> was efficient and interactive. If you like I could volunteer to setup the
> session and/or writing the script/summary)
> 
> what do you think?
> 
> Many thanks in advance!
> 
> Jeremias
> 
> 
> On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <ro...@gmail.com>> wrote:
> 
> You're discounting with event driven all use cases that are still latency
> sensitive because they complete a response by call back or actuation at
> completion. IoT, chatbots, notifications, all examples in addition to ui
> which are latency sensitive and having uniform expectations on queuing time
> is of value.
> 
> -r
> 


Re: Scheduling for follow up to improving support for UI driven use cases

Posted by Matt Rutkowski <mr...@us.ibm.com>.
Hi Jeremias,

I have scheduled a call on my Zoom (paid) for our use tomorrow if you like 
at the 8am PDT time suggested...  here is the info:

---

Topic: Apache OpenWhisk Zoom Meeting
Time: Jul 13, 2017 10:00 AM Central Time (US and Canada)

Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/5043933185

Or iPhone one-tap (US Toll):  +16465588656,,5043933185# or 
+14086380968,,5043933185#

Or Telephone:
    Dial: +1 646 558 8656 (US Toll) or +1 408 638 0968 (US Toll)
    Meeting ID: 504 393 3185
    International numbers available: 
https://zoom.us/zoomconference?m=T4Ycc4hVdhvPkjU0ZjPjwtzOUOtbAH5W
---

Feel free to copy it / forward it to Slack or wherever needed.

Kind regards,
Matt 



From:   Jeremias Werner <je...@gmail.com>
To:     dev@openwhisk.apache.org
Date:   07/12/2017 11:07 AM
Subject:        Re: Scheduling for follow up to improving support for UI 
driven use cases



Hi Tyson,

I could set up the meeting. Unfortunately I only have a free account,
which is limited to 40-minute meetings; I have requested a pro license,
which would allow longer meetings. If no one else is able to open the
call with a pro license, I will share mine and set up the call right
before the meeting.

thanks

Jeremias

On Wed, Jul 12, 2017 at 7:44 AM, Tyson Norris <tn...@adobe.com.invalid>
wrote:

> Hi Jeremias -
> Thanks - if you're able to attend and setup the zoom.us details I would
> appreciate the help. If not, let me know and I will get it setup.
> I'm traveling tomorrow, so may be out of touch, but will plan to join 
the
> meeting Thursday.
> Hopefully other interested folks can join as well, and regardless we can
> startup a new email thread on the topic when I'm back full time on 
Monday
> (with any notes from the meeting).
>
> Thanks
> Tyson
>
> > On Jul 10, 2017, at 4:04 AM, Jeremias Werner 
<je...@gmail.com>
> wrote:
> >
> > Hi Tyson,
> >
> > having the call at 8am PDT on July 13th sounds good. Further, I like 
to
> > focus discussion on the technical items you listed. looking forward.
> >
> > thanks!
> >
> > Jeremias
> >
> > On Sat, Jul 8, 2017 at 4:43 PM, Tyson Norris 
<tn...@adobe.com.invalid>
> > wrote:
> >
> >> I haven't heard anything - any takers for 8am  PDT July 13th call?
> >>
> >> If not, do people want a different day, or just don't want to join 
the
> >> call?
> >>
> >> We would discuss specifically:
> >> - concurrent activation processing
> >> - implementing support at the load balancer layer
> >> - need the spi support PR to move forward
> >>
> >> Thanks
> >> Tyson
> >>
> >>
> >>
> >>> On Jul 6, 2017, at 5:34 PM, Tyson Norris <ty...@gmail.com>
> wrote:
> >>>
> >>> Sure
> >>> Dragos and I are only available at 8am PDT on July 13th, if people 
are
> >> not available then, let me know and we can schedule for July 17th or
> later
> >> via doodle.
> >>>
> >>>> On Jul 6, 2017, at 10:24 AM, Rodric Rabbah <ro...@gmail.com> 
wrote:
> >>>>
> >>>> May I suggest creating a new thread to coordinate times to avoid
> >> polluting the discussion? Perhaps a doodle poll is in order.
> >>>>
> >>>> I likely won't be able to participate and look forward to minutes 
from
> >> the call and a replay.
> >>>>
> >>>> -r
> >>>>
> >>>>> On Jul 6, 2017, at 1:04 PM, Tyson Norris 
<tn...@adobe.com.INVALID>
> >> wrote:
> >>>>>
> >>>>> For a call, can people make July 13 8am PDT work? I think July 17 
is
> >> the next available time slot for us.
> >>>>>
> >>>>> Thanks
> >>>>> Tyson
> >>
>





Re: Scheduling for follow up to improving support for UI driven use cases

Posted by Jeremias Werner <je...@gmail.com>.
Hi Tyson,

I could set up the meeting. Unfortunately I only have a free account,
which is limited to 40-minute meetings; I have requested a pro license,
which would allow longer meetings. If no one else is able to open the
call with a pro license, I will share mine and set up the call right
before the meeting.

thanks

Jeremias

On Wed, Jul 12, 2017 at 7:44 AM, Tyson Norris <tn...@adobe.com.invalid>
wrote:

> Hi Jeremias -
> Thanks - if you're able to attend and setup the zoom.us details I would
> appreciate the help. If not, let me know and I will get it setup.
> I'm traveling tomorrow, so may be out of touch, but will plan to join the
> meeting Thursday.
> Hopefully other interested folks can join as well, and regardless we can
> startup a new email thread on the topic when I'm back full time on Monday
> (with any notes from the meeting).
>
> Thanks
> Tyson
>
> > On Jul 10, 2017, at 4:04 AM, Jeremias Werner <je...@gmail.com>
> wrote:
> >
> > Hi Tyson,
> >
> > having the call at 8am PDT on July 13th sounds good. Further, I like to
> > focus discussion on the technical items you listed. looking forward.
> >
> > thanks!
> >
> > Jeremias
> >
> > On Sat, Jul 8, 2017 at 4:43 PM, Tyson Norris <tn...@adobe.com.invalid>
> > wrote:
> >
> >> I haven't heard anything - any takers for 8am  PDT July 13th call?
> >>
> >> If not, do people want a different day, or just don't want to join the
> >> call?
> >>
> >> We would discuss specifically:
> >> - concurrent activation processing
> >> - implementing support at the load balancer layer
> >> - need the spi support PR to move forward
> >>
> >> Thanks
> >> Tyson
> >>
> >>
> >>
> >>> On Jul 6, 2017, at 5:34 PM, Tyson Norris <ty...@gmail.com>
> wrote:
> >>>
> >>> Sure
> >>> Dragos and I are only available at 8am PDT on July 13th, if people are
> >> not available then, let me know and we can schedule for July 17th or
> later
> >> via doodle.
> >>>
> >>>> On Jul 6, 2017, at 10:24 AM, Rodric Rabbah <ro...@gmail.com> wrote:
> >>>>
> >>>> May I suggest creating a new thread to coordinate times to avoid
> >> polluting the discussion? Perhaps a doodle poll is in order.
> >>>>
> >>>> I likely won't be able to participate and look forward to minutes from
> >> the call and a replay.
> >>>>
> >>>> -r
> >>>>
> >>>>> On Jul 6, 2017, at 1:04 PM, Tyson Norris <tn...@adobe.com.INVALID>
> >> wrote:
> >>>>>
> >>>>> For a call, can people make July 13 8am PDT work? I think July 17 is
> >> the next available time slot for us.
> >>>>>
> >>>>> Thanks
> >>>>> Tyson
> >>
>

Re: Scheduling for follow up to improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
Hi Jeremias -
Thanks - if you're able to attend and setup the zoom.us details I would appreciate the help. If not, let me know and I will get it setup.
I'm traveling tomorrow, so may be out of touch, but will plan to join the meeting Thursday.
Hopefully other interested folks can join as well, and regardless we can startup a new email thread on the topic when I'm back full time on Monday (with any notes from the meeting). 

Thanks
Tyson

> On Jul 10, 2017, at 4:04 AM, Jeremias Werner <je...@gmail.com> wrote:
> 
> Hi Tyson,
> 
> having the call at 8am PDT on July 13th sounds good. Further, I like to
> focus discussion on the technical items you listed. looking forward.
> 
> thanks!
> 
> Jeremias
> 
> On Sat, Jul 8, 2017 at 4:43 PM, Tyson Norris <tn...@adobe.com.invalid>
> wrote:
> 
>> I haven't heard anything - any takers for 8am  PDT July 13th call?
>> 
>> If not, do people want a different day, or just don't want to join the
>> call?
>> 
>> We would discuss specifically:
>> - concurrent activation processing
>> - implementing support at the load balancer layer
>> - need the spi support PR to move forward
>> 
>> Thanks
>> Tyson
>> 
>> 
>> 
>>> On Jul 6, 2017, at 5:34 PM, Tyson Norris <ty...@gmail.com> wrote:
>>> 
>>> Sure
>>> Dragos and I are only available at 8am PDT on July 13th, if people are
>> not available then, let me know and we can schedule for July 17th or later
>> via doodle.
>>> 
>>>> On Jul 6, 2017, at 10:24 AM, Rodric Rabbah <ro...@gmail.com> wrote:
>>>> 
>>>> May I suggest creating a new thread to coordinate times to avoid
>> polluting the discussion? Perhaps a doodle poll is in order.
>>>> 
>>>> I likely won't be able to participate and look forward to minutes from
>> the call and a replay.
>>>> 
>>>> -r
>>>> 
>>>>> On Jul 6, 2017, at 1:04 PM, Tyson Norris <tn...@adobe.com.INVALID>
>> wrote:
>>>>> 
>>>>> For a call, can people make July 13 8am PDT work? I think July 17 is
>> the next available time slot for us.
>>>>> 
>>>>> Thanks
>>>>> Tyson
>> 

Re: Scheduling for follow up to improving support for UI driven use cases

Posted by Jeremias Werner <je...@gmail.com>.
Hi Tyson,

having the call at 8am PDT on July 13th sounds good. Further, I like to
focus discussion on the technical items you listed. looking forward.

thanks!

Jeremias

On Sat, Jul 8, 2017 at 4:43 PM, Tyson Norris <tn...@adobe.com.invalid>
wrote:

> I haven't heard anything - any takers for 8am  PDT July 13th call?
>
> If not, do people want a different day, or just don't want to join the
> call?
>
> We would discuss specifically:
> - concurrent activation processing
> - implementing support at the load balancer layer
> - need the spi support PR to move forward
>
> Thanks
> Tyson
>
>
>
> > On Jul 6, 2017, at 5:34 PM, Tyson Norris <ty...@gmail.com> wrote:
> >
> > Sure
> > Dragos and I are only available at 8am PDT on July 13th, if people are
> not available then, let me know and we can schedule for July 17th or later
> via doodle.
> >
> >> On Jul 6, 2017, at 10:24 AM, Rodric Rabbah <ro...@gmail.com> wrote:
> >>
> >> May I suggest creating a new thread to coordinate times to avoid
> polluting the discussion? Perhaps a doodle poll is in order.
> >>
> >> I likely won't be able to participate and look forward to minutes from
> the call and a replay.
> >>
> >> -r
> >>
> >>> On Jul 6, 2017, at 1:04 PM, Tyson Norris <tn...@adobe.com.INVALID>
> wrote:
> >>>
> >>> For a call, can people make July 13 8am PDT work? I think July 17 is
> the next available time slot for us.
> >>>
> >>> Thanks
> >>> Tyson
>

Re: Scheduling for follow up to improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
I haven't heard anything - any takers for 8am  PDT July 13th call?

If not, do people want a different day, or just don't want to join the call?

We would discuss specifically:
- concurrent activation processing
- implementing support at the load balancer layer
- need the spi support PR to move forward 

Thanks
Tyson



> On Jul 6, 2017, at 5:34 PM, Tyson Norris <ty...@gmail.com> wrote:
> 
> Sure
> Dragos and I are only available at 8am PDT on July 13th, if people are not available then, let me know and we can schedule for July 17th or later via doodle.
> 
>> On Jul 6, 2017, at 10:24 AM, Rodric Rabbah <ro...@gmail.com> wrote:
>> 
>> May I suggest creating a new thread to coordinate times to avoid polluting the discussion? Perhaps a doodle poll is in order. 
>> 
>> I likely won't be able to participate and look forward to minutes from the call and a replay. 
>> 
>> -r
>> 
>>> On Jul 6, 2017, at 1:04 PM, Tyson Norris <tn...@adobe.com.INVALID> wrote:
>>> 
>>> For a call, can people make July 13 8am PDT work? I think July 17 is the next available time slot for us.
>>> 
>>> Thanks
>>> Tyson

Re: Scheduling for follow up to improving support for UI driven use cases

Posted by Tyson Norris <ty...@gmail.com>.
Sure
Dragos and I are only available at 8am PDT on July 13th, if people are not available then, let me know and we can schedule for July 17th or later via doodle.

> On Jul 6, 2017, at 10:24 AM, Rodric Rabbah <ro...@gmail.com> wrote:
> 
> May I suggest creating a new thread to coordinate times to avoid polluting the discussion? Perhaps a doodle poll is in order. 
> 
> I likely won't be able to participate and look forward to minutes from the call and a replay. 
> 
> -r
> 
>> On Jul 6, 2017, at 1:04 PM, Tyson Norris <tn...@adobe.com.INVALID> wrote:
>> 
>> For a call, can people make July 13 8am PDT work? I think July 17 is the next available time slot for us.
>> 
>> Thanks
>> Tyson

Scheduling for follow up to improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
May I suggest creating a new thread to coordinate times to avoid polluting the discussion? Perhaps a doodle poll is in order. 

I likely won't be able to participate and look forward to minutes from the call and a replay. 

-r

> On Jul 6, 2017, at 1:04 PM, Tyson Norris <tn...@adobe.com.INVALID> wrote:
> 
> For a call, can people make July 13 8am PDT work? I think July 17 is the next available time slot for us.
> 
> Thanks
> Tyson

Re: Improving support for UI driven use cases

Posted by Michael M Behrendt <Mi...@de.ibm.com>.
07/13 would work for me, though 9 am would be better.

Would that work for you?

Sent from my iPhone

> On 6. Jul 2017, at 19:05, Tyson Norris <tn...@adobe.com.INVALID> wrote:
>
>
>> On Jul 5, 2017, at 10:44 PM, Tyson Norris <tn...@adobe.com.INVALID>
wrote:
>>
>> I meant to add : I will work out with Dragos a time to propose asap, and
get back to the group so that we can negotiate a meeting time that will
work for everyone who wants to attend in realtime.
>>
>
> For a call, can people make July 13 8am PDT work? I think July 17 is the
next available time slot for us.
>
> Thanks
> Tyson

Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
> On Jul 5, 2017, at 10:44 PM, Tyson Norris <tn...@adobe.com.INVALID> wrote:
> 
> I meant to add : I will work out with Dragos a time to propose asap, and get back to the group so that we can negotiate a meeting time that will work for everyone who wants to attend in realtime.
> 

For a call, can people make July 13 8am PDT work? I think July 17 is the next available time slot for us.

Thanks
Tyson

Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
I meant to add : I will work out with Dragos a time to propose asap, and get back to the group so that we can negotiate a meeting time that will work for everyone who wants to attend in realtime.

Thanks
Tyson

On Jul 5, 2017, at 10:42 PM, Tyson Norris <tn...@adobe.com>> wrote:

Thanks everyone for the feedback.

I’d be happy to join a call -


A couple of details on the proposal that may or may not be clear:
- no changes to existing behavior without explicit adoption by the action developer or function client (e.g. developer would have to “allow” the function to receive concurrent activation)
- integrate this support at the load balancer level - instead of publishing to a Kafka topic for an invoker, publish to a container that was launched by an invoker. There is also no reason that multiple load balancers cannot be active, lending to “no changes to existing behavior”




On Jul 4, 2017, at 6:55 AM, Michael Marth <mm...@adobe.com.INVALID>> wrote:

Hi Jeremias, all,

Tyson and Dragos are travelling this week, so that I don’t know by when they get to respond. I have worked with them on this topic, so let me jump in and comment until they are able to reply.

From my POV having a call like you suggest is a really good idea. Let’s wait for Tyson & Dragos to chime in to find a date.

As you mention the discussion so far was jumping across different topics, especially the use case, the problem to be solved and the proposed solution. In preparation of the call I think we can clarify use case and problem on the list. Here’s my view:

Use Case

For us the use case can be summarised with “dynamic, high performance websites/mobile apps”. This implies:
1 High concurrency, i.e. Many requests coming in at the same time
2 The code to be executed is the same code across these different requests (as opposed to a long tail distribution of many different actions being executed concurrently). In our case “many” would mean “hundreds” or a few thousand.
3 The latency (time to start execution) matters, because human users are waiting for the response. Ideally, at these orders of magnitude of concurrent requests the latency should not change much.

All 3 requirements need to be satisfied for this use case.
In the discussion so far it was mentioned that there are other use cases which might have similar requirements. That’s great and I do not want to rule them out, obviously. The above is just to make it clear from where we are coming from.

At this point I would like to mention that it is my understanding that this use case is within OpenWhisk’s strike zone, i.e. Something that we all think is reasonable to support. Please speak up if you disagree.

The Problem

One can look at the problem in two ways:
Either you keep the resources of the OW system constant (i.e. No scaling). In that case latency increases very quickly as demonstrated by Tyson’s tests.
Or you increase the system’s capacity. In that case the amount of machines to satisfy this use case quickly becomes prohibitively expensive to run for the OW operator – where expensive is defined as “compared to traditional web servers” (in our case a standard Node.js server). Meaning, you need 100-1000 concurrent action containers to serve what can be served by 1 or 2 Node.js containers.

Of course, the proposed solution is not a fundamental “fix” for the above. It would only move the needle ~2 orders of magnitude – so that the current problem would not be a problem in reality anymore (and simply remain as a theoretical problem). For me that would be good enough.

The solution approach

Would not like to comment on the proposed solution’s details (and leave that to Dragos and Tyson). However, it was mentioned that the approach would change the programming model for users:
Our mindset and approach was that we explicitly do not want to change how OpenWhisk exposes itself to users. Meaning, users should still be able to use NPMs, etc. - i.e. this would be an internal implementation detail that is not visible to users. (We can make things more explicit to users and e.g. have them request a special concurrent runtime if we wish to do so – so far we tried to make it transparent to users, though.)

Many thanks
Michael



On 03/07/17 14:48, "Jeremias Werner" <je...@gmail.com>> wrote:

Hi

Thanks for the write-up and the proposal. I think this is a nice idea and
sounds like a nice way of increasing throughput. Reading through the thread
it feels like there are different topics/problems mixed-up and the
discussion is becoming very complex already.

Therefore I would like to suggest that we streamline the discussion a bit,
maybe in a zoom.us session where we first give Tyson and Dragos the chance
to walk through the proposal and clarify questions of the audience. Once we
are all on the same page we could think of a discussion about the benefits
(improved throughput, latency) vs. challenges (resource sharing, crash
model, container lifetime, programming model) on the core of the proposal:
running multiple activations in a single user container. Once we have a
common understanding on that part we could step-up in the architecture and
discuss what's needed on higher components like invoker/load-balancer to
get this integrated.

(I said zoom.us session since I liked the one we had a few weeks ago. It
was efficient and interactive. If you like I could volunteer to setup the
session and/or writing the script/summary)

what do you think?

Many thanks in advance!

Jeremias


On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <ro...@gmail.com>> wrote:

You're discounting with event driven all use cases that are still latency
sensitive because they complete a response by call back or actuation at
completion. IoT, chatbots, notifications, all examples in addition to ui
which are latency sensitive and having uniform expectations on queuing time
is of value.

-r



Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
Thanks everyone for the feedback.

I’d be happy to join a call -


A couple of details on the proposal that may or may not be clear:
- no changes to existing behavior without explicit adoption by the action developer or function client (e.g. developer would have to “allow” the function to receive concurrent activation)
- integrate this support at the load balancer level - instead of publishing to a Kafka topic for an invoker, publish directly to a container that was launched by an invoker. There is also no reason that multiple load balancers cannot be active, which also supports “no changes to existing behavior” (a rough sketch of this routing follows below)
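
To make the two points above concrete, here is a rough Scala sketch (all type and method names below are hypothetical, not existing OpenWhisk code) of how a load balancer could branch: actions that have not opted in keep the existing path of publishing to an invoker's Kafka topic, while opted-in actions are posted directly to a warm container that an invoker has already launched and registered:

    import scala.concurrent.Future

    // Hypothetical per-action metadata: concurrency is opt-in, so the
    // default path and existing behavior stay exactly as they are today.
    case class ActionMeta(name: String, allowConcurrent: Boolean)

    trait KafkaInvokerPath {
      // Existing path: publish the activation message to an invoker topic.
      def publishToInvokerTopic(action: ActionMeta, msg: String): Future[String]
    }

    trait DirectContainerPath {
      // Proposed path: send the activation straight to a container that an
      // invoker launched earlier and advertised back to the load balancer.
      def postToWarmContainer(action: ActionMeta, msg: String): Future[String]
    }

    class ConcurrencyAwareLoadBalancer(
        kafka: KafkaInvokerPath,
        direct: DirectContainerPath) {

      def route(action: ActionMeta, msg: String): Future[String] =
        if (action.allowConcurrent) direct.postToWarmContainer(action, msg)
        else kafka.publishToInvokerTopic(action, msg) // unchanged default
    }

Multiple such load balancers could be active side by side, which is what keeps the non-opted-in path untouched.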




On Jul 4, 2017, at 6:55 AM, Michael Marth <mm...@adobe.com.INVALID>> wrote:

Hi Jeremias, all,

Tyson and Dragos are travelling this week, so that I don’t know by when they get to respond. I have worked with them on this topic, so let me jump in and comment until they are able to reply.

From my POV having a call like you suggest is a really good idea. Let’s wait for Tyson & Dragos to chime in to find a date.

As you mention the discussion so far was jumping across different topics, especially the use case, the problem to be solved and the proposed solution. In preparation of the call I think we can clarify use case and problem on the list. Here’s my view:

Use Case

For us the use case can be summarised with “dynamic, high performance websites/mobile apps”. This implies:
1 High concurrency, i.e. Many requests coming in at the same time
2 The code to be executed is the same code across these different requests (as opposed to a long tail distribution of many different actions being executed concurrently). In our case “many” would mean “hundreds” or a few thousand.
3 The latency (time to start execution) matters, because human users are waiting for the response. Ideally, at these orders of magnitude of concurrent requests the latency should not change much.

All 3 requirements need to be satisfied for this use case.
In the discussion so far it was mentioned that there are other use cases which might have similar requirements. That’s great and I do not want to rule them out, obviously. The above is just to make it clear from where we are coming from.

At this point I would like to mention that it is my understanding that this use case is within OpenWhisk’s strike zone, i.e. Something that we all think is reasonable to support. Please speak up if you disagree.

The Problem

One can look at the problem in two ways:
Either you keep the resources of the OW system constant (i.e. No scaling). In that case latency increases very quickly as demonstrated by Tyson’s tests.
Or you increase the system’s capacity. In that case the amount of machines to satisfy this use case quickly becomes prohibitively expensive to run for the OW operator – where expensive is defined as “compared to traditional web servers” (in our case a standard Node.js server). Meaning, you need 100-1000 concurrent action containers to serve what can be served by 1 or 2 Node.js containers.

Of course, the proposed solution is not a fundamental “fix” for the above. It would only move the needle ~2 orders of magnitude – so that the current problem would not be a problem in reality anymore (and simply remain as a theoretical problem). For me that would be good enough.

The solution approach

Would not like to comment on the proposed solution’s details (and leave that to Dragos and Tyson). However, it was mentioned that the approach would change the programming model for users:
Our mindset and approach was that we explicitly do not want to change how OpenWhisk exposes itself to users. Meaning, users should still be able to use NPMs, etc. - i.e. this would be an internal implementation detail that is not visible to users. (We can make things more explicit to users and e.g. have them request a special concurrent runtime if we wish to do so – so far we tried to make it transparent to users, though.)

Many thanks
Michael



On 03/07/17 14:48, "Jeremias Werner" <je...@gmail.com>> wrote:

Hi

Thanks for the write-up and the proposal. I think this is a nice idea and
sounds like a nice way of increasing throughput. Reading through the thread
it feels like there are different topics/problems mixed-up and the
discussion is becoming very complex already.

Therefore I would like to suggest that we streamline the discussion a bit,
maybe in a zoom.us session where we first give Tyson and Dragos the chance
to walk through the proposal and clarify questions of the audience. Once we
are all on the same page we could think of a discussion about the benefits
(improved throughput, latency) vs. challenges (resource sharing, crash
model, container lifetime, programming model) on the core of the proposal:
running multiple activations in a single user container. Once we have a
common understanding on that part we could step-up in the architecture and
discuss what's needed on higher components like invoker/load-balancer to
get this integrated.

(I said zoom.us session since I liked the one we had a few weeks ago. It
was efficient and interactive. If you like I could volunteer to setup the
session and/or writing the script/summary)

what do you think?

Many thanks in advance!

Jeremias


On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <ro...@gmail.com>> wrote:

You're discounting with event driven all use cases that are still latency
sensitive because they complete a response by call back or actuation at
completion. IoT, chatbots, notifications, all examples in addition to ui
which are latency sensitive and having uniform expectations on queuing time
is of value.

-r


Re: Improving support for UI driven use cases

Posted by Michael Marth <mm...@adobe.com.INVALID>.
Hi Jeremias, all,

Tyson and Dragos are travelling this week, so that I don’t know by when they get to respond. I have worked with them on this topic, so let me jump in and comment until they are able to reply.

From my POV having a call like you suggest is a really good idea. Let’s wait for Tyson & Dragos to chime in to find a date.

As you mention the discussion so far was jumping across different topics, especially the use case, the problem to be solved and the proposed solution. In preparation of the call I think we can clarify use case and problem on the list. Here’s my view:

Use Case

For us the use case can be summarised with “dynamic, high performance websites/mobile apps”. This implies:
1 High concurrency, i.e. Many requests coming in at the same time
2 The code to be executed is the same code across these different requests (as opposed to a long tail distribution of many different actions being executed concurrently). In our case “many” would mean “hundreds” or a few thousand.
3 The latency (time to start execution) matters, because human users are waiting for the response. Ideally, at these orders of magnitude of concurrent requests the latency should not change much.

All 3 requirements need to be satisfied for this use case.
In the discussion so far it was mentioned that there are other use cases which might have similar requirements. That’s great and I do not want to rule them out, obviously. The above is just to make it clear from where we are coming from.

At this point I would like to mention that it is my understanding that this use case is within OpenWhisk’s strike zone, i.e. Something that we all think is reasonable to support. Please speak up if you disagree.

The Problem

One can look at the problem in two ways:
Either you keep the resources of the OW system constant (i.e. No scaling). In that case latency increases very quickly as demonstrated by Tyson’s tests.
Or you increase the system’s capacity. In that case the amount of machines to satisfy this use case quickly becomes prohibitively expensive to run for the OW operator – where expensive is defined as “compared to traditional web servers” (in our case a standard Node.js server). Meaning, you need 100-1000 concurrent action containers to serve what can be served by 1 or 2 Node.js containers.

Of course, the proposed solution is not a fundamental “fix” for the above. It would only move the needle ~2 orders of magnitude – so that the current problem would not be a problem in reality anymore (and simply remain as a theoretical problem). For me that would be good enough.

The solution approach

Would not like to comment on the proposed solution’s details (and leave that to Dragos and Tyson). However, it was mentioned that the approach would change the programming model for users:
Our mindset and approach was that we explicitly do not want to change how OpenWhisk exposes itself to users. Meaning, users should still be able to use NPMs, etc. - i.e. this would be an internal implementation detail that is not visible to users. (We can make things more explicit to users and e.g. have them request a special concurrent runtime if we wish to do so – so far we tried to make it transparent to users, though.)

Many thanks
Michael



On 03/07/17 14:48, "Jeremias Werner" <je...@gmail.com>> wrote:

Hi

Thanks for the write-up and the proposal. I think this is a nice idea and
sounds like a nice way of increasing throughput. Reading through the thread
it feels like there are different topics/problems mixed-up and the
discussion is becoming very complex already.

Therefore I would like to suggest that we streamline the discussion a bit,
maybe in a zoom.us session where we first give Tyson and Dragos the chance
to walk through the proposal and clarify questions of the audience. Once we
are all on the same page we could think of a discussion about the benefits
(improved throughput, latency) vs. challenges (resource sharing, crash
model, container lifetime, programming model) on the core of the proposal:
running multiple activations in a single user container. Once we have a
common understanding on that part we could step-up in the architecture and
discuss what's needed on higher components like invoker/load-balancer to
get this integrated.

(I said zoom.us session since I liked the one we had a few weeks ago. It
was efficient and interactive. If you like I could volunteer to setup the
session and/or writing the script/summary)

what do you think?

Many thanks in advance!

Jeremias


On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <ro...@gmail.com>> wrote:

You're discounting with event driven all use cases that are still latency
sensitive because they complete a response by call back or actuation at
completion. IoT, chatbots, notifications, all examples in addition to ui
which are latency sensitive and having uniform expectations on queuing time
is of value.

-r


Re: Improving support for UI driven use cases

Posted by Jeremias Werner <je...@gmail.com>.
Hi

Thanks for the write-up and the proposal. I think this is a nice idea and
sounds like a nice way of increasing throughput. Reading through the thread
it feels like there are different topics/problems mixed-up and the
discussion is becoming very complex already.

Therefore I would like to suggest that we streamline the discussion a bit,
maybe in a zoom.us session where we first give Tyson and Dragos the chance
to walk through the proposal and clarify questions of the audience. Once we
are all on the same page we could think of a discussion about the benefits
(improved throughput, latency) vs. challenges (resource sharing, crash
model, container lifetime, programming model) on the core of the proposal:
running multiple activations in a single user container. Once we have a
common understanding on that part we could step-up in the architecture and
discuss what's needed on higher components like invoker/load-balancer to
get this integrated.

(I said zoom.us session since I liked the one we had a few weeks ago. It
was efficient and interactive. If you like I could volunteer to setup the
session and/or writing the script/summary)

what do you think?

Many thanks in advance!

Jeremias


On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <ro...@gmail.com> wrote:

> You're discounting with event driven all use cases that are still latency
> sensitive because they complete a response by call back or actuation at
> completion. IoT, chatbots, notifications, all examples in addition to ui
> which are latency sensitive and having uniform expectations on queuing time
> is of value.
>
> -r

Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
You're discounting with event driven all use cases that are still latency sensitive because they complete a response by call back or actuation at completion. IoT, chatbots, notifications, all examples in addition to ui which are latency sensitive and having uniform expectations on queuing time is of value.

-r

Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
> On Jul 2, 2017, at 3:05 AM, Markus Thömmes <ma...@me.com> wrote:
> 
> Right, I think the UI workflows are just an example of apps that are latency sensitive in general.
> 
> I had a discussion with Stephen Fink on the matter of detecting ourselves whether an action is latency sensitive by using the blocking parameter or, as mentioned, the user's configuration in terms of web-action vs. non-web action. The conclusion there was that we probably cannot reliably detect latency sensitivity without asking the user to do so. Having such an option has implications on other aspects of the platform: why would one not choose that option?
> 

Because a) your use case is event driven and the client trigger simply doesn’t care about the response or b) you want a guarantee that the activation will be processed even if the client stops listening for the response (e.g. they received a 202 instead of 200 after a timeout)

> To Rodric's points I think there are two topics to speak about and discuss:
> 
> 1. The programming model: The current model encourages users to break their actions apart in "functions" that take payload and return payload. Having a deployment model outlined could as noted encourage users to use OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based applications. The current model is nice in that it solves a lot of problems for the customer in terms of scalability and "crash safeness".
> 

But if you require use of the programming model to always achieve scalability, you prevent use of libraries that may not be ported to that programming model. Consider a npm module that is used to wrap twitter API calls. I use that in my action to produce tweets. Is my only option for making my action scale (better than 1 user : 1 container) to reproduce the npm module in terms of OpenWhisk functions for each HTTP call and compute operation?  


> 2. Raw throughput of our deployment model: Setting the concerns aside, I think it is valid to explore concurrent invocations of actions on the same container. This does not necessarily mean that users start to deploy monolithic apps as noted above, but it certainly could. Keeping our JSON-in/JSON-out at least for now, though, could encourage users to continue to think in functions. Having a toggle per action which is disabled by default might be a good way to start here, since many users might need to change action code to support that notion and for some applications it might not be valid at all. I think it was also already noted that this imposes some of the "old-fashioned" problems on the user, like: How many concurrent requests will my action be able to handle? That kinda defeats the seamless-scalability point of serverless.

I’m not suggesting changing any programming model, only that the programming model stops at the point where I depend on libraries for anything, so relying on the programming model to achieve throughput scalability will not be practical in many cases. I pointed out both that the problems are old-fashioned, yes, and that concurrency is (still) a reasonable way to address them. Also, doing so does not defeat any scalability provisions of the serverless mantra: additional containers can still be started per action, just not *1 per concurrent user*. You still need to provide some estimate of the resource usage of your action. The only difference is that your approach to determining that estimate changes. E.g. if I can estimate that my action operates well at 100 rps with 500 concurrent users, and worse with more concurrent users, then I can configure the system to start more containers once 500 concurrent activations is hit, and stop those containers when concurrency decreases below 500.
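
As a hedged illustration of that sizing rule (the numbers and names are invented for the example, not taken from the proposal), the scaling decision could be as simple as dividing the observed number of in-flight activations by the per-container concurrency the developer says the action handles well:

    // Hypothetical sizing rule: the developer estimates that one container
    // of this action behaves well up to maxConcurrentPerContainer users.
    def desiredContainers(currentConcurrentActivations: Int,
                          maxConcurrentPerContainer: Int): Int = {
      require(maxConcurrentPerContainer > 0)
      // e.g. 1200 in-flight activations / 500 per container => 3 containers;
      // at or below 500 this collapses back to a single container.
      math.max(1, math.ceil(
        currentConcurrentActivations.toDouble / maxConcurrentPerContainer).toInt)
    }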

How do you today estimate resource requirements of actions that are single-user workflows? Maybe that is something we can discuss to clarify how it would be done in a concurrent activation model. 

Thanks
Tyson
> 


Re: Improving support for UI driven use cases

Posted by Markus Thömmes <ma...@me.com>.
Right, I think the UI workflows are just an example of apps that are latency sensitive in general.

I had a discussion with Stephen Fink on the matter of detecting ourselves whether an action is latency sensitive by using the blocking parameter or, as mentioned, the user's configuration in terms of web-action vs. non-web action. The conclusion there was that we probably cannot reliably detect latency sensitivity without asking the user to do so. Having such an option has implications on other aspects of the platform: why would one not choose that option?

To Rodric's points I think there are two topics to speak about and discuss:

1. The programming model: The current model encourages users to break their actions apart in "functions" that take payload and return payload. Having a deployment model outlined could as noted encourage users to use OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based applications. The current model is nice in that it solves a lot of problems for the customer in terms of scalability and "crash safeness".

2. Raw throughput of our deployment model: Setting the concerns aside, I think it is valid to explore concurrent invocations of actions on the same container. This does not necessarily mean that users start to deploy monolithic apps as noted above, but it certainly could. Keeping our JSON-in/JSON-out at least for now, though, could encourage users to continue to think in functions. Having a toggle per action which is disabled by default might be a good way to start here, since many users might need to change action code to support that notion and for some applications it might not be valid at all. I think it was also already noted that this imposes some of the "old-fashioned" problems on the user, like: How many concurrent requests will my action be able to handle? That kinda defeats the seamless-scalability point of serverless.
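
One hedged sketch of that per-action toggle, disabled by default (the annotation key below is invented for illustration and is not an existing OpenWhisk annotation):

    import scala.util.Try

    // Hypothetical: read an opt-in annotation from the action's metadata.
    // Absent, unparsable, or <= 1 means today's behavior: one activation
    // per container at a time.
    def effectiveConcurrencyLimit(annotations: Map[String, String]): Int =
      annotations.get("max-concurrent")
        .flatMap(v => Try(v.toInt).toOption)
        .filter(_ > 1)
        .getOrElse(1)

So effectiveConcurrencyLimit(Map("max-concurrent" -> "100")) would yield 100, while an un-annotated action keeps a limit of 1 and behaves exactly as it does today.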

Cheers,
Markus

On Jul 2, 2017, at 10:42, Rodric Rabbah <ro...@gmail.com> wrote:

The thoughts I shared around how to realize better packing with intrinsic actions are aligned with your goals: getting more compute density with a smaller number of machines. This is a very worthwhile goal.

I noted earlier that packing more activations into a single container warrants a different resource manager with its own container life cycle management (e.g., it's almost at the level of: provision a container for me quickly and let me have it to run my monolithic code for as long as I want). 

Already some challenges were mentioned, wrt sharing state, resource leaks and possible data races. Perhaps defining the resource isolation model intra container - processes, threads, "node vm", ... - is helpful as you refine your proposal. This can address how one might deal with intra container noisy neighbors as well. 

Hence in terms of resource management at the platform level, I think it would be a mistake to treat intra-container concurrency the same way as ephemeral activations that are run and done. Once the architecture and scheduler support a heterogeneous mix of resources, then treating some actions as intrinsic operations becomes easier to realize; in other words, complementary to the overall proposed direction if the architecture is done right.

To Alex's point, when you're optimizing for latency, you don't need to be constrained to UI applications. Maybe this is more of a practical motivation based on your workloads.

-r

On Jul 2, 2017, at 2:32 AM, Dascalita Dragos <dd...@gmail.com> wrote:

I think the opportunities for packing computation at finer granularity
will be there. In your approach you're tending, it seems, toward taking
monolithic codes and overlapping their computation. I tend to think this
will work better with another approach.

+1 to making the serverless system smarter in managing and running the code
at scale. I don't think the current state is there right now. There are
limitations which could be improved by simply allowing developers to
control which action can be invoked concurrently. We could also consider
designing the system to "learn" this intent by observing how the action is
configured by the developer: if it's an HTTP endpoint, or an event handler.

As long as today we can improve the performance by allowing concurrency in
actions, and by invoking them faster, why would we not benefit from this
now, and update the implementation later, once the system improves ? Or are
there better ways available now to match this performance that are not
captured in the proposal ?

Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
The thoughts I shared around how to realize better packing with intrinsic actions are aligned with your goals: getting more compute density with a smaller number of machines. This is a very worthwhile goal.

I noted earlier that packing more activations into a single container warrants a different resource manager with its own container life cycle management (e.g., it's almost at the level of: provision a container for me quickly and let me have it to run my monolithic code for as long as I want). 

Already some challenges were mentioned, wrt sharing state, resource leaks and possible data races. Perhaps defining the resource isolation model intra container - processes, threads, "node vm", ... - is helpful as you refine your proposal. This can address how one might deal with intra container noisy neighbors as well. 

Hence, in terms of resource management at the platform level, I think it would be a mistake to treat intra container concurrency the same way as ephemeral activations that are run and done. Once the architecture and scheduler support a heterogeneous mix of resources, then treating some actions as intrinsic operations becomes easier to realize; in other words, it is complementary to the overall proposed direction if the architecture is done right.

To Alex's point, when you're optimizing for latency, you don't need to be constrained to UI applications. Maybe this is more of a practical motivation based on your workloads.

-r

On Jul 2, 2017, at 2:32 AM, Dascalita Dragos <dd...@gmail.com> wrote:

>> I think the opportunities for packing computation at finer granularity
> will be there. In your approach you're tending, it seems, toward taking
> monolithic codes and overlapping their computation. I tend to think this
> will work better with another approach.
> 
> +1 to making the serverless system smarter in managing and running the code
> at scale. I don't think the current state is there right now. There are
> limitations which could be improved by simply allowing developers to
> control which action can be invoked concurrently. We could also consider
> designing the system to "learn" this intent by observing how the action is
> configured by the developer: if it's an HTTP endpoint, or an event handler.
> 
> As long as we can improve performance today by allowing concurrency in
> actions, and by invoking them faster, why would we not benefit from this
> now and update the implementation later, once the system improves? Or are
> there better ways available now to match this performance that are not
> captured in the proposal?

Re: Improving support for UI driven use cases

Posted by Dascalita Dragos <dd...@gmail.com>.
>  I think the opportunities for packing computation at finer granularity
will be there. In your approach you're tending, it seems, toward taking
monolithic codes and overlapping their computation. I tend to think this
will work better with another approach.

+1 to making the serverless system smarter in managing and running the code
at scale. I don't think the current state is there right now. There are
limitations which could be improved by simply allowing developers to
control which action can be invoked concurrently. We could also consider
designing the system to "learn" this intent by observing how the action is
configured by the developer: if it's an HTTP endpoint, or an event handler.

As long as we can improve performance today by allowing concurrency in
actions, and by invoking them faster, why would we not benefit from this
now and update the implementation later, once the system improves? Or are
there better ways available now to match this performance that are not
captured in the proposal?
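
For illustration only, here is a minimal sketch of how such developer intent could be declared today, using the openwhisk npm client and action annotations. The 'max-concurrency' key is hypothetical (nothing in OpenWhisk acts on it yet), and it assumes the client accepts a full action object here; 'web-export' is the existing web action annotation, so together they capture the "HTTP endpoint, concurrency tolerated" intent:

    // Sketch: record developer intent on the action itself.
    // 'max-concurrency' is a hypothetical annotation; 'web-export' is real.
    const openwhisk = require('openwhisk');
    const ow = openwhisk({ apihost: 'openwhisk.example.com', api_key: process.env.OW_KEY });

    const code = 'function main(params) { return { ok: true }; }';

    ow.actions.update({
      name: 'render',
      action: {
        exec: { kind: 'nodejs:6', code: code },
        annotations: [
          { key: 'web-export', value: true },      // exposed as an HTTP endpoint -> latency sensitive
          { key: 'max-concurrency', value: 100 }   // hypothetical per-container concurrency hint
        ]
      }
    }).then(function () { console.log('action updated'); });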


On Sat, Jul 1, 2017 at 10:29 PM Alex Glikson <GL...@il.ibm.com> wrote:

> My main point is - interactive Web applications are certainly not the only
> case which is sensitive to latency (or throughput) under variable load.
> Think of an event where a person presses the 'emergency' button in an elevator,
> and we need to respond immediately (it might be even more important than
> occasionally getting a timeout on a web page). So, ideally, the solution
> should address *any* (or as many as possible of) such applications.
>
> Regards,
> Alex
>
>
>
> From:   Tyson Norris <tn...@adobe.com.INVALID>
> To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
> Date:   02/07/2017 01:35 AM
> Subject:        Re: Improving support for UI driven use cases
>
>
>
>
> > On Jul 1, 2017, at 2:07 PM, Alex Glikson <GL...@il.ibm.com> wrote:
> >
> >> a burst of users will quickly exhaust the system, which is only fine
> for
> > event handling cases, and not fine at all for UI use cases.
> >
> > Can you explain why it is fine for event handling cases?
> > I would assume that the key criteria would be, for example, around
> > throughput and/or latency (and their tradeoff with capacity), and not
> > necessarily the nature of the application per se.
> >
> > Regards,
> > Alex
>
> Sure - with event handling, where blocking=false, or where a timeout
> response of 202 (and fetch the response later) is tolerable,  exhausting
> container resources will simply mean that the latency goes up based on the
> number of events generated after the point of saturation.  If you can only
> process 100 events at one time, an arrival of 1000 events at the same time
> means that the second 100 events will only be processed after the first
> 100 (twice normal latency), third 100 events after that (3 times normal
> latency), 4th 100 events after that (4 times normal latency) etc. But if
> no user is sitting at a browser waiting for a response, it is unlikely
> they care whether the processing occurs 10ms or 10min after the triggering
> event. (This is exaggerating, but you get the point)
>
> In the case where a user is staring at a browser waiting for a response,
> such variance in latency - driven purely by the raw number of users in the
> system mapping one-to-one to the raw number of containers in the system -
> will not be usable. Consider concurrency not as a panacea for exhausting
> container pool resources, but rather as a way to dampen the relationship
> between user traffic growth and required container pool growth, making it
> something like 1000:1 (1000 concurrent users require 1 container) instead
> of 1:1.
>
> Thanks
> Tyson
>
>
>
>
>

Re: Improving support for UI driven use cases

Posted by Alex Glikson <GL...@il.ibm.com>.
My main point is - interactive Web applications are certainly not the only
case which is sensitive to latency (or throughput) under variable load.
Think of an event where a person presses the 'emergency' button in an elevator,
and we need to respond immediately (it might be even more important than
occasionally getting a timeout on a web page). So, ideally, the solution
should address *any* (or as many as possible of) such applications.

Regards,
Alex



From:   Tyson Norris <tn...@adobe.com.INVALID>
To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
Date:   02/07/2017 01:35 AM
Subject:        Re: Improving support for UI driven use cases




> On Jul 1, 2017, at 2:07 PM, Alex Glikson <GL...@il.ibm.com> wrote:
> 
>> a burst of users will quickly exhaust the system, which is only fine 
for 
> event handling cases, and not fine at all for UI use cases.
> 
> Can you explain why it is fine for event handling cases?
> I would assume that the key criteria would be, for example, around 
> throughput and/or latency (and their tradeoff with capacity), and not 
> necessarily the nature of the application per se.
> 
> Regards,
> Alex

Sure - with event handling, where blocking=false, or where a timeout 
response of 202 (and fetch the response later) is tolerable,  exhausting 
container resources will simply mean that the latency goes up based on the 
number of events generated after the point of saturation.  If you can only 
process 100 events at one time, an arrival of 1000 events at the same time 
means that the second 100 events will only be processed after the first 
100 (twice normal latency), third 100 events after that (3 times normal 
latency), 4th 100 events after that (4 times normal latency) etc. But if 
no user is sitting at a browser waiting for a response, it is unlikely 
they care whether the processing occurs 10ms or 10min after the triggering 
event. (This is exaggerating, but you get the point)

In the case where a user is staring at a browser waiting for a response,
such variance in latency - driven purely by the raw number of users in the
system mapping one-to-one to the raw number of containers in the system -
will not be usable. Consider concurrency not as a panacea for exhausting
container pool resources, but rather as a way to dampen the relationship
between user traffic growth and required container pool growth, making it
something like 1000:1 (1000 concurrent users require 1 container) instead
of 1:1.

Thanks
Tyson





Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
> On Jul 1, 2017, at 2:07 PM, Alex Glikson <GL...@il.ibm.com> wrote:
> 
>> a burst of users will quickly exhaust the system, which is only fine for 
> event handling cases, and not fine at all for UI use cases.
> 
> Can you explain why it is fine for event handling cases?
> I would assume that the key criteria would be, for example, around 
> throughput and/or latency (and their tradeoff with capacity), and not 
> necessarily the nature of the application per se.
> 
> Regards,
> Alex

Sure - with event handling, where blocking=false, or where a timeout response of 202 (and fetch the response later) is tolerable,  exhausting container resources will simply mean that the latency goes up based on the number of events generated after the point of saturation.  If you can only process 100 events at one time, an arrival of 1000 events at the same time means that the second 100 events will only be processed after the first 100 (twice normal latency), third 100 events after that (3 times normal latency), 4th 100 events after that (4 times normal latency) etc. But if no user is sitting at a browser waiting for a response, it is unlikely they care whether the processing occurs 10ms or 10min after the triggering event. (This is exaggerating, but you get the point)

In the case where a user is staring at a browser waiting for a response, such variance in latency - driven purely by the raw number of users in the system mapping one-to-one to the raw number of containers in the system - will not be usable. Consider concurrency not as a panacea for exhausting container pool resources, but rather as a way to dampen the relationship between user traffic growth and required container pool growth, making it something like 1000:1 (1000 concurrent users require 1 container) instead of 1:1.
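
To put rough numbers on that dampening, here is a back-of-the-envelope sketch (the 200ms activation time and the 100x per-container concurrency are made-up figures, not measurements):

    // Back-of-the-envelope: B activations arrive in a burst, each takes T ms,
    // and the pool can run C activations at once. The last user waits roughly
    // ceil(B / C) * T ms.
    function worstCaseLatencyMs(burst, capacity, activationMs) {
      return Math.ceil(burst / capacity) * activationMs;
    }

    var activationMs = 200;
    // 100 single-activation containers: capacity 100
    console.log(worstCaseLatencyMs(1000, 100, activationMs));        // 2000 ms
    // the same 100 containers, each allowing ~100 concurrent activations
    console.log(worstCaseLatencyMs(1000, 100 * 100, activationMs));  // 200 ms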

Thanks
Tyson

Re: Improving support for UI driven use cases

Posted by Alex Glikson <GL...@il.ibm.com>.
> a burst of users will quickly exhaust the system, which is only fine for 
event handling cases, and not fine at all for UI use cases.

Can you explain why it is fine for event handling cases?
I would assume that the key criteria would be, for example, around 
throughput and/or latency (and their tradeoff with capacity), and not 
necessarily the nature of the application per se.

Regards,
Alex



From:   Tyson Norris <tn...@adobe.com.INVALID>
To:     "dev@openwhisk.apache.org" <de...@openwhisk.apache.org>
Date:   01/07/2017 10:18 PM
Subject:        Re: Improving support for UI driven use cases



Sure - what I mean is that once N containers are launched and servicing N
activations, the N+1th activation is queued and processed sequentially
after one of the previous activations. And N is directly related to
concurrent users (and actions), so a burst of users will quickly exhaust
the system, which is only fine for event handling cases, and not fine at
all for UI use cases.

So "sequential" is not quite accurate, but once concurrent activations max
out the container pool, it behaves as a queue compared to a system that
concurrently processes activations in a single container - which will have
its own point of exhaustion admittedly, but I think it is quite common,
for example, to run nodejs applications that happily serve hundreds or
thousands of concurrent users, so we are talking about adding orders of
magnitude to the number of concurrent users that can be handled using the
same pool of resources.

Thanks
Tyson




> On Jul 1, 2017, at 11:41 AM, Rodric Rabbah <ro...@gmail.com> wrote:
> 
>> the concurrency issue is currently entangled with the controller
> discussion, because sequential processing is enforced
> 
> how so? if you invoke N actions they don't run sequentially - each is its
> own activation, unless you actually invoke a sequence. Can you clarify
> this point?
> 
> -r






Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
> I'm not sure how you would split out these network vs compute items without action devs taking that responsibility (and not using libraries) or how it would be done generically across runtimes.

You don't think this is already happening? When you use promises and chain promises together, you've already decomposed your computation into smaller operations. It is precisely this that makes the asynchronous model of computing work - even in a single threaded runtime. I think this can be exploited for a serverless polyglot composition. That in itself is a separate topic from the one you raised initially. I think the opportunities for packing computation at finer granularity will be there. In your approach you're tending, it seems, toward taking monolithic codes and overlapping their computation. I tend to think this will work better with another approach. In either case, figuring out how to manage several granularities of concurrent activations applies.
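
(As a concrete illustration of the "already decomposed" point, a typical nodejs action is a network step chained to a compute step; the action shape below is a made-up example, and the field names are hypothetical:)

    // The .then() on the network call is the natural seam where the
    // computation could be split into an I/O half and a compute half.
    const request = require('request-promise');

    function main(params) {
      return request({ uri: params.profileUrl, json: true })   // I/O-bound half: could be packed densely
        .then(function (profile) {
          return { score: profile.visits * 2 };                 // compute-bound half: runs on the result
        });
    }

Split along that seam, the two halves could also be deployed as separate actions and composed with the existing sequence support (e.g. wsk action create score-user --sequence fetch-profile,score-profile, with hypothetical action names), which is roughly the shape the intrinsic-actions idea points at.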

-r

Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.

On Jul 1, 2017, at 3:31 PM, Rodric Rabbah <ro...@gmail.com> wrote:

>> I’m not sure it would be worth it to force developers down a path of configuring actions based on the network ops of the code within, compared to simply allowing concurrency. 
> 
> I think it will happen naturally: imagine a sequence where the first operation is a request and the rest is the processing. All such request ops can be packed densely. It's a way of increasing compute density without exposing users to details. It was just an example of where I can see this paying off. If all the actions are compute heavy, there won't be any concurrency in a nodejs runtime. Also among the thoughts I have is how one can generalize this to apply to other language runtimes.
> 

Decomposing a set of network operations into a sequence of requests and operations implies not using any dependencies that use network operations. One convenience of allowing existing languages (instead of a language of sequenced steps) is that devs can use existing libraries that implement these operations in conventional ways. So while it may be possible to provide new ways to organize network and compute operations to maximize efficiency, there won't be a lot of ways to use conventional libraries without scaling based on conventional methods, like concurrent access to runtimes.

I'm not sure how you would split out these network vs compute items without action devs taking that responsibility (and not using libraries) or how it would be done generically across runtimes.



Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
> I’m not sure it would be worth it to force developers down a path of configuring actions based on the network ops of the code within, compared to simply allowing concurrency. 

I think it will happen naturally: imagine a sequence where the first operation is a request and the rest is the processing. All such request ops can be packed densely. It's a way of increasing compute density without exposing users to details. It was just an example of where I can see this paying off. If all the actions are compute heavy, there won't be any concurrency in a nodejs runtime. Also among the thoughts I have is how one can generalize this to apply to other language runtimes.

-r

Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
> On Jul 1, 2017, at 1:51 PM, Rodric Rabbah <ro...@gmail.com> wrote:
> 
> 
>> it is quite common, for example to run nodejs applications that happily serve hundreds or thousands of concurrent users, 
> 
> I can see opportunities for treating certain actions as intrinsic for which this kind of gain can be realized. Specifically, actions which perform network operations and then compute on the result. One can split out the asynchronous I/O and pack it into a more granular concurrency engine. Intuitively, I think it's more likely to benefit this way and protect against concurrency bugs and the like by using higher level action primitives.
> 
> -r

I’m not sure what you’re getting at - are you suggesting splitting the network operations out and running them “not as actions” or “separate from the compute portion of the action”? I’m sure that would be possible, but it would seem like diminished returns in complexity for 90% of cases where “computing on the result” is lightweight compared to waiting for IO completion. I’m not sure it would be worth it to force developers down a path of configuring actions based on the network ops of the code within, compared to simply allowing concurrency.

Thanks
Tyson


Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
> it is quite common, for example to run nodejs applications that happily serve hundreds or thousands of concurrent users, 

I can see opportunities for treating certain actions as intrinsic for which this kind of gain can be realized. Specifically, actions which perform network operations and then compute on the result. One can split out the asynchronous I/O and pack it into a more granular concurrency engine. Intuitively, I think it's more likely to benefit this way and protect against concurrency bugs and the like by using higher level action primitives.

-r

Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
This is a general problem with finite capacity. Increasing parallelism inside the container is attractive from your point of view because it's a way to increase compute density, but it has its downsides. In any case: lacking the ability to elastically scale and add new capacity, the system will always be subject to queuing.

-r

> On Jul 1, 2017, at 3:17 PM, Tyson Norris <tn...@adobe.com.INVALID> wrote:
> 
> Sure - what I mean is that once N containers are launched and servicing N activations, the N+1th activation is queued and processed sequentially after one of the previous activations. And N is directly related to concurrent users (and actions), so a burst of users will quickly exhaust the system, which is only fine for event handling cases, and not fine at all for UI use cases.
> 
> So “sequential” is not quite accurate, but once concurrent activations max out the container pool, it behaves as a queue compared to a system that concurrently processes activations in a single container - which will have its own point of exhaustion admittedly, but I think it is quite common, for example, to run nodejs applications that happily serve hundreds or thousands of concurrent users, so we are talking about adding orders of magnitude to the number of concurrent users that can be handled using the same pool of resources.
> 
> Thanks
> Tyson

Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
Sure - what I mean is that once N containers are launched and servicing N activations, the N+1th activation is queued and processed sequentially after one of the previous activations. And N is directly related to concurrent users (and actions), so a burst of users will quickly exhaust the system, which is only fine for event handling cases, and not fine at all for UI use cases.

So “sequential” is not quite accurate, but once concurrent activations max out the container pool, it behaves as a queue compared to a system that concurrently processes activations in a single container - which will have its own point of exhaustion admittedly, but I think it is quite common, for example, to run nodejs applications that happily serve hundreds or thousands of concurrent users, so we are talking about adding orders of magnitude to the number of concurrent users that can be handled using the same pool of resources.
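
(For reference, the kind of concurrency being compared to - a single node process interleaving many in-flight, I/O-bound requests - in a minimal sketch, where the 50ms timer stands in for a downstream call:)

    // One node process, many overlapping requests: while one request waits on
    // I/O (simulated by the timer), the event loop keeps serving the others.
    const http = require('http');

    let inFlight = 0;

    http.createServer(function (req, res) {
      inFlight++;
      setTimeout(function () {                           // stand-in for a downstream API call
        res.end(JSON.stringify({ inFlight: inFlight })); // shows how many requests overlapped
        inFlight--;
      }, 50);
    }).listen(8080);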

Thanks
Tyson




> On Jul 1, 2017, at 11:41 AM, Rodric Rabbah <ro...@gmail.com> wrote:
> 
>> the concurrency issue is currently entangled with the controller
> discussion, because sequential processing is enforced
> 
> how so? if you invoke N actions they don't run sequentially - each is its
> own activation, unless you actually invoke a sequence. Can you clarify
> this point?
> 
> -r


Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
> the concurrency issue is currently entangled with the controller
discussion, because sequential processing is enforced

how so? if you invoke N actions they don't run sequentially - each is its
own activation, unless you actually invoke a sequence. Can you clarify
this point?

-r

Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
Point taken on the broad topic :)  On that, I want to point out that the concurrency issue is currently entangled with the controller discussion, because sequential processing is enforced (and therefore concurrent processing prevented) at multiple layers, so for better or worse, it is a bit tricky to talk about them completely in isolation. That said, I agree it will be good to get some focus on the individual items - I will make an attempt to itemize this list and the components implicated for each. 

Thanks!
Tyson


> On Jul 1, 2017, at 10:41 AM, Markus Thömmes <ma...@me.com> wrote:
> 
> Thanks for the veeeery detailed writeup! Great job!
> 
> One thing I'd note: As Rodric pointed out we should break the issues apart and address them one by one.
> 
> For instance, the proposed loadbalancer changes (Controllers know of Containers downstream) are desirable for every workload and not necessarily bound to the core of the proposal, which I'd say is the concurrency discussion. I agree that 100% warm container usage is crucial there, but every workload will benefit from high container reuse.
> 
> Just not to get the discussion too broad and unfocused.
> 
> On the topic itself: I think it's a great idea to further push the performance and, as pointed out, make operating OpenWhisk more efficient. Most issues I see have already been pointed out, so I won't repeat them. But I'm quite sure we can figure those out.
> 
> Have a great weekend!
> 
> Von meinem iPhone gesendet
> 
>> Am 01.07.2017 um 19:26 schrieb Tyson Norris <tn...@adobe.com.INVALID>:
>> 
>> RE: separate policies - I agree that it would make sense for separating container pools in some way by “event driven” and “ui driven” - I don’t think anything precludes that from happening, but it’s different from the notion of “long running” and “short running”. e.g. if events flow into the system, the container would be equally long running as if users are continually using the system. I’m not suggesting changing the cold/warm start behavior, rather the response and concurrency behavior, to be more in line with UI driven use cases. In fact the “first user” experience would be exactly the same; it’s the “99 users that arrive before the first user is complete” experience that would be different. (It may also be appealing to have a notion of “prehot” containers, but I think existing “prewarm” is good enough for many cases). If it's useful to cap the pool usage for either of these cases, nothing prevents that from happening, but I would start with a (simpler) case where there is a single pool that supports both usages - currently there is no pool that reliably supports the ui case, since “bursty” traffic is immediately excessively latent.
>> 
>> RE: concurrent requests resource usage: I would argue that you would determine resource allocation in (nearly) the same way you should with single-tasked containers. i.e. the only pragmatic way to estimate resource usage is to measure it. In single task case, you might use curl to simulate a single user. In concurrent tasks case, you might use wrk or gatling (or something else) to simulate multiple users. Regardless of the tool, analyzing code etc will not get close enough to accurate measurements compared to empirical testing.
>> 
>> RE: motivation compared to having a number of hot containers - efficiency of resource usage for the *OpenWhisk operator*. No one will be able to afford to run OpenWhisk if they have to run 100 containers *per action* to support a burst of 100 users using any particular action. Consider a burst of 1000 or 10000 users, and 1000 actions. If a single container can handle the burst of 100 users, it will solve a lot of low-medium use cases efficiently, and in the case of 10000 users, running 100 containers will be more efficient than the 10000 containers you would have to run as single-task containers.
>> 
>> WDYT?
>> 
>> Thanks for the feedback!
>> Tyson
>> 
>> 
>> On Jul 1, 2017, at 9:36 AM, Alex Glikson <GL...@il.ibm.com>> wrote:
>> 
>> Having different policies for different container pools certainly makes
>> sense. Moreover, enhancing the design/implementation so that there is more
>> concurrency and fewer bottlenecks also sounds like an excellent idea.
>> However, I am unsure specifically regarding the idea of handling multiple
>> requests concurrently by the same container. For example, I wonder how one
>> would determine the desired resource allocation for such a container?
>> Wouldn't this re-introduce issues related to sizing, scaling and
>> fragmentation of resources - nicely avoided with single-tasked containers?
>> Also, I wonder what would be the main motivation to implement such a
>> policy compared to just having a number of hot containers, ready to
>> process incoming requests?
>> 
>> Regards,
>> Alex
>> 
>> 
>> 
>> From:   Rodric Rabbah <ro...@gmail.com>>
>> To:     dev@openwhisk.apache.org<ma...@openwhisk.apache.org>
>> Cc:     Dragos Dascalita Haut <dd...@adobe.com>>
>> Date:   01/07/2017 06:56 PM
>> Subject:        Re: Improving support for UI driven use cases
>> 
>> 
>> 
>> Summarizing the wiki notes:
>> 
>> 1. separate control and data plane so that data plane is routed directly
>> to
>> the container
>> 2. desire multiple concurrent function activations in the same container
>> 
>> On 1, I think this is in line with an outstanding desire and corresponding
>> issues to take the data flow out of the system and off the control message
>> critical path. As you pointed out, this has a lot of benefits - including
>> one you didn't mention: streaming (web socket style) in/out of the
>> container. Related issues although not complete in [1][2].
>> 
>> On 2, I think you are starting to see some of the issues as you think
>> through the limits on the action and its container and what that means for
>> the user flow and experience. Both in terms of "time limit" and "memory
>> limit". I think the logging issue can be solved to disentangle
>> activations.
>> But I also wonder if these are going to be longer running "actions" and
>> hence, the model is different: short running vs long running container for
>> which there are different life cycles and hence different scheduling
>> decisions from different container pools.
>> 
>> [1] https://github.com/apache/incubator-openwhisk/issues/788
>> [2] https://github.com/apache/incubator-openwhisk/issues/254
>> 


Re: Improving support for UI driven use cases

Posted by Markus Thömmes <ma...@me.com>.
Thanks for the veeeery detailed writeup! Great job!

One thing I'd note: As Rodric pointed out we should break the issues apart and address them one by one.

For instance, the proposed loadbalancer changes (Controllers know of Containers downstream) are desirable for every workload and not necessarily bound to the core of the proposal, which I'd say is the concurrency discussion. I agree that 100% warm container usage is crucial there, but every workload will benefit from high container reuse.

Just not to get the discussion too broad and unfocused.

On the topic itself: I think it's a great idea to further push the performance and, as pointed out, make operating OpenWhisk more efficient. Most issues I see have already been pointed out, so I won't repeat them. But I'm quite sure we can figure those out.

Have a great weekend!

Von meinem iPhone gesendet

> Am 01.07.2017 um 19:26 schrieb Tyson Norris <tn...@adobe.com.INVALID>:
> 
> RE: separate policies - I agree that it would make sense for separating container pools in some way by “event driven” and “ui driven” - I don’t think anything precludes that from happening, but it’s different from the notion of “long running” and “short running”. e.g. if events flow into the system, the container would be equally long running as if users are continually using the system. I’m not suggesting changing the cold/warm start behavior, rather the response and concurrency behavior, to be more in line with UI driven use cases. In fact the “first user” experience would be exactly the same; it’s the “99 users that arrive before the first user is complete” experience that would be different. (It may also be appealing to have a notion of “prehot” containers, but I think existing “prewarm” is good enough for many cases). If it's useful to cap the pool usage for either of these cases, nothing prevents that from happening, but I would start with a (simpler) case where there is a single pool that supports both usages - currently there is no pool that reliably supports the ui case, since “bursty” traffic is immediately excessively latent.
> 
> RE: concurrent requests resource usage: I would argue that you would determine resource allocation in (nearly) the same way you should with single-tasked containers. i.e. the only pragmatic way to estimate resource usage is to measure it. In single task case, you might use curl to simulate a single user. In concurrent tasks case, you might use wrk or gatling (or something else) to simulate multiple users. Regardless of the tool, analyzing code etc will not get close enough to accurate measurements compared to empirical testing.
> 
> RE: motivation compared to having a number of hot containers - efficiency of resource usage for the *OpenWhisk operator*. No one will be able to afford to run OpenWhisk if they have to run 100 containers *per action* to support a burst of 100 users using any particular action. Consider a burst of 1000 or 10000 users, and 1000 actions. If a single container can handle the burst of 100 users, it will solve a lot of low-medium use cases efficiently, and in the case of 10000 users, running 100 containers will be more efficient than the 10000 containers you would have to run as single-task containers.
> 
> WDYT?
> 
> Thanks for the feedback!
> Tyson
> 
> 
> On Jul 1, 2017, at 9:36 AM, Alex Glikson <GL...@il.ibm.com>> wrote:
> 
> Having different policies for different container pools certainly makes
> sense. Moreover, enhancing the design/implementation so that there is more
> concurrency and fewer bottlenecks also sounds like an excellent idea.
> However, I am unsure specifically regarding the idea of handling multiple
> requests concurrently by the same container. For example, I wonder how one
> would determine the desired resource allocation for such a container?
> Wouldn't this re-introduce issues related to sizing, scaling and
> fragmentation of resources - nicely avoided with single-tasked containers?
> Also, I wonder what would be the main motivation to implement such a
> policy compared to just having a number of hot containers, ready to
> process incoming requests?
> 
> Regards,
> Alex
> 
> 
> 
> From:   Rodric Rabbah <ro...@gmail.com>>
> To:     dev@openwhisk.apache.org<ma...@openwhisk.apache.org>
> Cc:     Dragos Dascalita Haut <dd...@adobe.com>>
> Date:   01/07/2017 06:56 PM
> Subject:        Re: Improving support for UI driven use cases
> 
> 
> 
> Summarizing the wiki notes:
> 
> 1. separate control and data plane so that data plane is routed directly
> to
> the container
> 2. desire multiple concurrent function activations in the same container
> 
> On 1, I think this is in line with an outstanding desire and corresponding
> issues to take the data flow out of the system and off the control message
> critical path. As you pointed out, this has a lot of benefits - including
> one you didn't mention: streaming (web socket style) in/out of the
> container. Related issues although not complete in [1][2].
> 
> On 2, I think you are starting to see some of the issues as you think
> through the limits on the action and its container and what that means for
> the user flow and experience. Both in terms of "time limit" and "memory
> limit". I think the logging issue can be solved to disentangle
> activations.
> But I also wonder if these are going to be longer running "actions" and
> hence, the model is different: short running vs long running container for
> which there are different life cycles and hence different scheduling
> decisions from different container pools.
> 
> [1] https://github.com/apache/incubator-openwhisk/issues/788
> [2] https://github.com/apache/incubator-openwhisk/issues/254
> 

Re: Improving support for UI driven use cases

Posted by Tyson Norris <tn...@adobe.com.INVALID>.
RE: separate policies - I agree that it would make sense for separating container pools in some way by “event driven” and “ui driven” - I don’t think anything precludes that from happening, but it’s different from the notion of “long running” and “short running”. e.g. if events flow into the system, the container would be equally long running as if users are continually using the system. I’m not suggesting changing the cold/warm start behavior, rather the response and concurrency behavior, to be more in line with UI driven use cases. In fact the “first user” experience would be exactly the same; it’s the “99 users that arrive before the first user is complete” experience that would be different. (It may also be appealing to have a notion of “prehot” containers, but I think existing “prewarm” is good enough for many cases). If it's useful to cap the pool usage for either of these cases, nothing prevents that from happening, but I would start with a (simpler) case where there is a single pool that supports both usages - currently there is no pool that reliably supports the ui case, since “bursty” traffic is immediately excessively latent.

RE: concurrent requests resource usage: I would argue that you would determine resource allocation in (nearly) the same way you should with single-tasked containers. i.e. the only pragmatic way to estimate resource usage is to measure it. In single task case, you might use curl to simulate a single user. In concurrent tasks case, you might use wrk or gatling (or something else) to simulate multiple users. Regardless of the tool, analyzing code etc will not get close enough to accurate measurements compared to empirical testing.
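
(A minimal stand-in for the wrk/gatling style of measurement - fire N requests at once against a web action and look at the latency spread; the URL and the burst size below are placeholders:)

    // Fire `n` concurrent requests at an action endpoint and report rough
    // latency percentiles, to size memory/CPU empirically rather than by analysis.
    const https = require('https');

    function timeOne(url) {
      return new Promise(function (resolve) {
        const start = Date.now();
        https.get(url, function (res) {
          res.resume();                                              // drain the body
          res.on('end', function () { resolve(Date.now() - start); });
        });
      });
    }

    function burst(url, n) {
      const requests = [];
      for (let i = 0; i < n; i++) { requests.push(timeOne(url)); }
      return Promise.all(requests).then(function (latencies) {
        latencies.sort(function (a, b) { return a - b; });
        console.log('p50', latencies[Math.floor(n * 0.5)], 'ms,',
                    'p95', latencies[Math.floor(n * 0.95)], 'ms');
      });
    }

    burst('https://openwhisk.example.com/api/v1/web/guest/default/render.json', 100);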

RE: motivation compared to having a number of hot containers - efficiency of resource usage for the *OpenWhisk operator*. No one will be able to afford to run OpenWhisk if they have to run 100 containers *per action* to support a burst of 100 users using any particular action. Consider a burst of 1000 or 10000 users, and 1000 actions. If a single container can handle the burst of 100 users, it will solve a lot of low-medium use cases efficiently, and in the case of 10000 users, running 100 containers will be more efficient than the 10000 containers you would have to run as single-task containers.

WDYT?

Thanks for the feedback!
Tyson


On Jul 1, 2017, at 9:36 AM, Alex Glikson <GL...@il.ibm.com>> wrote:

Having different policies for different container pools certainly makes
sense. Moreover, enhancing the design/implementation so that there is more
concurrency and fewer bottlenecks also sounds like an excellent idea.
However, I am unsure specifically regarding the idea of handling multiple
requests concurrently by the same container. For example, I wonder how one
would determine the desired resource allocation for such a container?
Wouldn't this re-introduce issues related to sizing, scaling and
fragmentation of resources - nicely avoided with single-tasked containers?
Also, I wonder what would be the main motivation to implement such a
policy compared to just having a number of hot containers, ready to
process incoming requests?

Regards,
Alex



From:   Rodric Rabbah <ro...@gmail.com>>
To:     dev@openwhisk.apache.org<ma...@openwhisk.apache.org>
Cc:     Dragos Dascalita Haut <dd...@adobe.com>>
Date:   01/07/2017 06:56 PM
Subject:        Re: Improving support for UI driven use cases



Summarizing the wiki notes:

1. separate control and data plane so that data plane is routed directly
to
the container
2. desire multiple concurrent function activations in the same container

On 1, I think this is in line with an outstanding desire and corresponding
issues to take the data flow out of the system and off the control message
critical path. As you pointed out, this has a lot of benefits - including
one you didn't mention: streaming (web socket style) in/out of the
container. Related issues although not complete in [1][2].

On 2, I think you are starting to see some of the issues as you think
through the limits on the action and its container and what that means for
the user flow and experience. Both in terms of "time limit" and "memory
limit". I think the logging issue can be solved to disentangle
activations.
But I also wonder if these are going to be longer running "actions" and
hence, the model is different: short running vs long running container for
which there are different life cycles and hence different scheduling
decisions from different container pools.

[1] https://github.com/apache/incubator-openwhisk/issues/788
[2] https://github.com/apache/incubator-openwhisk/issues/254


Re: Improving support for UI driven use cases

Posted by Alex Glikson <GL...@il.ibm.com>.
Having different policies for different container pools certainly makes 
sense. Moreover, enhancing the design/implementation so that there is more 
concurrency and fewer bottlenecks also sounds like an excellent idea.
However, I am unsure specifically regarding the idea of handling multiple 
requests concurrently by the same container. For example, I wonder how one 
would determine the desired resource allocation for such a container?
Wouldn't this re-introduce issues related to sizing, scaling and 
fragmentation of resources - nicely avoided with single-tasked containers? 
Also, I wonder what would be the main motivation to implement such a 
policy compared to just having a number of hot containers, ready to 
process incoming requests?

Regards,
Alex



From:   Rodric Rabbah <ro...@gmail.com>
To:     dev@openwhisk.apache.org
Cc:     Dragos Dascalita Haut <dd...@adobe.com>
Date:   01/07/2017 06:56 PM
Subject:        Re: Improving support for UI driven use cases



Summarizing the wiki notes:

1. separate control and data plane so that data plane is routed directly 
to
the container
2. desire multiple concurrent function activations in the same container

On 1, I think this is in line with an outstanding desire and corresponding
issues to take the data flow out of the system and off the control message
critical path. As you pointed out, this has a lot of benefits - including
one you didn't mention: streaming (web socket style) in/out of the
container. Related issues although not complete in [1][2].

On 2, I think you are starting to see some of the issues as you think
through the limits on the action and its container and what that means for
the user flow and experience. Both in terms of "time limit" and "memory
limit". I think the logging issue can be solved to disentangle 
activations.
But I also wonder if these are going to be longer running "actions" and
hence, the model is different: short running vs long running container for
which there are different life cycles and hence different scheduling
decisions from different container pools.

[1] https://github.com/apache/incubator-openwhisk/issues/788
[2] https://github.com/apache/incubator-openwhisk/issues/254





Re: Improving support for UI driven use cases

Posted by Rodric Rabbah <ro...@gmail.com>.
Summarizing the wiki notes:

1. separate control and data plane so that data plane is routed directly to
the container
2. desire multiple concurrent function activations in the same container

On 1, I think this is in line with an outstanding desire and corresponding
issues to take the data flow out of the system and off the control message
critical path. As you pointed out, this has a lot of benefits - including
one you didn't mention: streaming (web socket style) in/out of the
container. Related issues although not complete in [1][2].
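
(To make point 1 concrete: the action containers already speak a small HTTP protocol - POST /init to load code, POST /run with {"value": {...}} to invoke - so a data plane could, in principle, target a warm container directly. The container address below is made up, and this is a sketch of the idea, not a description of current routing behavior:)

    // Sketch of a data-plane hop straight to a warm action container,
    // bypassing the control path. 172.17.0.5 is a made-up container address;
    // POST /run with {"value": {...}} is the existing action-proxy contract.
    const http = require('http');

    const payload = JSON.stringify({ value: { name: 'world' } });

    const req = http.request({
      host: '172.17.0.5', port: 8080, path: '/run', method: 'POST',
      headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(payload) }
    }, function (res) {
      let body = '';
      res.on('data', function (chunk) { body += chunk; });
      res.on('end', function () { console.log('action result:', body); });
    });

    req.end(payload);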

On 2, I think you are starting to see some of the issues as you think
through the limits on the action and its container and what that means for
the user flow and experience. Both in terms of "time limit" and "memory
limit". I think the logging issue can be solved to disentangle activations.
But I also wonder if these are going to be longer running "actions" and
hence, the model is different: short running vs long running container for
which there are different life cycles and hence different scheduling
decisions from different container pools.

[1] https://github.com/apache/incubator-openwhisk/issues/788
[2] https://github.com/apache/incubator-openwhisk/issues/254