You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@openwhisk.apache.org by GitBox <gi...@apache.org> on 2022/07/14 22:54:35 UTC

[GitHub] [openwhisk] bdoyle0182 opened a new pull request, #5284: WIP: Add scheduler overprovision for new actions before namespace throttling

bdoyle0182 opened a new pull request, #5284:
URL: https://github.com/apache/openwhisk/pull/5284

   
   ## Description
   This is a wip and trying to get feedback if this is an option we could support. This would add two new configs for the scheduler:
   
         ```allow-over-provision-before-throttle = true
         namespace-over-provision-before-throttle-ratio = 1.5``` (would it be preferred by people that this is a fixed value rather than ratio)
   
   If the first config is false, the ratio is not used anywhere. But the purpose of this is to prevent deadlocking on the new scheduler for actions within a namespace due to the new concept of namespace throttling. Since the scheduler can be aggressive with initially over-provisioning, you can get into a scenario where a single action uses up all of the container concurrency of the namespace before another action gets a change to run. What the code change below attempts to do is up the container concurrency limit to the over provision threshold to allow new actions to have a chance to run. This is really needed if one action depends on another. i.e. action a attempts to execute action b, but action a has all of the action concurrency for the namespace. The workload now will never make progress and action a will only ever fail all of its executions. Ultimately the right thing to do is scale out the namespace, but that's manual by a human and can't be done until impact has already occu
 rred and this helps mitigate by slowing throughput rather than total failures.
   
   Example of how it works:
   
   - Namespace A has a container concurrency limit of 30
   - Action A uses all 30 containers
   - With an over provision before throttle ratio of 1.5, the namespace can actually have a max total of 45 containers.
   - Action B comes in and since the namespace has hit its initial limit, the scheduler gives it max of 1 container until the number of containers for the namespace goes below it's normal limit of 30. Since it's not likely provisioned enough it will get action throttled which is still better than not letting any traffic at all.
   - The namespace can get a total of 15 additional actions with a max of 1 container before it is now completely over-provisioned and gets namespace throttled so no additional containers can be created.
   
   I'll admit this is a little hacky, but I do think we need to account for this throttling case until we are able to support action level container concurrency limits. So any better ideas are welcome.
   
   ## Related issue and scope
   <!--- Please include a link to a related issue if there is one. -->
   - [ ] I opened an issue to propose and discuss this change (#????)
   
   ## My changes affect the following components
   <!--- Select below all system components are affected by your change. -->
   <!--- Enter an `x` in all applicable boxes. -->
   - [ ] API
   - [ ] Controller
   - [ ] Message Bus (e.g., Kafka)
   - [ ] Loadbalancer
   - [X] Scheduler
   - [ ] Invoker
   - [ ] Intrinsic actions (e.g., sequences, conductors)
   - [ ] Data stores (e.g., CouchDB)
   - [ ] Tests
   - [ ] Deployment
   - [ ] CLI
   - [ ] General tooling
   - [ ] Documentation
   
   ## Types of changes
   <!--- What types of changes does your code introduce? Use `x` in all the boxes that apply: -->
   - [ ] Bug fix (generally a non-breaking change which closes an issue).
   - [X] Enhancement or new feature (adds new functionality).
   - [ ] Breaking change (a bug fix or enhancement which changes existing behavior).
   
   ## Checklist:
   <!--- Please review the points below which help you make sure you've covered all aspects of the change you're making. -->
   
   - [X] I signed an [Apache CLA](https://github.com/apache/openwhisk/blob/master/CONTRIBUTING.md).
   - [X] I reviewed the [style guides](https://github.com/apache/openwhisk/blob/master/CONTRIBUTING.md#coding-standards) and followed the recommendations (Travis CI will check :).
   - [ ] I added tests to cover my changes.
   - [ ] My changes require further changes to the documentation.
   - [ ] I updated the documentation where necessary.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] codecov-commenter commented on pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#issuecomment-1289841014

   # [Codecov](https://codecov.io/gh/apache/openwhisk/pull/5284?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#5284](https://codecov.io/gh/apache/openwhisk/pull/5284?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0d586ac) into [master](https://codecov.io/gh/apache/openwhisk/commit/ef725a653ab112391f79c274d8e3dcfb915d59a3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ef725a6) will **decrease** coverage by `76.95%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #5284       +/-   ##
   ==========================================
   - Coverage   81.49%   4.53%   -76.96%     
   ==========================================
     Files         240     240               
     Lines       14313   14318        +5     
     Branches      610     611        +1     
   ==========================================
   - Hits        11664     650    -11014     
   - Misses       2649   13668    +11019     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/openwhisk/pull/5284?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ntainerpool/v2/FunctionPullingContainerProxy.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29yZS9pbnZva2VyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvcmUvY29udGFpbmVycG9vbC92Mi9GdW5jdGlvblB1bGxpbmdDb250YWluZXJQcm94eS5zY2FsYQ==) | `0.00% <ø> (-78.41%)` | :arrow_down: |
   | [...rg/apache/openwhisk/core/scheduler/Scheduler.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29yZS9zY2hlZHVsZXIvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9vcGVud2hpc2svY29yZS9zY2hlZHVsZXIvU2NoZWR1bGVyLnNjYWxh) | `0.00% <ø> (-10.76%)` | :arrow_down: |
   | [...core/scheduler/queue/SchedulingDecisionMaker.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29yZS9zY2hlZHVsZXIvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9vcGVud2hpc2svY29yZS9zY2hlZHVsZXIvcXVldWUvU2NoZWR1bGluZ0RlY2lzaW9uTWFrZXIuc2NhbGE=) | `0.00% <0.00%> (-97.60%)` | :arrow_down: |
   | [.../main/scala/org/apache/openwhisk/core/WarmUp.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvcmUvV2FybVVwLnNjYWxh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ain/scala/org/apache/openwhisk/spi/SpiLoader.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL3NwaS9TcGlMb2FkZXIuc2NhbGE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...n/scala/org/apache/openwhisk/utils/JsHelpers.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL3V0aWxzL0pzSGVscGVycy5zY2FsYQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...scala/org/apache/openwhisk/common/Prometheus.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvbW1vbi9Qcm9tZXRoZXVzLnNjYWxh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...scala/org/apache/openwhisk/common/time/Clock.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvbW1vbi90aW1lL0Nsb2NrLnNjYWxh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...scala/org/apache/openwhisk/core/FeatureFlags.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvcmUvRmVhdHVyZUZsYWdzLnNjYWxh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ala/org/apache/openwhisk/common/ConfigMXBean.scala](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NjYWxhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvb3BlbndoaXNrL2NvbW1vbi9Db25maWdNWEJlYW4uc2NhbGE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [207 more](https://codecov.io/gh/apache/openwhisk/pull/5284/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on a diff in pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
style95 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r1009995134


##########
tests/src/test/scala/org/apache/openwhisk/core/scheduler/queue/test/SchedulingDecisionMakerTests.scala:
##########
@@ -197,7 +197,142 @@ class SchedulingDecisionMakerTests
     testProbe.expectMsg(DecisionResults(EnableNamespaceThrottling(dropMsg = false), 0))
   }
 
-  it should "add an initial container if there is no any" in {
+  it should "add one container when there is no container, and namespace over-provision has capacity" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.5)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = false,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 0, // there is no container for this action
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = Running,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.
+    testProbe.expectMsg(DecisionResults(AddInitialContainer, 1))
+  }
+
+  it should "enable namespace throttling with dropping msg when there is no container, and namespace over-provision has no capacity" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.0)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = true,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 0, // there is no container for this action
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = Running,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.
+    testProbe.expectMsg(DecisionResults(EnableNamespaceThrottling(dropMsg = true), 0))
+  }
+
+  it should "disable namespace throttling when namespace over-provision has capacity again" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.1)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = true,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 1, // there is no container for this action

Review Comment:
   ```suggestion
         existingContainerCount = 1, // there is one container for this action
   ```



##########
tests/src/test/scala/org/apache/openwhisk/core/scheduler/queue/test/SchedulingDecisionMakerTests.scala:
##########
@@ -197,7 +197,142 @@ class SchedulingDecisionMakerTests
     testProbe.expectMsg(DecisionResults(EnableNamespaceThrottling(dropMsg = false), 0))
   }
 
-  it should "add an initial container if there is no any" in {
+  it should "add one container when there is no container, and namespace over-provision has capacity" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.5)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = false,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 0, // there is no container for this action
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = Running,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.
+    testProbe.expectMsg(DecisionResults(AddInitialContainer, 1))
+  }
+
+  it should "enable namespace throttling with dropping msg when there is no container, and namespace over-provision has no capacity" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.0)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = true,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 0, // there is no container for this action
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = Running,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.
+    testProbe.expectMsg(DecisionResults(EnableNamespaceThrottling(dropMsg = true), 0))
+  }
+
+  it should "disable namespace throttling when namespace over-provision has capacity again" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.1)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = true,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 1, // there is no container for this action
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = NamespaceThrottled,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.
+    testProbe.expectMsg(DecisionResults(DisableNamespaceThrottling, 0))
+  }
+
+  it should "enable namespace throttling with dropping msg when there is a container, and namespace over-provision has no additional capacity" in {

Review Comment:
   ```suggestion
     it should "enable namespace throttling without dropping msg when there is a container, and namespace over-provision has no additional capacity" in {
   ```



##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -79,12 +89,15 @@ class SchedulingDecisionMaker(
            *
            * However, if the container exists(totalContainers != 0), the activation is not treated as a failure and the activation is delivered to the container.
            */
-          case Running =>
+          case Running
+              if !schedulingConfig.allowOverProvisionBeforeThrottle || (schedulingConfig.allowOverProvisionBeforeThrottle && ceiling(
+                limit * schedulingConfig.namespaceOverProvisionBeforeThrottleRatio) - existingContainerCountInNs - inProgressContainerCountInNs <= 0) =>
             logging.info(
               this,
               s"there is no capacity activations will be dropped or throttled, (availableMsg: $availableMsg totalContainers: $totalContainers, limit: $limit, namespaceContainers: ${existingContainerCountInNs}, namespaceInProgressContainer: ${inProgressContainerCountInNs}) [$invocationNamespace:$action]")
             Future.successful(DecisionResults(EnableNamespaceThrottling(dropMsg = totalContainers == 0), 0))
-
+          case NamespaceThrottled if schedulingConfig.allowOverProvisionBeforeThrottle && ceiling(limit * schedulingConfig.namespaceOverProvisionBeforeThrottleRatio) - existingContainerCountInNs - inProgressContainerCountInNs > 0 =>

Review Comment:
   👍 



##########
tests/src/test/scala/org/apache/openwhisk/core/scheduler/queue/test/SchedulingDecisionMakerTests.scala:
##########
@@ -197,7 +197,142 @@ class SchedulingDecisionMakerTests
     testProbe.expectMsg(DecisionResults(EnableNamespaceThrottling(dropMsg = false), 0))
   }
 
-  it should "add an initial container if there is no any" in {
+  it should "add one container when there is no container, and namespace over-provision has capacity" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.5)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = false,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 0, // there is no container for this action
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = Running,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.
+    testProbe.expectMsg(DecisionResults(AddInitialContainer, 1))
+  }
+
+  it should "enable namespace throttling with dropping msg when there is no container, and namespace over-provision has no capacity" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.0)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = true,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 0, // there is no container for this action
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = Running,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.
+    testProbe.expectMsg(DecisionResults(EnableNamespaceThrottling(dropMsg = true), 0))
+  }
+
+  it should "disable namespace throttling when namespace over-provision has capacity again" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.1)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = true,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 1, // there is no container for this action
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = NamespaceThrottled,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.
+    testProbe.expectMsg(DecisionResults(DisableNamespaceThrottling, 0))
+  }
+
+  it should "enable namespace throttling with dropping msg when there is a container, and namespace over-provision has no additional capacity" in {
+    val schedulingConfigNamespaceOverProvisioning =
+      SchedulingConfig(100.milliseconds, 100.milliseconds, 10.seconds, true, 1.0)
+    val decisionMaker =
+      system.actorOf(SchedulingDecisionMaker.props(testNamespace, action, schedulingConfigNamespaceOverProvisioning))
+    val testProbe = TestProbe()
+
+    val msg = QueueSnapshot(
+      initialized = true,
+      incomingMsgCount = new AtomicInteger(0),
+      currentMsgCount = 0,
+      existingContainerCount = 1,
+      inProgressContainerCount = 0,
+      staleActivationNum = 0,
+      existingContainerCountInNamespace = 1, // but there are already 2 containers in this namespace
+      inProgressContainerCountInNamespace = 1,
+      averageDuration = None,
+      limit = 2,
+      stateName = Running,
+      recipient = testProbe.ref)
+
+    decisionMaker ! msg
+
+    // this queue cannot create an initial container so enable throttling and drop messages.

Review Comment:
   ```suggestion
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 merged pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
bdoyle0182 merged PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r1009796692


##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -79,7 +89,7 @@ class SchedulingDecisionMaker(
            *
            * However, if the container exists(totalContainers != 0), the activation is not treated as a failure and the activation is delivered to the container.
            */
-          case Running =>
+          case Running if totalContainers == 0 || !schedulingConfig.allowOverProvisionBeforeThrottle =>

Review Comment:
   Okay added case to correctly disable namespace throttling if the container that frees up space is a container from the over-provisioning space in the latest commit. Would appreciate approval on this when you have some time @style95 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5284: WIP: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r921636511


##########
core/scheduler/src/main/resources/application.conf:
##########
@@ -68,6 +68,8 @@ whisk {
       stale-threshold = "100 milliseconds"
       check-interval = "100 milliseconds"
       drop-interval = "10 seconds"
+      allow-over-provision-before-throttle = true

Review Comment:
   this would be false if merged making this feature completely opt-in



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r1005946253


##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -79,7 +89,7 @@ class SchedulingDecisionMaker(
            *
            * However, if the container exists(totalContainers != 0), the activation is not treated as a failure and the activation is delivered to the container.
            */
-          case Running =>
+          case Running if totalContainers == 0 || !schedulingConfig.allowOverProvisionBeforeThrottle =>

Review Comment:
   the reasoning for this is so that activations can continue processing when the 15th action gets the last 1 container. If the 15th action can handle the throughput at this point it wouldn't want namespace throttling enabled so it can continue processing activations. You wouldn't want to stop activation processing until the 16th action comes in.
   
   Or am I misremembering and namespace throttling only impacts new actions with 0 containers?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on a diff in pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
style95 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r1007500764


##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -79,7 +89,7 @@ class SchedulingDecisionMaker(
            *
            * However, if the container exists(totalContainers != 0), the activation is not treated as a failure and the activation is delivered to the container.
            */
-          case Running =>
+          case Running if totalContainers == 0 || !schedulingConfig.allowOverProvisionBeforeThrottle =>

Review Comment:
   > Or am I misremembering and namespace throttling only impacts new actions with 0 containers?
   
   This is correct.
   The namespace throttling is used when there is no action throttling data in ETCD which means no queue is created for the action.
   https://github.com/apache/openwhisk/blob/master/core/controller/src/main/scala/org/apache/openwhisk/core/loadBalancer/FPCPoolBalancer.scala#L683



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r1007536497


##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -79,7 +89,7 @@ class SchedulingDecisionMaker(
            *
            * However, if the container exists(totalContainers != 0), the activation is not treated as a failure and the activation is delivered to the container.
            */
-          case Running =>
+          case Running if totalContainers == 0 || !schedulingConfig.allowOverProvisionBeforeThrottle =>

Review Comment:
   I believe I have updated this behavior and added a commit to what you requested in the latest commit.
   
   I think this pr should be good to go except for one remaining issue I just thought of if one of the 15 actions that got the overprovisioning slots and namespace throttling is enabled and one of those actions containers are removed then I'm not sure if namespace throttling will be correctly disabled, I just need to verify that before merging tomorrow.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5284: WIP: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r921693145


##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -79,7 +89,7 @@ class SchedulingDecisionMaker(
            *
            * However, if the container exists(totalContainers != 0), the activation is not treated as a failure and the activation is delivered to the container.
            */
-          case Running =>
+          case Running if totalContainers == 0 || !schedulingConfig.allowOverProvisionBeforeThrottle =>

Review Comment:
   If capacity == 0 and feature is turned on and still no containers for action could be created, then turn on namespace throttling. If capacity == 0 and feature is turned off and no containers exist, just skip any considerations and turn on namespace throttling same as existing behavior



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on a diff in pull request #5284: WIP: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r921637864


##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -70,7 +70,17 @@ class SchedulingDecisionMaker(
         case _        => Future.successful(DecisionResults(Pausing, 0))
       }
     } else {
-      val capacity = limit - existingContainerCountInNs - inProgressContainerCountInNs
+      val capacity = if (schedulingConfig.allowOverProvisionBeforeThrottle && totalContainers == 0) {

Review Comment:
   if feature is turned on and the action has no containers yet, give it a capacity of 1 on first decision iteration if total containers in use for the namespace are less than the limit * over provision ratio value, thereafter do the normal capacity calculation. 
   
   If this is still over the limit, give 0 capacity and follow the normal code path to turn on namespace throttling



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] bdoyle0182 commented on pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
bdoyle0182 commented on PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#issuecomment-1289815258

   this is now ready for review @style95 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [openwhisk] style95 commented on a diff in pull request #5284: Add scheduler overprovision for new actions before namespace throttling

Posted by GitBox <gi...@apache.org>.
style95 commented on code in PR #5284:
URL: https://github.com/apache/openwhisk/pull/5284#discussion_r1005252691


##########
core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/SchedulingDecisionMaker.scala:
##########
@@ -79,7 +89,7 @@ class SchedulingDecisionMaker(
            *
            * However, if the container exists(totalContainers != 0), the activation is not treated as a failure and the activation is delivered to the container.
            */
-          case Running =>
+          case Running if totalContainers == 0 || !schedulingConfig.allowOverProvisionBeforeThrottle =>

Review Comment:
   I feel there is a subtle difference in the semantics when the feature is enabled.
   Let's say there are one action utilizing 30 containers, and 15 actions with 1 container each.
   The `totalContainers` is always bigger than 0. Then isn't throttling enabled?
   
   The throttling would be enabled when an activation for another action arrives, and no container is created.
   I think it would be better to enable the namespace throttling when the number of containers reaches `namespace limit * overprovisoinRatio`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@openwhisk.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org